         phState
         /    |
        /     |    ...
       V      V
      dg1    pl1
       |      |
       |      |
       V      V
    VE_dg1  VE_pl1
where VE_<F> is the virtual evidence given by the MLP activations for
feature <F>. dg1, pl1, etc. are deterministic given the phone state.
This is the "natural" extension of hybrid models to the articulatory feature (AF) world.
SVitchboard, monophone, hybrid
This system uses the 8 ANNs to provide virtual evidence about the 8 features. Each of the 8 hidden feature RVs depends on the phone state through a
DenseCPT that has only one non-zero entry per row, i.e. it is deterministic.
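To make the "one non-zero entry per row" idea concrete, here is a minimal sketch in plain Python. The phone inventory, the mapping to dg1 values, the feature cardinality and the virtual evidence numbers are all made up for illustration and are not taken from the experiments. It only shows that, with a deterministic table, summing over feature values reduces to picking out a single virtual evidence entry per phone state.

```python
# Hypothetical toy inventory: 4 phone states and a dg1 feature with 3 values.
# The mapping below is an assumption for illustration only.
PHONES = ["aa", "s", "z", "sil"]
PHONE_TO_DG1 = {"aa": 0, "s": 1, "z": 1, "sil": 2}
DG1_CARD = 3


def deterministic_cpt(mapping, phones, cardinality):
    """Build a dense table P(feature | phState) with exactly one 1.0 per row."""
    cpt = []
    for ph in phones:
        row = [0.0] * cardinality
        row[mapping[ph]] = 1.0
        cpt.append(row)
    return cpt


def evidence_given_phone(cpt_row, ve):
    """Compute sum_f P(f | phState) * VE(f).

    With a one-hot row this just selects the VE entry for the mapped value.
    """
    return sum(p * v for p, v in zip(cpt_row, ve))


cpt = deterministic_cpt(PHONE_TO_DG1, PHONES, DG1_CARD)

# Made-up virtual evidence for dg1 at a single frame.
ve_dg1 = [0.2, 0.7, 0.1]
scores = [evidence_given_phone(row, ve_dg1) for row in cpt]
```

In this toy case "s" and "z" share a dg1 value, so they receive the same score from this feature; only the other features could tell them apart.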
| Results for ANN outputs divided by the prior, using word alignments |
| Vocab size | Validation WER (%) | Test WER (%) | VE scale factor dg1 | VE scale factor pl1 | Language model scale | Language model penalty | Notes |
| 10 | 35.9 | 41.0 | 0.5 | 1.5 | 26 | -12 | Searched range of dg1,pl1 weights: 0.5, 1.0, 1.5 |
| 500 | 78.5 (1) | | 1.5 | 1.5 | 26 | -4 | No weight tuning. dg1=pl1=1.5 |
(1) Validation done only on the first 100 sentences of D_short.
Validation means the D_short set, unless noted.
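As a rough illustration of "ANN outputs divided by the prior" combined with a VE scale factor, the sketch below works in the log domain. It assumes that the scale factor acts as an exponent on the evidence, i.e. a multiplier on the log value; that interpretation, the function name and all numbers are assumptions for illustration, not details confirmed by the experiments.

```python
import math


def scaled_log_evidence(posteriors, priors, scale):
    """Return scale * log(posterior / prior) for each feature value.

    Assumption: the VE scale factor is applied as an exponent on the
    prior-normalised ANN output, which is a multiplier in the log domain.
    """
    return [
        scale * (math.log(p) - math.log(q))
        for p, q in zip(posteriors, priors)
    ]


# Made-up ANN posteriors and class priors for one feature at one frame.
post = [0.6, 0.3, 0.1]
prior = [0.5, 0.25, 0.25]

# 0.5 is one of the dg1 weights in the searched range.
log_ve = scaled_log_evidence(post, prior, 0.5)
```

Values whose posterior exceeds the prior get positive log evidence and values below the prior get negative log evidence; a smaller scale factor shrinks both towards zero, reducing that feature's influence relative to the others.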
Conclusion
For the 10-word task, the deterministic mapping gives a much worse word error rate than the learned mapping. For the 500-word task, the comparison is less clear.
-- SimonKing - 04 Aug 2006