-- KarenLivescu - 15 Dec 2005

WS06 > ResultsPage > HybridMonophoneNondetMapping
     phState
     /   | 
    /    |  ...
   V     V
  dg1   pl1
   |     |
   |     |
   V     V
 VE_dg1 VE_pl1

where VE_<F> is the virtual evidence given by MLP activations 
for feature <F>.  
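
A minimal sketch of how one VE score might be formed from an MLP output, for illustration only. It assumes the scale factor acts as an exponent on the score (a multiplier in the log domain) and that "divide by prior" means the usual hybrid posterior/prior scaled likelihood; the function and argument names are hypothetical, not taken from the system.

```python
import math

def ve_log_score(posterior, prior=None, scale=1.0):
    """Log virtual-evidence score for one feature value in one frame.

    posterior: MLP output for this feature value (assumed > 0).
    prior:     prior probability of the value; None means "not divided by prior".
    scale:     VE scale factor, applied as an exponent (assumption).
    """
    log_score = math.log(posterior)
    if prior is not None:
        # Hybrid-style scaled likelihood: posterior / prior.
        log_score -= math.log(prior)
    return scale * log_score

# Example: posterior 0.5, prior 0.25, scale factor 1.5.
score = ve_log_score(0.5, prior=0.25, scale=1.5)
```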

SVitchboard, monophone, hybrid

This system uses 8 ANNs to provide virtual evidence about the 8 features. Each of the 8 hidden feature RVs depends on the phone state through a DenseCPT. If a DenseCPT has only one non-zero entry per row, it is deterministic.
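
The determinism condition can be checked mechanically. A small sketch, assuming a CPT is stored as a list of rows (one row per phone state, one column per feature value); the example CPTs are made up for illustration.

```python
def is_deterministic(cpt, tol=0.0):
    """Return True if every row of the CPT has exactly one entry above tol."""
    return all(sum(1 for p in row if p > tol) == 1 for row in cpt)

# Hypothetical CPTs: 3 phone states, 2 feature values.
dense_cpt = [[0.9, 0.1], [0.2, 0.8], [1.0, 0.0]]
det_cpt = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]

dense_is_det = is_deterministic(dense_cpt)
det_is_det = is_deterministic(det_cpt)
```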

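Models trained using original word alignments

| Vocab size | Task | WER (%), validation | WER (%), test | VE scale factor, dg1 | VE scale factor, pl1 | LM scale | LM penalty | Divide by prior? | Det. CPTs? | Notes |
| 10 | 1 | 26.0 | | 1.5 | 1.5 | 25 | -2 | no | no | (A) |
| 10 | 1 | 32.0 | | 1.5 | 1.5 | 27 | -4 | yes | no | (B) |
| 10 | 1 | 32.9 | | 0.5 | 0.5 | 28 | -5 | no | yes | (A) |
| 10 | 1 | | | | | | | yes | yes | |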

(A) Weight search over all combinations of 0.5/1.0/1.5 for dg1 and pl1.

(B) No weight search (yet).
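
The weight search in note (A) is an exhaustive grid search over both scale factors. A rough sketch, where `evaluate` stands in for "decode the validation set with these weights and return WER"; `toy_objective` is a made-up placeholder, not a real result.

```python
from itertools import product

def grid_search(weights, evaluate):
    """Try every (dg1, pl1) pair and return (best_score, dg1, pl1)."""
    return min(
        (evaluate(dg1, pl1), dg1, pl1)
        for dg1, pl1 in product(weights, repeat=2)
    )

def toy_objective(dg1, pl1):
    # Placeholder only: a real run would decode and score the validation set.
    return (dg1 - 1.5) ** 2 + (pl1 - 1.0) ** 2

weights = [0.5, 1.0, 1.5]
n_combinations = len(list(product(weights, repeat=2)))  # 3 x 3 pairs
best = grid_search(weights, toy_objective)
```

The cost grows with the square of the number of candidate weights, since each pair needs a full decoding pass.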

The table below is WRONG

Results for ANN outputs NOT divided by the prior, and without using word alignments

| Vocab size | Task | WER (%), validation | WER (%), test | VE scale factor, dg1 | VE scale factor, pl1 | LM scale | LM penalty | Notes |
| 10 | 1 | 33.7 | | 1.0 | 1.0 | 20 | -3 | full D set |
| 10 | 1 | 32.5 | | 0.5 | 1.0 | 20 | -2 | |
| 10 | 1 | 29.0 | 35.1 | 0.5 | 1.5 | 22 | -2 | searching over 0.1, 0.5, 1, 1.5, 2, 4, 8, 16 for each of dg1 and pl1 scale factors |
| 500 | 1 | 84.6?? | | 0.5 | 1.5 | 20 | -1 | ckbeam 10000, NOT TUNED, recipe 1 |

Results for ANN outputs divided by the prior, using word alignments

| Vocab size | Task | WER (%), validation | WER (%), test | VE scale factor, dg1 | VE scale factor, pl1 | LM scale | LM penalty | Notes |
| 10 | 1 | 24.6 | 29.2 | 1.5 | 1.5 | 24 | -4 | Searched wide range of dg1, pl1 weights |
| 500 | 1 | 74.7 (1) | * | 1.5 | 1.5 | 22/24 | -2 | No weight search, recipe 2, trained to 0.5 tolerance, decode ckbeam 25000 |
| 500 | 1 | 74.7 (1) | 78.0 | 1.5 | 1.5 | 22/24 | -2 | Weight search (0.5/1.0/1.5 for dg1 and pl1), recipe 2, 0.2 tol, decode ckbeam 25000 |
| 500 | 1 | (1) | | 1.5 | 1.5 | ?? | ?? | No weight search, recipe 3, trained to 0.5 tolerance, decode ckbeam 25000 |

Validation means the D_short set, unless noted.

(1) Validation on only the first 100 utterances of D_short

-- SimonKing - 25 Jul 2006

Recipes for the 500 word task

Very slow to train starting with uniform DCPTs (unless I can find a better triangulation), so:

Recipe 1

Train on 1000 utterances for 2 iterations

Take the DCPTs and make them more sparse by zeroing all entries less than 0.1

Using these parameters, run the genetic triangulation script to find a fast triangulation, given this particular sparsity of the DCPTs.

Starting from these parameters, train to 0.5% tolerance (takes 8 iterations) on the full training set.

Find a decoding graph triangulation using the final trained parameters.
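
The sparsification step in this recipe can be sketched as below, again assuming a CPT stored as a list of rows. Whether rows were renormalized after zeroing is not stated here, so renormalization is an optional assumption in this sketch.

```python
def sparsify_cpt(cpt, threshold=0.1, renormalize=True):
    """Zero every entry below threshold; optionally renormalize each row."""
    sparse = []
    for row in cpt:
        kept = [p if p >= threshold else 0.0 for p in row]
        total = sum(kept)
        if renormalize and total > 0.0:
            # Assumption: rescale the surviving entries so the row sums to 1.
            kept = [p / total for p in kept]
        sparse.append(kept)
    return sparse

# Hypothetical one-row CPT for illustration.
raw = sparsify_cpt([[0.85, 0.1, 0.05]], renormalize=False)
renormed = sparsify_cpt([[0.6, 0.35, 0.05]])
```

Fewer non-zero entries means fewer feature values to hypothesize per phone state, which is what makes a faster triangulation possible.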

-- SimonKing - 01 Aug 2006

Recipe 2

Found a better triangulation using the genetic algorithm. Then manually re-triangulated the epilogue and prologue (because they were "completed") using heuristic "S".

This model is easily trainable with fully dense CPTs.

However, decoding takes serious memory (ckbeam of 25000, because anything smaller led to different decodings on one test sentence), although it is fast enough (~20 secs per utt).

To make this decode in a reasonable amount of memory, all state_to_FEAT DCPTs were made sparser by zeroing all entries smaller than 0.1.
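
For intuition on the ckbeam setting above: a sketch of k-best pruning, assuming ckbeam caps the number of entries kept per clique (top k by score). The function name and the dict of hypotheses are illustrative only, not the decoder's internals.

```python
import heapq

def prune_top_k(entries, k):
    """Keep only the k highest-scoring entries (hypothesis -> log score)."""
    best = heapq.nlargest(k, entries.items(), key=lambda item: item[1])
    return dict(best)

# Hypothetical clique entries with log scores.
entries = {"a": -1.0, "b": -3.0, "c": -2.0}
pruned = prune_top_k(entries, 2)
```

If the hypothesis that would eventually win is pruned early because k is too small, the decoding changes, which is consistent with needing a large beam.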

-- SimonKing - 04 Aug 2006

Recipe 3

As recipe two, but zeroing entries smaller than 0.? (TO DO)

-- SimonKing - 08 Aug 2006