-- KarenLivescu - 15 Dec 2005

Decoding variables for viewing in Wavesurfer

vit2wsurf.pl is available from our CVS repository, /scripts. It's also available right here.

Basic step-by-step guide

First, make sure you're on Solaris (for the Wavesurfer bit -- you probably want Linux for gmtk), and that your dot files are set up for using Wavesurfer. .wishrc, .bashrc, and .tclshrc are all involved. See ~abezman/ for more.
  1. Probably begin with a training structure, with words observed.
  2. Run gmtkViterbiNew with the following options on top of the usual ones:
    1. -dumpNames varsFile, where varsFile is a list, one per line, of the variables you'd like dumped to a file.
    2. -ofilelist utteranceList, where utteranceList is a list, one per line, of the base file names of the utterances that you're decoding. These are the files that gmtkViterbiNew will dump variables to, so it's useful to prefix them with a directory. makeOFileList.pl, below, does that, though it's easy to do in a shell script, too.
  3. Copy the varsFile used above for -dumpNames, and modify it as follows, keeping the list and order of variables the same:
    1. Change each variable name to something reasonable as a file extension.
    2. After each variable name, you can optionally add, comma separated, the name of a mapping file for that variable and a number indicating where to put it in the Wavesurfer window. See below for a sample line of a varlist file, and a sample mapping file.
  4. If desired, copy and shorten the filelist used above for -ofilelist. This file will list the binary files created by gmtkViterbiNew that are to be converted to Wavesurfer format
  5. Create a file list of the wav files of the decoded utterances you'd like to see in Wavesurfer. Note: the list must not end with blank line(s).
  6. Run vit2wsurf.pl --filelist filelist_from_step4 --variables variables_list_from_step3 --makeConfig name_for_new_config --launchWaveSurfer wav_list_from_step5. Other options are described below, and may be useful.
  7. When prompted, tell Wavesurfer that the sample rate is 8000 Hz and the offset is 1024 bytes. Having it remember these settings for all .wav files is helpful.
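
The list files in steps 2 and 5 are easy to generate by script. The following is a minimal Python sketch, not the actual makeOFileList.pl: it prefixes each utterance base name with a directory and renders a list with exactly one entry per line and no trailing blank lines. The directory name and utterance names in the usage note are hypothetical.

```python
import os


def make_ofilelist(basenames, out_dir):
    """Prefix each utterance base name with out_dir, skipping empty entries."""
    return [os.path.join(out_dir, name.strip())
            for name in basenames if name.strip()]


def list_text(entries):
    """Render a file list: one entry per line, no trailing blank lines."""
    return "\n".join(entries) + "\n"


def write_list(path, entries):
    """Write the rendered list to path."""
    with open(path, "w") as handle:
        handle.write(list_text(entries))
```

For example, make_ofilelist(["utt001", "utt002"], "dumps") yields the two paths under dumps/, and write_list("utterances.list", ...) saves them in a form suitable for -ofilelist. The same write_list call can produce the wav list for step 5.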

To open Wavesurfer to see existing transcriptions, run wavesurfer.tcl [-config config_name_from_step6] [-filelist wav_list_from_step5], or just select .wav files from within Wavesurfer.

Sample variables file for vit2wsurf.pl

This file will make the script create a .psL, .L, .pl1, and .dg1 file for each utterance. The .pl1 and .dg1 files will contain values looked up from their mapping files, while .L and .psL will contain the raw numbers from gmtk. Only pl1, dg1, and psL carry an ordering number, so only those three will be shown in the resulting Wavesurfer config.

-----BEGIN-----
#Name , [mapping] , [wavesurfer ordering]
#The wavesurfer ordering is relative, not absolute, so 1,2,3 is functionally equivalent to 0, 50000, 9999999.
psL  ,,3
L
pl1,    maps/pl1.map,   1
dg1,    maps/dg1.map,   2
-----END-----
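
To make the format concrete, here is a rough Python sketch of how such a variables file could be read. It is an illustration, not the parsing code in vit2wsurf.pl. It assumes that lines starting with # are comments and that the BEGIN/END markers are not part of the file; the function names are made up for this example.

```python
def parse_varlist(lines):
    """Parse 'name, [mapping], [ordering]' lines into a list of dicts."""
    variables = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # assumption: blank lines and '#' comments are skipped
        fields = [field.strip() for field in line.split(",")]
        fields += [""] * (3 - len(fields))  # pad missing optional fields
        name, mapping, order = fields[:3]
        variables.append({
            "name": name,
            "mapping": mapping or None,
            "order": int(order) if order else None,
        })
    return variables


def display_order(variables):
    """Names of the variables that have an ordering, sorted by it."""
    shown = [v for v in variables if v["order"] is not None]
    return [v["name"] for v in sorted(shown, key=lambda v: v["order"])]
```

Applied to the sample above, parse_varlist keeps the four variables in dump order (psL, L, pl1, dg1), and display_order returns pl1, dg1, psL. Because only the relative order matters, the numbers 1, 2, 3 could be replaced by any increasing sequence.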

Sample nas.map file

-----BEGIN-----
0 SIL
1 -
2 +
-----END-----
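
A mapping file of this shape can be read as a simple integer-to-label table. The Python sketch below shows the idea. Passing unmapped values through as their raw number is an assumption made for this example, not documented behaviour of vit2wsurf.pl.

```python
def parse_mapping(lines):
    """Read 'value label' lines into a dict from int to label string."""
    mapping = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        value, label = line.split(None, 1)  # split on the first whitespace run
        mapping[int(value)] = label
    return mapping


def apply_mapping(values, mapping):
    """Replace raw values by labels; unmapped values stay as raw numbers
    (an assumption of this sketch)."""
    return [mapping.get(value, str(value)) for value in values]
```

With the sample file above, the raw sequence 0, 2, 2, 1 becomes SIL, +, +, -.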

Full documentation of vit2wsurf.pl options

Syntax: vit2wsurf.pl --filelist filelist --variables varlist [--outputdir directory] [--framerate fps] [--debug] [--mapPhoneStates phoneState2Feat... file] [--makeConfig config-file] [--useConfig config-file] [--referenceTranscript directory] [--launchWaveSurfer wav-filelist]
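
When the script is driven from another program, it can help to assemble the argument list in one place. The Python sketch below only uses options listed in the syntax line above. The helper name and the file names in the usage note are hypothetical, and the sketch builds the command without running it.

```python
import shlex


def build_command(filelist, varlist, outputdir=None, framerate=None,
                  make_config=None, wav_list=None):
    """Assemble a vit2wsurf.pl argument list from the documented options."""
    cmd = ["vit2wsurf.pl", "--filelist", filelist, "--variables", varlist]
    if outputdir is not None:
        cmd += ["--outputdir", outputdir]
    if framerate is not None:
        cmd += ["--framerate", str(framerate)]
    if make_config is not None:
        cmd += ["--makeConfig", make_config]
    if wav_list is not None:
        cmd += ["--launchWaveSurfer", wav_list]
    return cmd


def as_shell_line(cmd):
    """Quote each argument so the command can be pasted into a shell."""
    return " ".join(shlex.quote(arg) for arg in cmd)
```

For instance, build_command("dumps.list", "vars.list", make_config="myconf", wav_list="wavs.list") reproduces the command from the step-by-step guide, and the resulting list can be passed to subprocess.run.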

Required:

--filelist filename

List of binary files dumped by gmtkViterbiNew. See sample command line above.

--variables filename

List of variables, as many and in the same order as they were dumped. One variable per line, with comma-separated fields (whitespace padding is ignored): name, mapping file, ordering for the Wavesurfer config. If there's no number at the end, the variable won't be shown in Wavesurfer. See the sample variables file above. Mapping files are of the format
1 foo
2 bar

Optional:

--outputdir directory

Directory in which to put the transcripts once they're made. For a variable named "VOW", they actually go into a subdirectory: outputdir/VOW/*.VOW. If not specified, the working directory is used.
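
The layout can be pictured with a small Python sketch. The subdirectory and extension follow the description above. Taking the file stem from the base name of the dumped file is an assumption of this example, as is the helper name.

```python
import os


def transcript_path(dump_file, variable, outputdir="."):
    """Path of the transcript for one variable of one utterance.

    Assumption: the stem is the base name of the dumped binary file.
    """
    stem = os.path.basename(dump_file)
    return os.path.join(outputdir, variable, stem + "." + variable)
```

Under that assumption, transcript_path("dumps/utt001", "VOW", "out") gives out/VOW/utt001.VOW.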

--framerate fps

Frames per second. The default is 100; any positive value is legal.
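
The frame rate sets how frame indices turn into times: frame i covers the interval from i/fps to (i+1)/fps seconds, so each frame lasts 10 ms at the default of 100. The Python sketch below illustrates that arithmetic. Merging runs of identical values into (start, end, value) segments is an assumption about how per-frame values become labels, not a description of the script's internals.

```python
def frames_to_segments(values, fps=100.0):
    """Collapse per-frame values into (start_sec, end_sec, value) runs."""
    if fps <= 0:
        raise ValueError("frame rate must be positive")
    segments = []
    start = 0
    for i in range(1, len(values) + 1):
        # Close the current run at the end of input or when the value changes.
        if i == len(values) or values[i] != values[start]:
            segments.append((start / fps, i / fps, values[start]))
            start = i
    return segments
```

For the frame values 0, 0, 1, 1, 1, 0 at 100 frames per second, this gives three segments: 0.00 to 0.02 s with value 0, 0.02 to 0.05 s with value 1, and 0.05 to 0.06 s with value 0.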

--debug

Print additional warnings and other diagnostic output.

--mapPhoneStates phoneState2Feat.VOCAB_file

If specified, all psG, psL, and psT features will use the mapping derived from this file. Useful for some people's purposes, maybe?

--makeConfig string

Make a file string.conf in the wavesurfer config directory using the variables tagged with numbers in the variable list. If --launchWaveSurfer is invoked, this config will be used.

--useConfig string

Use an existing config with --launchWaveSurfer.

--referenceTranscript directory

Directory that has existing transcripts, with the same feature names. They can also be in per-variable subdirectories, such as somedir/vow/.

--launchWaveSurfer filename

The file must contain a list of .wav files to open. Currently, there's no way to launch Wavesurfer from the script without specifying wav files. I'll fix that shortly.

-- AriBezman - 11 Aug 2006