Manual feature transcriptions
We have collected a small set of manual transcriptions at the articulatory feature level, to be used as "ground truth" for testing feature classifiers and forced alignments. See FeatureTranscription for complete details, and TranscriptionNotes for the transcription guidelines. The data collection and analysis are summarized in
K. Livescu, A. Bezman, N. Borges, L. Yung, O. Cetin, J. Frankel, S. King, M. Magimai-Doss, X. Chi, and L. Lavoie, "Manual transcription of conversational speech at the articulatory feature level", in Proc. ICASSP, Honolulu, April 2007.
If you use the data, please cite this paper in any resulting publications.
Version 1 of the data is available: WS06AFSR_manual_trans_v1.tgz
Please note:
- IMPORTANT: The .phn files were generated during the 1st pass "hybrid" labeling. These are not actually part of the final transcriptions; they are being provided for completeness only. They were not modified after the 1st pass and may not match the feature tiers. We make no claims about their accuracy or usefulness!
- We still find errors from time to time. Please let us know if you find errors or have trouble with the data. We may release an updated version if there are further corrections. Please refer to the current version of the data as the "WS06AFSR manual transcriptions, version 1".
- The download does not include the waveform files. The STP waveforms are from the Switchboard database; the SVB ones are segments of Switchboard utterances as defined in the SVitchboard distribution. If you have a license for Switchboard 1, we will be glad to provide the waveforms as well for convenience; please contact Karen Livescu at klivescu@csail.mitNOSPAM.edu
- The transcriptions are in a simple ASCII format. For easier viewing, two WaveSurfer configuration files are included: one for viewing a single transcriber's labels and one for viewing both transcribers' labels side by side. These have been tested with WaveSurfer 1.8.5 on Windows XP. Place the config files in your WaveSurfer config directory (in version 1.8 the default is "Documents and Settings\[username]\.wavesurfer\1.8\configurations"), open the desired wav file, and choose one of these two configs. For more information about WaveSurfer, or to download it, see the WaveSurfer site.
- When opening wav files in WaveSurfer, set Sample Rate to 8000 and Read Offset to 1024 bytes when prompted.
- The data are divided into subdirectories as follows:
  - SVB/ : The SVitchboard utterances
    - ll/ : Transcriptions done by Lisa Lavoie
    - xc/ : Transcriptions done by Xuemin Chi
  - STP/ : The STP utterances
    - allfeature/ : The utterances transcribed using an all-feature format (no .phn tier in the 1st pass). Has ll/ and xc/ subdirectories as above.
    - hybrid/ : The utterances transcribed using a phone-feature hybrid format in the 1st pass. Has ll/ and xc/ subdirectories as above.
- The transcription files are named [wavfile-tag].[feature-name] and each line is "[start-time] [end-time] [feature-value]". Times are in seconds.
- The .wd files were generated from the Mississippi State University word alignments, but may have been modified by the WS06 transcribers. We are responsible for any problems with the .wd files.
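For anyone reading the transcription files outside WaveSurfer, the naming and line format described above can be parsed with a few lines of Python. This is only a minimal sketch, not a tool shipped with the data: the names Segment, split_name, parse_tier and read_tier are hypothetical, and it assumes that the fields on each line are separated by whitespace and that everything after the second field is the feature value.

```python
from pathlib import Path
from typing import List, NamedTuple, Tuple


class Segment(NamedTuple):
    start: float  # start time in seconds
    end: float    # end time in seconds
    value: str    # feature value, kept as the raw label string


def split_name(path: str) -> Tuple[str, str]:
    """Split a '[wavfile-tag].[feature-name]' file name into (tag, feature)."""
    # Split on the last dot, so a tag that itself contains dots stays intact.
    tag, _, feature = Path(path).name.rpartition(".")
    return tag, feature


def parse_tier(text: str) -> List[Segment]:
    """Parse lines of the form '[start-time] [end-time] [feature-value]'."""
    segments = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumption: whitespace-separated fields; the value is the remainder.
        parts = line.split(None, 2)
        if len(parts) < 2:
            raise ValueError(f"line {line_number}: expected start and end times")
        value = parts[2] if len(parts) == 3 else ""
        segments.append(Segment(float(parts[0]), float(parts[1]), value))
    return segments


def read_tier(path: str) -> List[Segment]:
    """Read one transcription file from disk and parse it."""
    return parse_tier(Path(path).read_text(encoding="ascii", errors="replace"))
```

Calling read_tier on, say, a .wd file under STP/hybrid/ll/ would return that tier as a list of (start, end, value) segments, and split_name on the same path recovers the wavfile tag needed to find the matching file in the xc/ directory.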
-- KarenLivescu - 19 Apr 2007