-- KarenLivescu - 15 Dec 2005

Possible databases for use at WS06

This is a listing of the databases that have been mentioned for possible use at WS06.

Audio only

Audio-visual

  • AVICAR (UIUC)
    • Multi-speaker
    • Read speech consisting of isolated digits and letters, phone numbers, and sentences
    • In-car environment with varying car noise levels
    • Described in Lee et al., ICSLP 2004
    • See the AVICAR project web site
    • Manual face and lip segmentations: attached at bottom of page

  • CMU audio-visual speech processing project data (CMU)
    • 10 speakers
    • Isolated words
    • Lip tracking parameters available
    • See the project web site

  • CUAVE (Clemson)
    • 36 speakers, each reading 60 isolated digits and 60 connected digits
    • Also a number of speaker-pair recordings
    • Studio environment
    • See RelevantPapers for some example results

  • New continuous numbers database (MIT)?
    • Stay tuned...

Transcribed data

For sanity checks, or for separate acoustic/pronunciation modeling experiments.

Articulatory measurement data

For detailed study of articulatory phenomena. The current thinking is that we will not use these, as they lack some useful articulatory information and are perhaps too difficult to translate to the discrete feature values that we intend to use. However, this is up for discussion.

Discussion area for this page:

-- KarenLivescu - 08 Dec 2005