The goal of this project is to create a videorealistic text-to-audiovisual speech synthesizer. The system should take as input any typed sentence and produce as output an audiovisual movie of a face enunciating that sentence. By videorealistic we mean that the final audiovisual output should look as if it were a video camera recording of a talking human subject.
The following sequences are a sample of our results. They are sentences produced by
MikeTalk that were never uttered by the original speaker. Please let us know if you
have problems with the formats. Also, please contact the authors if you would like a short videotape depicting the results of this work. Note: The Quicktime sequences have been compressed with the Cinepak compressor for best playback speed (although this might affect picture quality slightly).
"12345" | SGI, Quicktime, AVI
"678910" | SGI, Quicktime, AVI
"goodmorning sir..how are you feeling today?" | SGI, Quicktime, AVI
"you have received 10 email messages." | SGI, Quicktime, AVI
"your account balance is $2125." | SGI, Quicktime, AVI
"cat, dog, pig, cow, moose, horse, sheep." | SGI, Quicktime, AVI
"welcome to Bell Atlantic's home page." | SGI, Quicktime, AVI
"hello dad, i just wanted to wish you a very happy birthday." | SGI, Quicktime, AVI
"your hotel room has been reserved. thank you for staying at Sheraton." | SGI, Quicktime, AVI
"ask not what your country can do for you, ask what you can do for your country." | SGI, Quicktime, AVI
"hello kids, our lesson for today will be about how to add two fractions." | SGI, Quicktime, AVI
"my name is mike jones." | SGI, Quicktime, AVI
"please press the button on your left." | SGI, Quicktime, AVI
"i have to say that i think that OJ Simpson killed his wife." | SGI, Quicktime, AVI