Multimodal Interaction With Gestures (& Speech)
The main objective of this work is to design a perceptual user interface that provides:
· the full body pose of a user (location and orientation of arms, head, hands)
· the detection of gestures (e.g. waving, pointing)
and to combine tracking/gesture output with additional modalities (e.g. speech) for human-robot or human-computer interaction, such as virtual world navigation and interaction, video games, and interaction with an avatar.
In order to estimate body pose, we develop
The following videos show the integration of our multimodal user
interface with various applications (our system has also been integrat
Multi-modal studio: Demo
An application similar to Bolt’s Put-That-There. It allows a user to create and manipulate various 3D geometric shapes in a virtual world using speech and gesture commands.
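As an illustration of how speech and gesture commands can be combined in a Put-That-There style interface, the sketch below replaces deictic words ("that", "there") with the target of the pointing gesture closest in time. This is a minimal sketch under stated assumptions, not the system shown in the demo: the `Word` and `Pointing` types, the `resolve_deictics` function, the one-second `max_gap` threshold and the target names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    time: float  # seconds at which the word was spoken


@dataclass
class Pointing:
    target: str  # hypothetical label for whatever the pointing gesture hit
    time: float  # seconds at which the gesture was detected


DEICTIC = {"this", "that", "here", "there"}


def resolve_deictics(words, gestures, max_gap=1.0):
    """Replace each deictic word with the target of the nearest-in-time gesture.

    A deictic word is left unchanged if no gesture occurred within
    `max_gap` seconds of it (assumed threshold).
    """
    resolved = []
    for w in words:
        if w.text.lower() in DEICTIC and gestures:
            nearest = min(gestures, key=lambda g: abs(g.time - w.time))
            if abs(nearest.time - w.time) <= max_gap:
                resolved.append(nearest.target)
                continue
        resolved.append(w.text)
    return resolved


# Toy input: "put that there" with two pointing gestures.
words = [Word("put", 0.0), Word("that", 0.4), Word("there", 1.6)]
gestures = [Pointing("red_cube", 0.5), Pointing("table_corner", 1.5)]
command = resolve_deictics(words, gestures)
print(command)  # ['put', 'red_cube', 'table_corner']
```

Matching on timestamps alone is the simplest possible fusion rule; a real interface would likely also weigh speech and gesture recognition confidence.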
Virtual navigation: Demo
Uses body pose estimation to control navigation in a virtual world or video game. In this application, the user can move through the virtual world by moving/tilting their
body or by pointing at a derid
Control of virtual objects (e.g. desktop windows, applications, mouse cursor) and real objects (e.g. lamps, projectors, sound system) using speech and gesture commands. This system
was an outcome of the OXYGEN project and result
Tracking videos (no integration)
User sitting Demo
User standing/turning Demo
approach for tracking consists in fitting a 3D (CAD) model of the person to
track to range data (obtain
recognition is perform
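The idea of fitting a 3D model to range data, mentioned above, can be illustrated with a deliberately simplified sketch. The text here does not spell out the fitting algorithm, so the example uses a translation-only variant of iterative closest point (ICP) as an assumed stand-in; a real body tracker would also estimate rotations and joint angles. The function names and the toy point sets are hypothetical.

```python
import math


def nearest(p, cloud):
    """Return the point of `cloud` closest to `p` (brute-force search)."""
    return min(cloud, key=lambda q: math.dist(p, q))


def fit_translation(model, data, iterations=20, tol=1e-9):
    """Translation-only ICP: find the shift that moves `model` onto `data`.

    Each iteration matches every shifted model point to its nearest data
    point, then updates the shift by the mean residual, which is the
    least-squares translation for those matches.
    """
    offset = [0.0, 0.0, 0.0]
    for _ in range(iterations):
        moved = [tuple(m[i] + offset[i] for i in range(3)) for m in model]
        matches = [nearest(p, data) for p in moved]
        step = [
            sum(q[i] - p[i] for p, q in zip(moved, matches)) / len(model)
            for i in range(3)
        ]
        offset = [offset[i] + step[i] for i in range(3)]
        if max(abs(s) for s in step) < tol:
            break  # converged: matches no longer pull the model anywhere
    return tuple(offset)


# Toy example: the "range data" is the model shifted by a small known amount.
model = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0)]
true_shift = (0.2, -0.1, 0.05)
data = [tuple(m[i] + true_shift[i] for i in range(3)) for m in model]
estimated = fit_translation(model, data)
print([round(x, 3) for x in estimated])
```

Like any nearest-neighbour fitting scheme, this only converges to the right answer when the initial guess is close enough for most matches to be correct.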
D. Demirdjian, L. Taycher, G. Shakhnarovich, K. Grauman and T. Darrell. Avoiding the ‘Streetlight Effect’: Tracking by Exploring Likelihood Modes. In ICCV’05. [PDF]
S. Wang, A. Quattoni, L. Morency, D. Demirdjian, T. Darrell, Hidden Conditional Random Fields for Gesture Recognition, Proce
L. Taycher, G. Shakhnarovich, D. Demirdjian, T. Darrell, Conditional Random People: Tracking Humans with CRFs and Grid Filters, Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2006. [PDF]
D. Demirdjian, T. Ko and T.
Demirdjian. Combining Geometric- and View-Bas
D. Demirdjian and T. Darrell. 3-D Articulat