Perceptual Awareness for Meeting Analysis




As part of the DARPA CALO project led by SRI, the Vision Interface group at MIT CSAIL has developed a perceptive laptop interface to support meeting understanding and multimodal human-computer interfaces. We are developing algorithms that assess the conversational state of meeting participants, or of users of the CALO environment, from a personalized device such as a laptop or tablet computer. Our ptablet device is purely passive and offers the following cues to conversation or interaction state:


  • presence
  • attention
  • turn-taking
  • agreement and grounding gestures
  • emotion and expression cues
  • visual speech features


Additional information can be found here.



    ptablet device



Meeting analysis Demo

This video shows the full reconstruction (top view) of a 3-person meeting. Red lines indicate the gaze of the different participants.


Face tracking: Demo Demo

These videos show our integrated algorithms for estimating face pose, gestures, and speaking activity.



Speaker detection Demo Demo Demo


These videos show our speaker detection algorithm, which combines our multi-person tracking algorithm with direction-of-arrival (DOA) estimation.
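One way to picture how tracking and DOA can be combined is the minimal sketch below. It is an illustration under stated assumptions, not the method used in the demos: it assumes the tracker reports each person's 2D position in the microphone array's coordinate frame, that a DOA estimate is available as a single angle in radians, and that the speaker is the tracked person whose bearing lies closest to that angle within a tolerance. The names `detect_speaker`, `angular_difference`, and the 15-degree default tolerance are hypothetical.

```python
import math


def angular_difference(a, b):
    """Smallest absolute difference between two angles, in radians."""
    d = (a - b + math.pi) % (2 * math.pi) - math.pi
    return abs(d)


def detect_speaker(doa, people, tolerance=math.radians(15)):
    """Return the id of the tracked person best matching a DOA estimate.

    doa       -- estimated direction of arrival, in radians (assumed input)
    people    -- dict mapping person id to (x, y) position in the
                 microphone array's frame (assumed tracker output)
    tolerance -- maximum allowed angular error; hypothetical default
    Returns None when no tracked person lies within the tolerance.
    """
    best_id, best_err = None, tolerance
    for person_id, (x, y) in people.items():
        bearing = math.atan2(y, x)  # angle of the person as seen from the array
        err = angular_difference(doa, bearing)
        if err <= best_err:
            best_id, best_err = person_id, err
    return best_id


if __name__ == "__main__":
    # Three hypothetical participants seated around the array.
    people = {"A": (1.0, 0.0), "B": (0.0, 1.0), "C": (-1.0, 0.0)}
    print(detect_speaker(math.radians(85), people))
```

Handling the wrap-around at plus or minus 180 degrees in `angular_difference` matters here, because a participant seated directly behind the array would otherwise be missed when the DOA estimate falls just on the other side of the discontinuity.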



  • D. Demirdjian. Automatic Camera Synchronization and Localization using Discourse-based Constraints. Unpublished.
  • P. Barthelmess, E. Kaiser, X. Huang and D. Demirdjian. Distributed pointing for multimodal collaboration over sketched diagrams. In ICMI’05.
  • E. C. Kaiser, D. Demirdjian, A. Gruenstein, X. Li, J. Niekrasz, M. Wesson and S. Kumar. A Multimodal Learning Interface for Sketch, Speak and Point Creation of a Schedule Chart. In Proceedings of ICMI’04 (demonstration), State College, PA. October 2004.
  • D. Demirdjian, K. Wilson, M. Siracusa and T. Darrell. Real-time Audio-Visual Tracking for Meeting Analysis. In Proceedings of ICMI’04 (demonstration), State College, PA. October 2004.
  • K. Wilson, N. Checka, D. Demirdjian and T. Darrell. Audio-Video Array Source Separation for Perceptual User Interfaces. In Proceedings of Workshop on Perceptive User Interfaces, 2001.