Jacob Eisenstein

I'm a Postdoctoral Fellow in the Machine Learning Department at Carnegie Mellon University. I graduated from MIT in 2008, and spent a year as a Beckman Fellow at the University of Illinois.

I work on machine learning approaches to natural language processing, focusing on connections between language and the external world. I'm especially interested in social media, discourse, non-verbal communication and unsupervised learning. In the past, I worked on intelligent user interfaces.


Selected Recent Publications

Reading to Learn: Constructing Features from Semantic Abstracts. Eisenstein, Clarke, Goldwasser and Roth. EMNLP 2009.
Given a machine learning problem, can a system acquire better features by "reading" text written by domain experts? We develop a model that extracts relational features from text, improving learning.
Learning Document-Level Semantic Properties from Free-text Annotations. Branavan, Chen, Eisenstein, and Barzilay. Journal of Artificial Intelligence Research 34, 2009.
Informal "keyphrase" annotations can be used to predict document-level semantics, by modeling the latent annotation paraphrase structure. The resulting system automatically generates pro/con lists from reviews of products and services.
Unsupervised Multilingual Learning for Part-of-Speech Tagging. Naseem, Snyder, Eisenstein and Barzilay. Journal of Artificial Intelligence Research 36, 2009.
Unsupervised part-of-speech tagging works better when applied to multiple languages simultaneously.
Bayesian Unsupervised Topic Segmentation. Eisenstein and Barzilay. EMNLP 2008.
A new method to segment text and speech transcripts into topically coherent units, using both lexical cohesion and cue phrases. This is the first paper to learn cue phrases without supervision, by combining them with cohesion in a generative Bayesian framework.
Gestural Cohesion for Topic Segmentation. Eisenstein, Barzilay, and Davis. ACL 2008.
Coherent discourse topics contain internally consistent gestural forms, paralleling a similar phenomenon in the distribution of lexical items. Automatically extracted gesture features improve unsupervised topic segmentation on dialogues.

Election Prediction

In 2006, I helped lay the groundwork for Nate Silver's media empire by developing a statistical model for election forecasting. It did pretty well.

Contact

Machine Learning Department
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213
jacobe@gmail.com