-
Incorporating Content Structure into Text Analysis Applications
Christina Sauper, Aria Haghighi and Regina Barzilay
To appear in proceedings of EMNLP 2010
-
Simple Type-Level Unsupervised POS Tagging
Yoong Keok Lee, Aria Haghighi and Regina Barzilay
To appear in proceedings of EMNLP 2010
-
@InProceedings{haghighi-klein:2010:Short,
author = {Haghighi, Aria and Klein, Dan},
title = {An Entity-Level Approach to Information Extraction},
booktitle = {Proceedings of the ACL 2010 Conference Short Papers},
month = {July},
year = {2010},
address = {Uppsala, Sweden},
publisher = {Association for Computational Linguistics},
pages = {291--295},
url = {http://www.aclweb.org/anthology/P10-2054}
}
We present a generative model of template-filling in which coreference resolution and role assignment are jointly
determined. Underlying template roles first generate abstract entities, which in turn generate concrete textual mentions.
On the standard corporate acquisitions dataset, joint resolution in our entity-level model reduces error over a
mention-level discriminative approach by up to 20%.
An Entity-Level Approach to Information Extraction
Aria Haghighi and Dan Klein
In proceedings of ACL 2010
[abstract]
[paper]
[slides]
[bib]
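The joint flavor of this model can be pictured in a few lines: instead of labeling each mention independently, pick the one-to-one role-to-entity assignment with the best total score. Everything below (role cue sets, entity contexts, the scoring function) is invented for illustration; the actual model is generative and couples role assignment with coreference resolution.

import itertools

# Toy acquisition template. Role cues and entity mention contexts are
# hypothetical; the paper learns role distributions generatively.
ROLE_CUES = {
    "PURCHASER": {"acquired", "announced"},
    "ACQUIRED": {"shares", "sold"},
}
ENTITY_CTX = {
    "Google": {"acquired", "announced"},
    "YouTube": {"shares", "sold"},
}

def fit(entity, role):
    # How well an entity's mention contexts match a role's cues.
    return len(ENTITY_CTX[entity] & ROLE_CUES[role])

# Joint assignment: choose the one-to-one role -> entity mapping with
# the best total score, instead of classifying each mention in isolation.
roles = list(ROLE_CUES)
best = max(itertools.permutations(ENTITY_CTX),
           key=lambda ents: sum(fit(e, r) for r, e in zip(roles, ents)))
print(dict(zip(roles, best)))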
-
@InProceedings{haghighi-klein:2010:NAACLHLT,
author = {Haghighi, Aria and Klein, Dan},
title = {Coreference Resolution in a Modular, Entity-Centered Model},
booktitle = {Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association
for Computational Linguistics},
month = {June},
year = {2010},
address = {Los Angeles, California},
publisher = {Association for Computational Linguistics},
pages = {385--393},
url = {http://www.aclweb.org/anthology/N10-1061}
}
Coreference resolution is governed by syntactic, semantic, and discourse constraints. We present a generative,
model-based approach in which each of these factors is modularly encapsulated and learned in a primarily unsupervised
manner. Our semantic representation first hypothesizes an underlying set of latent entity types, which
generate specific entities that in turn render individual mentions. By sharing lexical statistics at the level of
abstract entity types, our model is able to substantially reduce semantic compatibility errors, resulting in the best
results to date on the complete end-to-end coreference task.
Coreference Resolution in a Modular, Entity-Centered Model
Aria Haghighi and Dan Klein
In proceedings of HLT-NAACL 2010 [Best Paper Award]
[abstract]
[paper]
[slides]
[bib]
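The type-entity-mention hierarchy described above is easy to picture as a three-stage sampler. A minimal sketch with an invented vocabulary; the real model attaches learned distributions, plus syntactic and discourse modules, at each stage:

import random

random.seed(0)

# Invented toy lexicon: each abstract entity type shares lexical
# statistics over proper names and nominal/pronominal forms.
TYPE_VOCAB = {
    "PERSON": ["Obama", "Smith", "president", "he"],
    "ORG":    ["Google", "IBM", "company", "it"],
}

def sample_entity(etype):
    # An entity specializes its type; here it just fixes one proper
    # name drawn from the type's name list.
    return {"type": etype, "head": random.choice(TYPE_VOCAB[etype][:2])}

def sample_mention(entity):
    # A mention renders its entity either by the proper name or by a
    # nominal/pronominal form shared across the whole type.
    if random.random() < 0.5:
        return entity["head"]
    return random.choice(TYPE_VOCAB[entity["type"]][2:])

for etype in ("PERSON", "ORG"):
    e = sample_entity(etype)
    print(etype, e["head"], "->", [sample_mention(e) for _ in range(3)])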
-
@InProceedings{haghighi-vanderwende:2009:NAACLHLT09,
author = {Haghighi, Aria and Vanderwende, Lucy},
title = {Exploring Content Models for Multi-Document Summarization},
booktitle = {Proceedings of Human Language Technologies:
The 2009 Annual Conference of the North American Chapter
of the Association for Computational Linguistics},
month = {June},
year = {2009},
address = {Boulder, Colorado},
publisher = {Association for Computational Linguistics},
pages = {362--370},
url = {http://www.aclweb.org/anthology/N/N09/N09-1041}
}
We present an exploration of generative probabilistic models for multi-document summarization. Beginning with a simple word frequency based model (Nenkova and Vanderwende, 2005), we construct a sequence of models, each injecting more structure into the representation of document set content and exhibiting ROUGE gains along the way. Our final model, HIERSUM, utilizes a hierarchical LDA-style model (Blei et al., 2004) to represent content specificity as a hierarchy of topic vocabulary distributions. At the task of producing generic DUC-style summaries, HIERSUM yields state-of-the-art ROUGE performance and in pairwise user evaluation strongly outperforms Toutanova et al. (2007)'s state-of-the-art discriminative system. We also explore HIERSUM's capacity to produce multiple 'topical summaries' in order to facilitate content discovery and navigation.
Exploring Content Models for Multi-Document Summarization
Aria Haghighi and Lucy Vanderwende
In proceedings of HLT-NAACL 2009
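The "simple word frequency based model" this line of work begins from (Nenkova and Vanderwende, 2005) is the easiest piece to sketch: score sentences by the average unigram probability of their words, greedily extract the best one, then square the probabilities of the words just used so later picks stay diverse. A minimal sketch of that baseline; HIERSUM itself replaces the flat unigram distribution with a learned topic hierarchy:

from collections import Counter

def sumbasic(sentences, max_sentences=2):
    # Unigram probabilities over the whole document set.
    words = [w for s in sentences for w in s.lower().split()]
    prob = Counter(words)
    total = sum(prob.values())
    prob = {w: c / total for w, c in prob.items()}

    summary, pool = [], list(sentences)
    while pool and len(summary) < max_sentences:
        # Score each sentence by the mean probability of its words.
        def score(s):
            toks = s.lower().split()
            return sum(prob[w] for w in toks) / len(toks)
        best = max(pool, key=score)
        summary.append(best)
        pool.remove(best)
        # Squaring used words discourages redundancy in later picks.
        for w in set(best.lower().split()):
            prob[w] = prob[w] ** 2
    return summary

docs = [
    "the storm hit the coast on monday",
    "officials said the storm caused flooding",
    "a concert was held downtown",
]
print(sumbasic(docs))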
-
@InProceedings{haghighi-EtAl:2009:ACLIJCNLP,
author = {Haghighi, Aria and Blitzer, John and DeNero, John and Klein, Dan},
title = {Better Word Alignments with Supervised ITG Models},
booktitle = {Proceedings of the Joint Conference of the 47th Annual Meeting
of the ACL and the 4th International Joint Conference on
Natural Language Processing of the AFNLP},
month = {August},
year = {2009},
address = {Suntec, Singapore},
publisher = {Association for Computational Linguistics},
pages = {923--931},
url = {http://www.aclweb.org/anthology/P/P09/P09-1104}
}
This work investigates supervised word alignment methods that exploit inversion transduction grammar (ITG) constraints. We consider maximum margin and conditional likelihood objectives, including the presentation of a new normal form grammar for canonicalizing derivations. Even for non-ITG sentence pairs, we show that it is possible to learn ITG alignment models by simple relaxations of structured discriminative learning objectives. For efficiency, we describe a set of pruning techniques that together allow us to align sentences two orders of magnitude faster than naive bitext CKY parsing. Finally, we introduce many-to-one block alignment features, which significantly improve our ITG models. Altogether, our method results in the best reported AER numbers for Chinese-English and a performance improvement of 1.1 BLEU over GIZA++ alignments.
Better Word Alignments with Supervised ITG Models
Aria Haghighi, John Blitzer, John DeNero, and Dan Klein
In proceedings of ACL-IJCNLP 2009
[abstract]
[paper]
[slides]
[bib]
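The ITG constraint itself reduces to a simple recurrence: a span pair is either a single lexical link, or a split whose halves combine in the same (straight) or opposite (inverted) order. A bare-bones one-to-one scorer with invented lexical scores, omitting null alignments, blocks, the normal-form grammar, and all of the paper's pruning:

from functools import lru_cache

E = "red house".split()
F = "maison rouge".split()  # inverted order relative to English
LEX = {("red", "rouge"): 1.0, ("house", "maison"): 1.0}  # toy scores

@lru_cache(maxsize=None)
def best(i, j, k, l):
    # Best score aligning E[i:j] to F[k:l] under ITG: either a single
    # lexical link, or a split combined straight or inverted.
    if j - i == 1 and l - k == 1:
        return LEX.get((E[i], F[k]), -1.0)
    score = float("-inf")
    for s in range(i + 1, j):
        for t in range(k + 1, l):
            straight = best(i, s, k, t) + best(s, j, t, l)
            inverted = best(i, s, t, l) + best(s, j, k, t)
            score = max(score, straight, inverted)
    return score

# The inverted combination is what lets "red house" align to "maison rouge".
print(best(0, len(E), 0, len(F)))  # 2.0, via an inverted rule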
-
@InProceedings{haghighi-klein:2009:EMNLP,
author = {Haghighi, Aria and Klein, Dan},
title = {Simple Coreference Resolution with Rich Syntactic and Semantic Features},
booktitle = {Proceedings of the 2009 Conference on
Empirical Methods in Natural Language Processing},
month = {August},
year = {2009},
address = {Singapore},
publisher = {Association for Computational Linguistics},
pages = {1152--1161},
url = {http://www.aclweb.org/anthology/D/D09/D09-1120}
}
Coreference systems are driven by syntactic, semantic, and discourse constraints. We present a simple approach which completely modularizes these three aspects. In contrast to much current work, which focuses on learning and on the discourse component, our system is deterministic and is driven entirely by syntactic and semantic compatibility as learned from a large, unlabeled corpus. Despite its simplicity and discourse naivete, our system substantially outperforms all unsupervised systems and most supervised ones. Primary contributions include (1) the presentation of a simple-to-reproduce, high-performing baseline and (2) the demonstration that most remaining errors can be attributed to syntactic and semantic factors external to the coreference phenomenon (and perhaps best addressed by non-coreference systems).
Simple Coreference Resolution with Rich Syntactic and Semantic Features
Aria Haghighi and Dan Klein
In proceedings of EMNLP 2009
[abstract]
[paper]
[bib]
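The deterministic core of such a system can be caricatured briefly: scan mentions left to right and link each to the closest preceding mention that passes a syntactic agreement check and a semantic compatibility check. In this toy sketch the agreement and semantic-class features are hand-specified stand-ins for the paper's rich parsed features and corpus-mined compatibility statistics:

# Each mention: (head word, number, semantic class). In the paper the
# semantic classes come from a large unlabeled corpus; here they are given.
MENTIONS = [
    ("Obama", "sg", "person"),
    ("company", "sg", "org"),
    ("president", "sg", "person"),
    ("he", "sg", "person"),
]

def compatible(m1, m2):
    # Syntactic check: number agreement. Semantic check: same class.
    return m1[1] == m2[1] and m1[2] == m2[2]

def resolve(mentions):
    links = {}
    for i, m in enumerate(mentions):
        # Link to the closest compatible antecedent, if any exists.
        for j in range(i - 1, -1, -1):
            if compatible(m, mentions[j]):
                links[i] = j
                break
    return links

for i, j in resolve(MENTIONS).items():
    print(MENTIONS[i][0], "->", MENTIONS[j][0])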
-
We present a method for learning bilingual translation lexicons from monolingual corpora. Word types in each language are characterized by purely monolingual features, such as context counts and orthographic substrings. Translations are induced using a generative model based on canonical correlation analysis, which explains the monolingual lexicons in terms of latent matchings. We show that high-precision lexicons can be learned in a variety of language pairs and from a range of corpus types.
@InProceedings{haghighi-EtAl:2008:ACLMain,
author = {Haghighi, Aria and Liang, Percy and Berg-Kirkpatrick, Taylor and Klein, Dan},
title = {Learning Bilingual Lexicons from Monolingual Corpora},
booktitle = {Proceedings of ACL-08: HLT},
month = {June},
year = {2008},
address = {Columbus, Ohio},
publisher = {Association for Computational Linguistics},
pages = {771--779},
url = {http://www.aclweb.org/anthology/P/P08/P08-1088}
}
Learning Bilingual Lexicons from Monolingual Corpora
Aria Haghighi, Percy Liang, Taylor Berg-Kirkpatrick, and Dan Klein
In proceedings of ACL 2008
[abstract]
[paper]
[slides]
[bib]
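A rough feel for the pipeline: featurize words monolingually, fit CCA on a small seed lexicon, and pair the remaining words by similarity in the shared latent space. The sketch below leans on scikit-learn and a hand-picked seed list, whereas the paper's model learns the matching itself via EM; treat the output as illustrative only:

from sklearn.cross_decomposition import CCA
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical seed lexicon of known translation pairs; the paper
# induces the lexicon rather than assuming one.
seed = [("nation", "nacion"), ("action", "accion"), ("family", "familia"),
        ("music", "musica"), ("number", "numero"), ("history", "historia")]
cand_en = ["station", "memory"]    # source words we want to translate
cand_es = ["memoria", "estacion"]  # candidate target words

# Purely monolingual features: character n-gram counts per word.
vec_en = CountVectorizer(analyzer="char_wb", ngram_range=(2, 3))
vec_es = CountVectorizer(analyzer="char_wb", ngram_range=(2, 3))
X = vec_en.fit_transform([e for e, _ in seed] + cand_en).toarray()
Y = vec_es.fit_transform([s for _, s in seed] + cand_es).toarray()

# Fit CCA on the seed pairs, project the candidates into the shared
# latent space, and match them by cosine similarity there.
cca = CCA(n_components=2).fit(X[:len(seed)], Y[:len(seed)])
Xl, Yl = cca.transform(X[len(seed):], Y[len(seed):])
sim = cosine_similarity(Xl, Yl)
for i, en in enumerate(cand_en):
    # Output on data this small is illustrative, not reliable.
    print(en, "->", cand_es[int(sim[i].argmax())])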
-
The intersection of tree transducer-based translation models
with n-gram language models results in huge dynamic
programs for machine translation decoding. We propose a
multipass, coarse-to-fine approach in which the language
model complexity is incrementally introduced. In contrast
to previous *order-based* bigram-to-trigram approaches,
we focus on *encoding-based* methods, which use a
clustered encoding of the target language. Across various
hierarchical encoding schemes and for multiple language
pairs, we show speed-ups of up to 50 times over single-pass
decoding while improving BLEU score. Moreover, our entire
decoding cascade for trigram language models is faster than
the corresponding bigram pass alone of a bigram-to-trigram
decoder.
@InProceedings{petrov-haghighi-klein:2008:EMNLP,
author = {Petrov, Slav and Haghighi, Aria and Klein, Dan},
title = {Coarse-to-Fine Syntactic Machine Translation using Language Projections},
booktitle = {Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing},
month = {October},
year = {2008},
address = {Honolulu, Hawaii},
publisher = {Association for Computational Linguistics},
pages = {108--116},
url = {http://www.aclweb.org/anthology/D08-1012}
}
Coarse-to-Fine Syntactic Machine Translation using Language Projections
Slav Petrov, Aria Haghighi and Dan Klein
In proceedings of EMNLP 2008
[abstract]
[paper]
[slides]
[bib]
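Stripped to its essentials, the encoding-based idea is a prune-then-rescore loop: a cheap language model over word clusters filters the candidate space, and only survivors pay for the expensive word-level model. The cluster map and scoring functions below are invented stand-ins, and real decoders interleave the passes with the dynamic program rather than rescoring whole strings:

# Hypothetical clustered encoding of the target vocabulary; the paper
# learns hierarchies of such encodings.
CLUSTER = {"the": "C0", "a": "C0", "cat": "C1", "dog": "C1",
           "sat": "C2", "ran": "C2", "quickly": "C3"}

def coarse_score(sent):
    # Cheap bigram score over clusters; a stand-in for a trained
    # cluster-level language model.
    toks = [CLUSTER[w] for w in sent.split()]
    return sum(1.0 for a, b in zip(toks, toks[1:]) if a != b)

GOOD_TRIGRAMS = {("the", "cat", "sat"), ("a", "dog", "ran"),
                 ("dog", "ran", "quickly")}

def fine_score(sent):
    # Expensive word-level trigram score; only survivors pay for it.
    toks = sent.split()
    return sum(1.0 for tri in zip(toks, toks[1:], toks[2:])
               if tri in GOOD_TRIGRAMS)

candidates = ["the cat sat", "cat the sat", "a dog ran quickly", "dog dog ran"]

# Pass 1: prune the space with the coarse cluster LM.
survivors = sorted(candidates, key=coarse_score, reverse=True)[:2]
# Pass 2: rescore only the survivors with the fine trigram LM.
print(max(survivors, key=fine_score))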
-
In EM and related algorithms, E-step computations distribute easily, because data items are independent given parameters. For very large data sets, however, even storing all of the parameters in a single node for the M-step can be impractical. We present a framework that fully distributes the entire EM procedure. Each node interacts only with parameters relevant to its data, sending messages to other nodes along a junction-tree topology. We demonstrate improvements over a MapReduce topology, on two tasks: word alignment and topic modeling.
@inproceedings{wolfe+haghighi+klein:2008a,
Author = {Jason Wolfe and Aria Haghighi and Dan Klein},
Booktitle = {ICML},
Date-Added = {2008-05-07 15:08:27 -0700},
Date-Modified = {2008-12-06 17:40:53 -0800},
Title = {Fully Distributed EM for Very Large Datasets},
Year = {2008}}
Fully Distributed EM for Very Large Datasets
Jason Wolfe, Aria Haghighi and Dan Klein
In proceedings of ICML 2008
[abstract]
[paper]
[slides]
[bib]
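The key observation, that a node only ever needs the parameters its own data touches, is easy to see in IBM-Model-1-style EM for word alignment. The single-process sketch below simulates the sharding; the junction-tree message passing is elided, with a central combination step standing in for it:

from collections import defaultdict

# Two shards of toy bitext. A node holding one shard never needs
# translation parameters for word pairs it has not seen.
SHARDS = [
    [("the house", "la casa"), ("the book", "el libro")],
    [("a house", "una casa"), ("a book", "un libro")],
]

t = defaultdict(lambda: 1.0)  # t(f|e), uniform (unnormalized) init

def e_step(shard, t):
    # Expected counts for this shard only: the node's relevant slice.
    counts = defaultdict(float)
    for en, fr in shard:
        es, fs = en.split(), fr.split()
        for f in fs:
            z = sum(t[(f, e)] for e in es)
            for e in es:
                counts[(f, e)] += t[(f, e)] / z
    return counts

for _ in range(10):
    # "Distributed" E-step: each node computes its local counts.
    local = [e_step(shard, t) for shard in SHARDS]
    # Combine counts for shared parameters. In the paper this happens
    # via messages along a junction tree, not at a central node.
    total = defaultdict(float)
    for counts in local:
        for key, v in counts.items():
            total[key] += v
    # M-step: renormalize t(f|e) for each English word e.
    norm = defaultdict(float)
    for (f, e), v in total.items():
        norm[e] += v
    t = defaultdict(lambda: 1.0,
                    {(f, e): v / norm[e] for (f, e), v in total.items()})

print(round(t[("casa", "house")], 2), round(t[("libro", "book")], 2))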
-
A Global Joint Model for Semantic Role Labeling (2008)
Kristina Toutanova, Aria Haghighi, and Christopher D. Manning
Computational Linguistics. Special Issue on Semantic Role Labeling
-
We present an unsupervised, nonparametric Bayesian approach to coreference resolution which models both global entity identity across a corpus as well as the sequential anaphoric structure within each document. While most existing coreference work is driven by pairwise decisions, our model is fully generative, producing each mention from a combination of global entity properties and local attentional state. Despite being unsupervised, our system achieves a 70.3 MUC F1 measure on the MUC-6 test set, broadly in the range of some recent supervised results.
@InProceedings{haghighi-klein:2007:ACLMain,
author = {Haghighi, Aria and Klein, Dan},
title = {Unsupervised Coreference Resolution in a Nonparametric Bayesian Model},
booktitle = {Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics},
month = {June},
year = {2007},
address = {Prague, Czech Republic},
publisher = {Association for Computational Linguistics},
pages = {848--855},
url = {http://www.aclweb.org/anthology/P07-1107}
}
Unsupervised Coreference Resolution in a Nonparametric Bayesian Model
Aria Haghighi and Dan Klein
In proceedings of ACL 2007
[abstract]
[paper]
[slides]
[bib]
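The nonparametric ingredient is a Dirichlet-process-style prior: each new mention joins an existing entity with probability proportional to that entity's size, or starts a fresh entity with probability proportional to a concentration parameter. A minimal Chinese-restaurant-process sketch, with a crude head-word compatibility check standing in for the full entity and attentional model:

import random

random.seed(1)
ALPHA = 1.0  # concentration: the prior's appetite for new entities

def crp_assign(heads):
    entities = []  # each entity is the list of its mentions so far
    for head in heads:
        # Existing entities weighted by size, gated by a crude
        # compatibility check (pronouns may join anything here).
        weights = [len(e) * (1.0 if head in e or head == "he" else 1e-6)
                   for e in entities]
        weights.append(ALPHA)  # weight for starting a new entity
        choice = random.choices(range(len(weights)), weights=weights)[0]
        if choice == len(entities):
            entities.append([head])
        else:
            entities[choice].append(head)
    return entities

print(crp_assign(["Bush", "Bush", "he", "Gore", "he", "Gore"]))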
-
We present a novel method for creating A∗ estimates for structured search problems. In our approach, we project a complex model onto multiple simpler models for which exact inference is efficient. We use an optimization framework to estimate parameters for these projections in a way which bounds the true costs. Similar to Klein and Manning (2003), we then combine completion estimates from the simpler models to guide search in the original complex model. We apply our approach to bitext parsing and lexicalized parsing, demonstrating its effectiveness in these domains.
@InProceedings{haghighi-denero-klein:2007:main,
author = {Haghighi, Aria and DeNero, John and Klein, Dan},
title = {Approximate Factoring for A* Search},
booktitle = {Human Language Technologies 2007: The Conference of the North American Chapter
of the Association for Computational Linguistics; Proceedings of the Main Conference},
month = {April},
year = {2007},
address = {Rochester, New York},
publisher = {Association for Computational Linguistics},
pages = {412--419},
url = {http://www.aclweb.org/anthology/N/N07/N07-1052}
}
Approximate Factoring for A* Search
Aria Haghighi, John DeNero, and Dan Klein
In proceedings of HLT-NAACL 2007
[abstract]
[paper]
[slides]
[bib]
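The construction is easiest to see on a product search space: if every step's true cost is at least the sum of its costs under the projections, then summing each projection's exact remaining cost gives an admissible A* heuristic. A self-contained sketch with invented graphs and costs:

import heapq

# Two "projection" graphs; the complex model is their product.
# Each maps node -> list of (neighbor, edge cost).
A = {0: [(1, 2.0)], 1: [(2, 2.0)], 2: []}
B = {0: [(1, 1.0)], 1: [(2, 1.0)], 2: []}

def exact_to_goal(graph, goal):
    # Reverse Dijkstra: exact completion cost within one projection.
    rev = {n: [] for n in graph}
    for u, edges in graph.items():
        for v, c in edges:
            rev[v].append((u, c))
    dist, pq = {goal: 0.0}, [(0.0, goal)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, c in rev[u]:
            if d + c < dist.get(v, float("inf")):
                dist[v] = d + c
                heapq.heappush(pq, (d + c, v))
    return dist

hA, hB = exact_to_goal(A, 2), exact_to_goal(B, 2)

def true_cost(ca, cb):
    # True product cost upper-bounds the sum of projected costs,
    # which keeps the factored heuristic admissible.
    return ca + cb + 0.5

def astar(start, goal):
    h = lambda s: hA[s[0]] + hB[s[1]]  # summed completion estimates
    pq, seen = [(h(start), 0.0, start)], set()
    while pq:
        f, g, state = heapq.heappop(pq)
        if state == goal:
            return g
        if state in seen:
            continue
        seen.add(state)
        a, b = state
        # Product move: advance both projections one step in lockstep.
        for a2, ca in A[a]:
            for b2, cb in B[b]:
                g2 = g + true_cost(ca, cb)
                heapq.heappush(pq, (g2 + h((a2, b2)), g2, (a2, b2)))
    return None

print(astar((0, 0), (2, 2)))  # 7.0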
-
A* Search via Approximate Factoring
Aria Haghighi, John DeNero, and Dan Klein
In proceedings of AAAI 2007
-
We investigate prototype-driven learning for primarily unsupervised sequence modeling. Prior knowledge is specified declaratively, by providing a few canonical examples of each target annotation label. This sparse prototype information is then propagated across a corpus using distributional similarity features in a log-linear generative model. On part-of-speech induction in English and Chinese, as well as an information extraction task, prototype features provide substantial error rate reductions over competitive baselines and outperform previous work. For example, we can achieve an English part-of-speech tagging accuracy of 80.5% using only three examples of each tag and no dictionary constraints. We also compare to semi-supervised learning and discuss the system's error trends.
@InProceedings{haghighi-klein:2006:HLT-NAACL06-Main,
author = {Haghighi, Aria and Klein, Dan},
title = {Prototype-Driven Learning for Sequence Models},
booktitle = {Proceedings of the Human Language Technology Conference of the NAACL, Main Conference},
month = {June},
year = {2006},
address = {New York City, USA},
publisher = {Association for Computational Linguistics},
pages = {320--327},
url = {http://www.aclweb.org/anthology/N/N06/N06-1041}
}
Prototype-driven Learning for Sequence Models
Aria Haghighi and Dan Klein
In proceedings of HLT-NAACL 2006 [Best Student Paper Award]
[abstract]
[paper]
[slides]
[bib]
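The propagation step can be sketched directly: build distributional context vectors from a corpus, then give each word the label of its most similar prototype. In the paper these prototype links enter a log-linear generative model as features rather than hard assignments; the corpus and prototype lists below are invented:

from collections import Counter, defaultdict
import math

corpus = ("the cat sat on the mat . a dog ran to the park . "
          "the dog sat on a mat .").split()

# Declarative prior: a few canonical prototypes per tag (invented).
PROTOTYPES = {"DET": ["the"], "NOUN": ["cat", "park"], "VERB": ["sat"]}

# Distributional signature: counts of immediate left/right neighbors.
ctx = defaultdict(Counter)
for i, w in enumerate(corpus):
    if i > 0:
        ctx[w]["L=" + corpus[i - 1]] += 1
    if i + 1 < len(corpus):
        ctx[w]["R=" + corpus[i + 1]] += 1

def cosine(c1, c2):
    dot = sum(v * c2[k] for k, v in c1.items())
    n1 = math.sqrt(sum(v * v for v in c1.values()))
    n2 = math.sqrt(sum(v * v for v in c2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

# Propagate: each word takes the tag of its most similar prototype.
for w in sorted(set(corpus) - {"."}):
    tag, s = max(((t, cosine(ctx[w], ctx[p]))
                  for t, ps in PROTOTYPES.items() for p in ps),
                 key=lambda x: x[1])
    print(w, "->", tag, round(s, 2))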
-
We investigate prototype-driven learning for primarily unsupervised grammar induction. Prior knowledge is specified declaratively, by providing a few canonical examples of each target phrase type. This sparse prototype information is then propagated across a corpus using distributional similarity features, which augment an otherwise standard PCFG model. We show that distributional features are effective at distinguishing bracket labels, but not determining bracket locations. To improve the quality of the induced trees, we combine our PCFG induction with the CCM model of Klein and Manning (2002), which has complementary strengths: it identifies brackets but does not label them. Using only a handful of prototypes, we show substantial improvements over naive PCFG induction for English and Chinese grammar induction.
@InProceedings{haghighi-klein:2006:COLACL,
author = {Haghighi, Aria and Klein, Dan},
title = {Prototype-Driven Grammar Induction},
booktitle = {Proceedings of the 21st International Conference on
Computational Linguistics and 44th Annual Meeting
of the Association for Computational Linguistics},
month = {July},
year = {2006},
address = {Sydney, Australia},
publisher = {Association for Computational Linguistics},
pages = {881--888},
url = {http://www.aclweb.org/anthology/P/P06/P06-1111}
}
Prototype-driven Grammar Induction
Aria Haghighi and Dan Klein
In proceedings of COLING-ACL 2006
[abstract]
[paper]
[slides]
[bib]
-
Robust Textual Inference via Graph Matching (2005)
Aria D. Haghighi, Andrew Y. Ng, Christopher D. Manning
In proceedings of EMNLP 2005
-
Robust Textual Inference Using Diverse Knowledge Sources
Rajat Raina, Aria Haghighi, Christopher Cox, Jenny Finkel, Jeff Michels,
Kristina Toutanova, Bill MacCartney, Marie-Catherine de Marneffe,
Christopher D. Manning, and Andrew Y. Ng
In proceedings of PASCAL Challenge Workshop in Recognizing Textual Entailment 2005
-
A Joint Model for Semantic Role Labeling (2005)
Aria Haghighi, Kristina Toutanova, Chris Manning
In Proceedings of CoNLL-2005: Shared Task
-
Joint Learning Improves Semantic Role Labeling (2005)
Kristina Toutanova, Aria Haghighi, and Chris Manning
Proceedings of the 43rd Annual Meeting of the ACL 2005.