Tommi S. Jaakkola, Ph.D.
Thomas Siebel Professor of Electrical Engineering and Computer Science and the Institute for Data, Systems, and Society

MIT Computer Science and Artificial Intelligence Laboratory
Stata Center, Bldg 32-G470
Cambridge, MA 02139

tommi at csail dot mit dot edu



Some on-going projects

Molecular optimization Drug design relies increasingly on the ability to automatically optimize molecules towards better biochemical or biomedical properties. Our goal in this context is to accelerate and enable inverse molecular design by developing methods that can programmatically transform a precursor molecule into a refined version that satisfies user-specified property characteristics. Technical challenges in this context involve multi-resolution molecular representations and the ability to realize novel molecular structures as predictions.

Interpretable modeling The internal workings of complex, highly adaptive methods are not transparent, and such methods fail to provide any understandable rationale for their decisions or predictions, making them questionable for mission-critical applications and difficult to maintain. We integrate rationale generation (information usage) or functional transparency as part of the learning problem itself, developing adaptive algorithms whose predictions are not only effective but also directly verifiable by domain experts. Beyond verification, our methodology offers new ways to guide and communicate with learning algorithms.

Strategic prediction We explore a new class of structured prediction models where the resolution of the predicted outcome involves significant strategic interactions. Our game-theoretic models map the input context to utilities and interactions, and guide the game dynamics into a near equilibrium. The predicted output from the model is a mixed strategy profile, and each observation is thought of as a sample from this strategy profile. We explore different types of game-theoretic models, associated dynamics, and theoretical guarantees of convergence, identifiability, and generalization.

Medical informatics We develop automated tools to reason about and extract information from medical reports and records. Beyond accuracy, our focus is on developing tools that are interpretable, verifiable, and directly transferable. For example, pathology reports, across organs, involve diverse language for expressing the main result. The challenge is to robustly model this diversity while highlighting the reasoning in an understandable form without losing the capacity to handle the full variety of reports. For transferability, our goal is to ensure that the rules and annotations available for the method in one setting can be easily adapted to work in another without requiring extensive additional annotations and without requiring full data sharing. On the technology side, our approach builds on recent advances in interpretable deep learning, structured language modeling, and multi-cause inference.

Perturbation models Our goal is to develop a new flexible probabilistic modeling paradigm for high dimensional structured prediction problems. The approach is based on the idea of perturbation models and builds on decades of work on structured probability models and structured prediction as well as advances in relaxations of combinatorial optimization problems. Perturbation models, broadly construed, realize flexible probability models by linking latent randomization of parameters or configurations with combinatorial optimization. One of the main advantages of these models is that, despite the complex distributions they represent, and in contrast to typical structured probability models, they are easy to draw unbiased samples from. Randomization in these models is used as the modeling tool, together with the combinatorial structure of the problem embedded in the optimization part. We seek to understand, leverage, extend, and learn unique and powerful properties of these models as well as mitigate their deficiencies.
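
As a rough illustration of the sampling property described above, the following minimal sketch uses the well-known Gumbel-max trick: adding independent Gumbel noise to the score of every configuration and then maximizing yields an exact sample from the corresponding Gibbs distribution. This is only one simple instance of a perturbation model, chosen here for illustration; the toy scores, the function names, and the brute-force maximization over an enumerated list are all assumptions of the sketch. In a structured problem the maximization step would be carried out by a combinatorial optimizer rather than by enumeration.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one standard Gumbel variate via the inverse CDF."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() lies in [0, 1)
        u = rng.random()
    return -math.log(-math.log(u))


def perturb_and_maximize(scores, rng):
    """Perturb each configuration's score with Gumbel noise, then maximize.

    Illustrative only: configurations are enumerated explicitly here.
    """
    perturbed = [s + sample_gumbel(rng) for s in scores]
    return max(range(len(scores)), key=perturbed.__getitem__)


def empirical_distribution(scores, n_samples=20000, seed=0):
    """Estimate the sampling distribution by repeated perturb-and-maximize."""
    rng = random.Random(seed)
    counts = [0] * len(scores)
    for _ in range(n_samples):
        counts[perturb_and_maximize(scores, rng)] += 1
    return [c / n_samples for c in counts]


def gibbs_distribution(scores):
    """Target distribution p(x) proportional to exp(score(x))."""
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    toy_scores = [1.0, 0.0, -1.0, 2.0]  # hypothetical scores for four configurations
    print("empirical:", empirical_distribution(toy_scores))
    print("gibbs:    ", gibbs_distribution(toy_scores))
```

With enough samples the two printed distributions should agree closely, since each draw requires only fresh noise and one maximization rather than a Markov chain.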

Syntactic and semantic parsing The best performing parsers today are typically discriminative in nature, i.e., they are tailored directly to the goal of predicting the correct (dependency) parse given the sentence. A rich set of features (with associated parameters) is introduced into the parsing model in order to tie properties of candidate parse trees to the words (and tags) in the sentence. This explosion of features is necessary to capture inherent linguistic variability but requires estimating a large number of parameters. Moreover, parsing with rich feature sets is also computationally challenging. Our goals in this context include developing parsimonious (e.g., tensor based) parameterizations, low-complexity inference algorithms, and novel semi-supervised approaches towards robust, cross-domain methods for parsing.

Recommender systems Recommender problems are typically formulated in terms of large matrices where the matrix dimensions refer to users and items. Since only limited information is available about each user, strong regularity assumptions are needed about the underlying rating matrix. Viewing recommender problems in terms of matrices is limiting, however, especially when recommendations involve inherent combinatorial constraints or biases as in recommending sets of items such as accessories, keywords, or more structured objects such as sentences. Our work in this context focuses on developing efficient algorithms for combinatorial recommendations, balancing scaling, statistical accuracy, and privacy.
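
To make the standard matrix formulation concrete, here is a minimal sketch of one common regularity assumption, namely that the user-by-item rating matrix is approximately low rank. It fits user and item factors to a handful of observed ratings by stochastic gradient descent. The low-rank assumption, the toy ratings, and all names and hyperparameters are choices made for this illustration; the sketch shows the matrix view that the paragraph above describes as limiting, not the combinatorial methods themselves.

```python
import math
import random


def factorize(ratings, n_users, n_items, rank=2, steps=500,
              lr=0.05, reg=0.01, seed=0):
    """Fit ratings[u][i] ~ dot(U[u], V[i]) on observed (user, item, rating) triples."""
    rng = random.Random(seed)
    U = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    V = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(steps):
        for u, i, r in ratings:
            err = r - predict(U, V, u, i)
            for k in range(rank):
                uk, vk = U[u][k], V[i][k]
                # Gradient step on squared error with L2 regularization.
                U[u][k] += lr * (err * vk - reg * uk)
                V[i][k] += lr * (err * uk - reg * vk)
    return U, V


def predict(U, V, u, i):
    """Predicted rating is the inner product of user and item factors."""
    return sum(a * b for a, b in zip(U[u], V[i]))


def rmse(U, V, ratings):
    """Root mean squared error over the given (user, item, rating) triples."""
    total = sum((r - predict(U, V, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(total / len(ratings))


# Hypothetical observed ratings: users 0-1 prefer items 0-1, users 2-3 prefer items 2-3.
OBSERVED = [
    (0, 0, 5), (0, 1, 5), (0, 2, 1),
    (1, 0, 5), (1, 2, 1), (1, 3, 1),
    (2, 0, 1), (2, 2, 5), (2, 3, 5),
    (3, 1, 1), (3, 2, 5), (3, 3, 5),
]

if __name__ == "__main__":
    U, V = factorize(OBSERVED, n_users=4, n_items=4)
    print("training RMSE:", rmse(U, V, OBSERVED))
    print("unobserved entry (user 0, item 3):", predict(U, V, 0, 3))
```

The model fills in unobserved entries one at a time, which is why recommending sets or structured objects with combinatorial constraints does not fit this formulation naturally.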

Computational biology In computational biology our motivation comes from the need to understand cellular mechanisms responsible for transcriptional control. Accurate predictions of this kind are based on identifying regularities across multiple heterogeneous, incomplete, or fragmented sources of information. Finding such regularities forces us to formulate, manipulate, and learn complex models that entertain a number of alternative hypotheses about observations. We have been developing methods to reveal comprehensive and predictive cis- and trans-regulatory networks.

Information retrieval/extraction We develop automatic on-demand methods for filling multi-way relational tables based on tailored web queries. Our goal is to ascertain whether any selected multi-way relation holds and find evidential support in the form of articles for positive calls. The broader problem involves a novel combination of collaborative prediction, query formulation, and information extraction (IE). We couple query formulation with extraction, tailoring queries towards articles for which extraction succeeds; we explicitly leverage the fact that the queries and extraction tasks are coupled across the multi-way relations. Many pertinent problems fall naturally in our setting. For example, we seek to identify possible adulterations of food products, i.e., when a potentially harmful chemical is added to a food product in the manufacturing process (often willfully for financial gain). The relations sought in this case are between food products and candidate adulterants (in a context), and the task is to find support for possible relations across scientific, news, and social media articles.