David Sontag's Home Page
E-mail: dsontag {@ | at} mit.edu
Clinical machine learning group website
I am a Professor of Electrical Engineering and Computer Science at MIT, part of the Institute for Medical Engineering & Science, the Computer Science and Artificial Intelligence Laboratory, and the J-Clinic for Machine Learning in Health.
My research focuses on advancing machine learning and artificial intelligence, and using these to transform health care. 
Previously, I was an Assistant Professor of Computer Science and Data Science at New York University.
News
- I am on partial leave from MIT for 2025 and am CEO of Layer Health, which I co-founded with several former MIT students.
  
 - Our MIT Machine Learning for Healthcare class is available on MIT OpenCourseWare (all videos).
 
Teaching
Spring '17, '19, '20, '21, '22, '25: Machine Learning for Healthcare (6.7930, HST.956)
Fall '20, '21, '22: Introduction to Machine Learning (6.036)
Fall '17, '18, '19: Machine Learning (6.867)
Fall 2016: Inference and Representation (DS-GA-1005 and CSCI-GA.2569)
Spring 2016: Introduction to Machine Learning (CSCI-UA.0480-007)
Selected papers:
- D. Choo, C. Squires, A. Bhattacharyya, D. Sontag. Probably Approximately Correct High-Dimensional Causal Effect Estimation Given a Valid Adjustment Set. 4th Conference on Causal Learning and Reasoning (CLeaR), 2025.
 - H. Lang, D. Sontag, A. Vijayaraghavan. Theoretical Analysis of Weak-to-Strong Generalization, Conference on Neural Information Processing Systems (NeurIPS), 2024.
 - K. Kuang, F. Dean, J. Jedlicki, D. Ouyang, A. Philippakis, D. Sontag, A. Alaa. Med-Real2Sim: Non-Invasive Medical Digital Twins using Physics-Informed Self-Supervised Learning. Conference on Neural Information Processing Systems (NeurIPS), 2024.
 - S. Hegselmann, A. Buendia, H. Lang, M. Agrawal, X. Jiang, D. Sontag. TabLLM: Few-shot Classification of Tabular Data with Large Language Models. 26th International Conference on Artificial Intelligence and Statistics (AISTATS), 2023.
 - M. Agrawal, S. Hegselmann, H. Lang, Y. Kim, D. Sontag. Large Language Models are Few-Shot Clinical Information Extractors. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
 - H. Mozannar, A. Satyanarayan, D. Sontag. Teaching Humans When To Defer to a Classifier via Exemplars. AAAI, 2022.
 - L. Murray, D. Gopinath, M. Agrawal, S. Horng, D. Sontag, D. Karger. MedKnowts: Unified Documentation and Information Retrieval for Electronic Health Records. UIST, 2021. (Video)
 - Z. Hussain, R. Krishnan, D. Sontag. Neural Pharmacodynamic State Space Modeling. ICML, 2021.
 - R. Kodialam, R. Boiarsky, J. Lim, N. Dixit, A. Sai, D. Sontag. Deep Contextual Clinical Prediction with Reverse Distillation. AAAI, 2021.
 
 - M. Oberst, D. Sontag. Counterfactual Off-Policy Evaluation with Gumbel-Max Structural Causal Models, ICML, 2019.
 
 - I. Chen, F. Johansson, D. Sontag. Why Is My Classifier Discriminatory?, NeurIPS, 2018.
 
 - U. Shalit, F. Johansson, D. Sontag. Estimating Individual Treatment Effect: Generalization Bounds and Algorithms. 34th International Conference on Machine Learning (ICML), 2017. [code] [Slides]
 - M. Rotmensch, Y. Halpern, A. Tlimat, S. Horng, D. Sontag. Learning a Health Knowledge Graph from Electronic Medical Records, Nature Scientific Reports, July 2017. Supplementary
 -  R. Krishnan, U. Shalit, D. Sontag. Structured Inference Networks for Nonlinear State Space Models, Thirty-First AAAI Conference on Artificial Intelligence, Feb. 2017. [code] Older version
 - Y. Halpern, S. Horng, Y. Choi, D. Sontag. Electronic Medical Record Phenotyping using the Anchor and Learn Framework. Journal of the American Medical Informatics Association (JAMIA), April 2016. [html] [code] [Slides]
 -  Y. Kim, Y. Jernite, D. Sontag, S. Rush. Character-Aware Neural Language Models, Thirtieth AAAI Conference on Artificial Intelligence, Feb. 2016. [code] [Slides] Video
 - X. Wang, D. Sontag, F. Wang. Unsupervised Learning of Disease Progression Models. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), Aug. 2014. [Slides] BibTex
 - E. Brenner, D. Sontag. SparsityBoost: A New Scoring Function for Learning Bayesian Network Structure. Uncertainty in Artificial Intelligence (UAI) 29, July 2013. BibTex [arXiv]
 - S. Arora, R. Ge, Y. Halpern, D. Mimno, A. Moitra, D. Sontag, Y. Wu, M. Zhu.  A Practical Algorithm for Topic Modeling with Provable Guarantees. 30th International Conference on Machine Learning (ICML), 2013. Supplementary BibTex
 - T. Koo, A. Rush, M. Collins, T. Jaakkola, and D. Sontag. Dual Decomposition for Parsing with Non-Projective Head Automata. Empirical Methods in Natural Language Processing (EMNLP), 2010. Best paper award. BibTex
 - T. Jaakkola, D. Sontag, A. Globerson, M. Meila. Learning Bayesian Network Structure using LP Relaxations. 13th International Conference on Artificial Intelligence and Statistics (AISTATS), 2010. BibTex
 - D. Sontag. Approximate Inference in Graphical Models using LP Relaxations. Ph.D. thesis, Massachusetts Institute of Technology, 2010. George M. Sprowls Award for the best doctoral theses in Computer Science at MIT (2010). BibTex
 - D. Sontag, T. Meltzer, A. Globerson, Y. Weiss, T. Jaakkola. Tightening LP Relaxations for MAP using Message Passing. Uncertainty in Artificial Intelligence (UAI) 24, July 2008. Best paper award. [code] BibTex
 - D. Sontag, T. Jaakkola. New Outer Bounds on the Marginal Polytope. Neural Information Processing Systems (NIPS) 20, Dec. 2007. Outstanding student paper award. Addendum BibTex
 
Code (for latest, see our GitHub repo)
Download Python code for learning topic models (corresponds to ICML '13 paper). See also David Mimno's Mallet-compatible Java implementation.
Download code for learning Bayesian network structure (corresponds to UAI '13 SparsityBoost paper).
Download C++ code for MAP inference in graphical models (corresponds to UAI '12 paper; see readme file).
Low-dimensional embeddings of medical concepts (corresponds to AMIA CRI '16 paper)
DeepDiagnosis from longitudinal clinical data (corresponds to MLHC '16 paper)
omop-learn, Python package for deep learning on longitudinal health data