Prof. Tim Kraska
I am an Associate Professor of Electrical Engineering and Computer Science at MIT, where I am part of the Data Systems Group in CSAIL, and a Director of Applied Science at Amazon Web Services (AWS). I co-direct MIT’s Generative AI Impact Consortium (MGAIC), the Data Systems and AI Lab (DSAIL@CSAIL), and the new Everest@CSAIL initiative. I also co-founded Instancio and Einblick Analytics, both of which were acquired.
Before joining MIT, I was an Assistant Professor at Brown University, and spent time at Google Brain. I am a Sloan Research Fellow and have received several awards, including the VLDB Early Career Research Contribution Award, the Intel Outstanding Researcher Award, the VMware Systems Research Award, Brown University’s Early Career Research Achievement Award, an NSF CAREER Award, and multiple best paper and demo awards at VLDB, SIGMOD, and ICDE.
2026 Recruiting
Our group will have multiple openings for Fall 2026 Ph.D. students and Postdoctoral Fellows.
If you are a prospective Ph.D. applicant interested in our current research directions (see below), please do not email me directly. Instead, submit your application through the MIT EECS graduate admissions portal and list me as a preferred reader. In addition, please mention our group or me by name in your research statement and briefly explain why you are interested in working with us.
If you are a prospective Postdoctoral Fellow, please feel free to email me directly with (1) your CV and (2) a brief description of your prior research and how it aligns with our current research interests.
Current Research
My current research focuses on agentic systems and the use of large language models (LLMs) and artificial intelligence (AI) for data-centric problems and systems building. At MIT, my work centers on the following areas:
- AI for Complex Systems Development: With our D4 project, we explore how AI will transform the design and development of large, complex software systems. The guiding question is: How should we build software in a world where most code is written, or co-written, by AI? This includes rethinking development practices and exploring how to modernize and evolve existing, data-centric software stacks.
- Declarative AI Pipelines: With Palimpzest, we are developing a declarative framework for AI pipelines, particularly for data-intensive applications. We were the first to introduce the idea of semantic operators, which can then be optimized as part of an AI-enabled query plan. PZ is open source and is being tested in various industry projects to quickly extract structured information from unstructured data.
- Data Science Agents: We introduced the first benchmark for data-science agents, called KramaBench, and are now investigating new techniques for building deep research and data-science agents, from long-context optimization to graph-based reasoning and beyond.
My work at Amazon Web Services (AWS) focuses largely on production deployments of AI agents. For example, I lead the science teams behind several components of AgentCore, Q SQL, Bedrock Structured Knowledge Bases, and several 1P agents to be released in the coming months.
In parallel, I continue to advance our long-standing research on Machine Learning for Systems. For example, with BRAD we explore how to virtualize and automatically optimize data infrastructure, applying many of the insights from our previous work on learned indexing, learned scheduling, and query optimization.
Broader Impact
We have a long history of translating academic research into real systems that are widely deployed. For example, at Amazon my team developed Redshift’s AI-driven scaling and optimization capabilities based on MIT’s SageDB project. Similarly, our work on learned multi-dimensional indexing and data layouts directly informed the design of Amazon Redshift’s Multi-Dimensional Data Layouts (MDDL) feature.
The technology underlying Einblick — acquired by Databricks — was similarly rooted in Northstar, a system we created at MIT and Brown. At Einblick, we also developed one of the first natural-language-to-SQL engines for data exploration and an early data science agent — predating today’s AI-driven landscape.
Beyond our direct impact on industry, our work helped establish what is now widely known as Algorithms with Predictions or ML-enhanced algorithms and data structures. For example, since we introduced Learned Indexes in 2018, researchers have developed well over a hundred variants, extended the idea to domains such as sorting, network routing, and genomic search, and adapted it to modern hardware platforms including multi-core CPUs, NVM, and RDMA.
Publications
For recent and older publications see DBLP or Google Scholar.
