In Fall 2014 I will be starting as a graduate student at UC Berkeley working with Ben Recht, broadly on topics related to machine learning.

I was previously a graduate student in the database group at MIT CSAIL, where I earned my master's degree in EECS for my work on Silo. My thesis advisor was Samuel Madden. Before that, I was an undergraduate at UC Berkeley, where I received dual degrees in computer science and mechanical engineering. At Berkeley, I was a member of the RADLab (now AMPLab).

My research interests are all over the place. I have worked on multicore databases, database security, and, more recently, statistical database privacy. This summer, I'm taking a break from all of that to build a fast, open source inference engine for various non-parametric Bayesian models.


Fast Databases with Fast Durability and Recovery through Multicore Parallelism
Wenting Zheng, Stephen Tu, Eddie Kohler, Barbara Liskov
To appear in OSDI 2014.

Anti-Caching: A New Approach to Swapping in Main Memory OLTP Database Systems [PDF]
Justin DeBrabant, Andrew Pavlo, Stephen Tu, Michael Stonebraker, Stan Zdonik
To appear in VLDB 2014.

Speedy Transactions in Multicore In-Memory Databases [PDF] [Slides] [Code]
Stephen Tu, Wenting Zheng, Eddie Kohler, Barbara Liskov, Samuel Madden
SOSP 2013.

Processing Analytical Queries over Encrypted Data [PDF] [Slides] [Code]
Stephen Tu, M. Frans Kaashoek, Samuel Madden, Nickolai Zeldovich
VLDB 2013.

The HipHop Compiler for PHP
Haiping Zhao, Iain Proctor, Minghui Yang, Xin Qi, Mark Williams, Guilherme Ottoni, Charlie Gao, Andrew Paroski, Scott MacVicar, Jason Evans, Stephen Tu
OOPSLA 2012.

The Case for PIQL: A Performance Insightful Query Language [PDF]
Michael Armbrust, Nick Lanham, Stephen Tu, Armando Fox, Michael J. Franklin, David A. Patterson
SoCC 2010.

PIQL: A Performance Insightful Query Language For Interactive Applications [PDF]
Michael Armbrust, Stephen Tu, Armando Fox, Michael J. Franklin, David A. Patterson, Nick Lanham, Beth Trushkowsky, Jesse Trutna
Demo, SIGMOD 2010.


Machine Learning Classification over Encrypted Data [PDF]
Raphael Bost, Raluca Ada Popa, Stephen Tu, Shafi Goldwasser


Fast Transactions for Multicore In-Memory Databases [PDF]
Master's Thesis

Random notes

The Dirichlet-Multinomial and Dirichlet-Categorical models for Bayesian inference [PDF]

[6.885 lecture notes] Introduction to query processing for encrypted databases [PDF]

[6.885 lecture notes] Introduction to differential privacy [PDF]

Differentially private random projections [PDF]
Simple extension of the work of Kenthapadi et al. on differentially private randomized projections.

Derivation of EM for discrete Hidden Markov Models [PDF]

Techniques for Implementing Concurrent Data Structures on Modern Multicore Machines [PDF] [Github]
Hackers at Berkeley (H@B) Workshop

Class Projects

Mino: Data-driven type inference for Python [PDF] [Code]
MIT 6.867 Fall 2012 Final Project.

Intel Transactional Memory Extensions in QEMU [PDF] [Code]
Sebastien Dabdoub and Stephen Tu
MIT 6.828 Fall 2012 Final Project.

Raytracer [Report]
UC Berkeley CS 184 Project.