Welcome to my website!
I am a postdoctoral associate in the database group at CSAIL, MIT.
I work with Prof. Michael Stonebraker and Prof. Samuel Madden.
I completed my Ph.D. in Computer Science at MIT; my Ph.D. advisor was Prof.
My research interests include Database Systems and Computer Architecture, with a focus on transaction processing, software-hardware codesigned systems, and in-cloud databases.
Before joining MIT, I earned my Bachelor of Science (B.S.) in Microelectronics Science and Engineering from the Institute of Microelectronics at Tsinghua University, Beijing, China.
My research activities focus on three areas: (I) transaction processing, (II) software-hardware codesigned systems, and (III) in-cloud databases. Sample projects appear below.
Research Area I: Transaction Processing
Concurrency Control in 1000 Cores
Computer architectures are moving toward many-core machines with dozens or even hundreds of cores on a single chip, a level of parallelism that current database management systems (DBMSs) were not designed for.
We performed an evaluation of concurrency control algorithms for on-line transaction processing (OLTP) workloads on many-core chips. Our evaluation shows that all algorithms fail to scale to this level of parallelism. For each algorithm, we identified artificial and fundamental bottlenecks.
We conclude that rather than pursuing incremental solutions, many-core chips may require a completely redesigned DBMS architecture, built from the ground up and tightly coupled with the hardware.
Our DBMS is open source on GitHub (DBx1000).
[Related Publication: VLDB'14]
TicToc Concurrency Control
TicToc is an optimistic concurrency control (OCC) algorithm that avoids the timestamp allocation bottleneck in traditional schemes. It relies on a novel and provably correct data-driven timestamp management protocol: instead of assigning timestamps to transactions, the protocol assigns read and write timestamps to data items and uses them to lazily compute a valid commit timestamp for each transaction.
TicToc removes the need for centralized timestamp allocation, and commits transactions that would be aborted by
conventional T/O schemes. On a 40-core machine, TicToc outperforms other state-of-the-art concurrency control algorithms
by up to 92% while reducing the transaction abort rate by 3.3x.
[Related Publication: SIGMOD'16]
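The data-driven timestamp idea can be sketched in a few lines. This is a minimal illustrative model, not the actual TicToc implementation: the class and function names are hypothetical, and validation details (lease extension, abort checks) are omitted.

```python
# Illustrative sketch of TicToc-style data-driven timestamps (hypothetical
# names; not the paper's actual code). Each data item carries a write
# timestamp (wts) and a read timestamp (rts) instead of transactions
# being assigned timestamps centrally.

class Item:
    def __init__(self, value, wts=0, rts=0):
        self.value = value
        self.wts = wts   # logical time of the last committed write
        self.rts = rts   # latest logical time at which this version was read

def commit_timestamp(read_set, write_set):
    """Lazily compute a valid commit timestamp for a transaction:
    each read must commit no earlier than the write it observed (wts),
    and each write must commit after the old version's last reader (rts)."""
    ts = 0
    for item in read_set:
        ts = max(ts, item.wts)       # read constraint: ts >= wts
    for item in write_set:
        ts = max(ts, item.rts + 1)   # write constraint: ts > rts
    return ts
```

Because the commit timestamp is derived from the accessed items themselves, no centralized timestamp allocator is needed, which is the bottleneck TicToc removes.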
Research Area II: Software-Hardware Codesigned Systems
IMP: Indirect Memory Prefetcher
Important applications like machine learning, graph analytics, and sparse linear algebra are dominated by irregular memory accesses, which have little temporal or spatial locality and are difficult to prefetch using traditional techniques.
A majority of these irregular accesses come from indirect patterns of the form A[B[i]].
We propose an efficient hardware indirect memory prefetcher (IMP)
to hide memory latency of this access pattern.
We also propose a partial cacheline accessing mechanism to reduce the network and DRAM bandwidth
pressure caused by the lack of spatial locality. Evaluated on seven applications, IMP showed a 56% speedup on average (up to 2.3x) over a baseline streaming prefetcher on a 64-core system.
[Related Publication: MICRO'15]
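The address computation behind indirect prefetching can be modeled in a few lines. This is a toy software sketch under assumed parameters (the real IMP operates in hardware on access streams, and the names, prefetch distance, and degree below are illustrative, not the actual design):

```python
# Toy model of indirect prefetching for the A[B[i]] pattern: once the
# index array B is identified, the addresses of future indirect accesses
# can be computed ahead of the program. Hypothetical names and parameters.

def indirect_prefetch_addrs(base_a, elem_size, b, i, distance=4, degree=2):
    """Return the addresses of A[B[i+distance]] .. A[B[i+distance+degree-1]],
    i.e., the indirect loads expected `distance` iterations ahead of the
    current position i, up to `degree` prefetches per step."""
    addrs = []
    for k in range(distance, distance + degree):
        if i + k < len(b):                      # stay within the index array
            addrs.append(base_a + b[i + k] * elem_size)
    return addrs
```

The key point is that B[i] itself is streamed sequentially (easy to prefetch), and reading a few elements of B ahead of time makes the otherwise-random addresses into A predictable.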
Tardis Cache Coherence Protocol
Tardis is a timestamp-based cache coherence protocol that scales to 1000 cores while maintaining simplicity and good performance. Tardis differs from conventional coherence protocols, which typically use an invalidation mechanism for a write to propagate to shared cached copies. Instead, each read copy acquires a lease, and a write normally happens only after the lease expires. Tardis uses logical rather than physical leases, so a write is not blocked waiting for a lease to expire: it happens immediately by "jumping ahead" in logical time, advancing the timestamps. We have proven the correctness of Tardis and extended it to relaxed consistency models.
[Related Publication: PACT'15]
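The logical-lease mechanism can be sketched concisely. This is a minimal illustrative model, not the actual protocol: the names and the fixed lease length are assumptions, and directory state, messaging, and renewal policy are omitted.

```python
# Minimal sketch of Tardis-style logical leases (hypothetical names).
# Each cache line carries wts (logical time the current version was
# written) and rts (logical end of the read lease on that version).

class Line:
    def __init__(self):
        self.wts = 0   # version write timestamp
        self.rts = 0   # end of the logical lease held by readers

def read(line, now, lease=10):
    """A read extends the lease in logical time; the reader observes the
    version as valid through line.rts without any physical clock."""
    line.rts = max(line.rts, now + lease)
    return line.value if hasattr(line, "value") else None, line.rts

def write(line, now):
    """A write does not wait for leases to expire physically: it jumps
    ahead in logical time past all outstanding leases."""
    line.wts = max(now, line.rts + 1)
    line.rts = line.wts
    return line.wts
```

Because leases are logical, no shared copies need to be invalidated and no writer ever stalls on a timer, which is what lets the protocol stay simple at 1000 cores.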
Research Area III: In-Cloud Databases
Near Cloud Storage Computing
Modern cloud platforms disaggregate computation and storage into separate services. In this project, we explored the idea of using the limited computation inside the simple storage service (S3) offered by AWS to accelerate data analytics. We used the existing S3 Select feature to accelerate not only simple database operators like select and project, but also complex operators like join, group-by, and top-K. We proposed optimization techniques for each individual operator and demonstrated a more than 6x performance improvement on a set of representative queries.
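The division of labor can be illustrated with a toy model. This sketch only captures the concept, not the project's actual implementation or the S3 Select API: the "storage" applies selection and projection before returning rows (as S3 Select does per object), and the client runs the join over the already-reduced data. All names are hypothetical.

```python
# Toy model of near-storage pushdown in the spirit of S3 Select
# (illustrative only; not the real S3 API or the project's code).

def storage_select(table, predicate, columns):
    """Simulate storage-side filtering + projection: only the rows that
    satisfy the predicate, restricted to the requested columns, cross
    the network back to the client."""
    return [{c: row[c] for c in columns} for row in table if predicate(row)]

def client_join(left, right, key):
    """Client-side hash join over the reduced inputs."""
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]
```

Pushing the selection and projection into storage shrinks the data shipped to the compute tier, which is where the speedup on complex operators like join comes from.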