Xiangyao Yu

Postdoctoral Associate
Database Group
Department of Electrical Engineering and Computer Science
Massachusetts Institute of Technology
32-G930, 32 Vassar St, Cambridge, MA, 02139
Email: yxy AT mit DOT edu
[C.V.] [Google Scholar]

Welcome to my website!

I am a postdoctoral associate in the database group at CSAIL, MIT, where I work with Prof. Michael Stonebraker and Prof. Samuel Madden. I completed my Ph.D. in Computer Science at MIT, advised by Prof. Srinivas Devadas.

My research interests include Database Systems and Computer Architecture, with a focus on transaction processing, software-hardware codesigned systems, and in-cloud databases.

Before joining MIT, I earned my Bachelor of Science (B.S.) in Microelectronics Science and Engineering from the Institute of Microelectronics at Tsinghua University, Beijing, China.

My research focuses on three areas: (I) transaction processing, (II) software-hardware codesigned systems, and (III) in-cloud databases. Sample projects appear below.

Research Area I: Transaction Processing

Research Area II: Software-Hardware Codesigned Systems

Research Area III: In-Cloud Databases

Tardis Cache Coherence Protocol

Tardis is a cache coherence protocol that requires only Θ(log N) storage per cache block (for an N-core processor) while maintaining high performance and simplicity. Tardis is unique in that it enforces the global memory order using physiological time (a combination of physical and logical time) instead of pure physical time. Tardis has no invalidations: each read takes a logical lease on a cache block, and a write jumps (or time travels) to the end of the lease. Tardis eliminates broadcasting and multicasting, yet does not require a synchronized clock. We proved the correctness of Tardis and extended it to relaxed consistency models. We are currently prototyping Tardis on FPGA boards.
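The lease-and-jump idea can be illustrated with a toy Python sketch. This is not the hardware protocol itself, just the timestamp rules for a single cache line; the class names, the fixed lease length, and the single-line model are all illustrative assumptions.

```python
class CacheLine:
    """One cache line with Tardis-style logical timestamps (toy model)."""
    def __init__(self, value):
        self.value = value
        self.wts = 0   # logical time at which the current version was written
        self.rts = 0   # lease end: version is valid through this logical time

LEASE = 10  # illustrative fixed lease length

class Core:
    """A core with a per-core program (logical) timestamp."""
    def __init__(self):
        self.pts = 0

    def load(self, line):
        # A read takes a logical lease on the line; no invalidation is sent.
        self.pts = max(self.pts, line.wts)
        line.rts = max(line.rts, self.pts + LEASE)
        return line.value

    def store(self, line, value):
        # A write "time travels" past the end of any outstanding lease,
        # so readers holding leases still see the old version consistently.
        self.pts = max(self.pts, line.rts + 1)
        line.value = value
        line.wts = line.rts = self.pts
```

A writer never waits for readers to be invalidated; it simply advances its logical clock beyond the lease, which is why no broadcast and no synchronized physical clock are needed.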

TicToc Concurrency Control Algorithm

TicToc is a new optimistic concurrency control algorithm that avoids the scalability and concurrency bottlenecks of prior Timestamp Ordering (T/O) schemes. TicToc relies on a novel and provably correct data-driven timestamp management protocol. Instead of assigning timestamps to transactions, this protocol assigns read and write timestamps to data items and uses them to lazily compute a valid commit timestamp for each transaction. TicToc removes the need for centralized timestamp allocation, and commits transactions that would be aborted by conventional T/O schemes. On a 40-core machine, TicToc outperforms other state-of-the-art concurrency control algorithms by up to 92% while reducing the abort rate by 3.3x.
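The data-driven timestamp management can be sketched as follows. This is a simplified, single-threaded Python illustration of the idea (per-tuple read/write timestamps and a lazily computed commit timestamp), not the concurrent implementation; all names are illustrative.

```python
class Tuple:
    """A data item carrying TicToc-style timestamps (toy model)."""
    def __init__(self, value):
        self.value = value
        self.wts = 0   # write timestamp: when this version was created
        self.rts = 0   # read timestamp: version is valid through rts

class Transaction:
    def __init__(self):
        self.read_set = {}    # tuple -> wts observed at read time
        self.write_set = {}   # tuple -> new value (buffered)

    def read(self, t):
        self.read_set[t] = t.wts
        return t.value

    def write(self, t, value):
        self.write_set[t] = value

    def commit(self):
        # Commit timestamp is derived lazily from the data items:
        # it must not precede any version read, and must follow the
        # read lease of every item being overwritten.
        commit_ts = 0
        for t, wts in self.read_set.items():
            commit_ts = max(commit_ts, wts)
        for t in self.write_set:
            commit_ts = max(commit_ts, t.rts + 1)
        # Validate reads: each observed version must still be current,
        # and its read lease is extended to cover commit_ts.
        for t, wts in self.read_set.items():
            if t.wts != wts:
                return False  # version was overwritten -> abort
            t.rts = max(t.rts, commit_ts)
        # Install buffered writes at commit_ts.
        for t, value in self.write_set.items():
            t.value = value
            t.wts = t.rts = commit_ts
        return True
```

Because the commit timestamp is computed from the items themselves, there is no central timestamp allocator, and a transaction can commit in a logical order that differs from its real-time order.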

1000-core Concurrency Control Evaluation

Computer architectures are moving towards an era dominated by many-core machines with dozens or even hundreds of cores on a single chip. Current database management systems (DBMSs), however, are not designed for this level of parallelism. In this project, we performed an evaluation of concurrency control for on-line transaction processing (OLTP) workloads on many-core chips. Our analysis shows that all algorithms fail to scale to this magnitude, but for different reasons. In each case, we identify fundamental bottlenecks that are independent of the particular database implementation. We conclude that rather than pursuing incremental solutions, many-core chips may require a completely redesigned DBMS architecture that is built from the ground up and is tightly coupled with the hardware. Our DBMS is open source on GitHub (DBx1000).

Indirect Memory Prefetcher

Machine learning, graph analytics, and sparse linear algebra applications are dominated by irregular memory accesses that have little temporal or spatial locality and thus cannot be captured by traditional prefetchers. A majority of these irregular accesses come from indirect patterns of the form A[B[i]]. We propose an efficient hardware indirect memory prefetcher (IMP) to capture this access pattern and hide latency. We also propose a partial cacheline accessing mechanism for these prefetches to reduce the network and DRAM bandwidth pressure caused by the lack of spatial locality. Evaluated on 7 applications, IMP shows a 56% speedup on average (up to 2.3x) compared to a baseline 64-core system with streaming prefetchers.
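The core inference behind an indirect prefetcher can be sketched in a few lines. The real IMP does this in hardware; this Python sketch only illustrates the idea of recovering the base address of A from observed (B[i], load address) pairs, and the element size and function names are illustrative assumptions.

```python
ELEM = 8  # assumed element size of A, in bytes

def detect_base(pairs):
    """pairs: list of (index_value, subsequent_load_address) observations.
    If the loads follow base + index * ELEM (i.e., A[B[i]]), the inferred
    base address is the same for every pair; otherwise report no match."""
    bases = {addr - idx * ELEM for idx, addr in pairs}
    return bases.pop() if len(bases) == 1 else None

def prefetch_addrs(base, future_indices):
    """Once the base is known, already-fetched B values for future
    iterations yield A's addresses early enough to hide latency."""
    return [base + idx * ELEM for idx in future_indices]
```

Since each A[B[i]] access can land anywhere in memory, only a fraction of each fetched cacheline is typically used, which is what motivates the partial cacheline accessing mechanism mentioned above.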

Path Oblivious RAM

Oblivious RAM (ORAM) is a technique to hide the access pattern to an untrusted storage system. With ORAM, a curious adversary observing the bits moving between the user and the storage system cannot tell which address the user is accessing. Path ORAM is an extremely simple Oblivious RAM protocol with a small amount of client storage. I have worked on Path ORAM design space exploration in secure processors, its performance optimizations, integrity verification, and timing leakage protection. I have also worked on secure ways to interact with tamper-resistant hardware with bounded information leakage.
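The Path ORAM access protocol fits in a short sketch: look up the block's leaf in a position map, remap it, read the whole root-to-leaf path into a client-side stash, then greedily write stash blocks back as deep as their assigned leaves allow. This toy Python version (in-memory tree, simplified eviction, illustrative names) conveys the structure, not a secure implementation.

```python
import random

class PathORAM:
    """Toy Path ORAM: a binary tree of buckets plus a client stash."""
    def __init__(self, depth, bucket_size=4):
        self.leaves = 2 ** depth
        self.Z = bucket_size
        # Heap-style node indexing: node 1 is the root.
        self.tree = {i: [] for i in range(1, 2 ** (depth + 1))}
        self.pos = {}      # block id -> assigned leaf (position map)
        self.stash = {}    # block id -> value (client storage)

    def _path(self, leaf):
        """Node indices from this leaf up to the root."""
        node, path = self.leaves + leaf, []
        while node >= 1:
            path.append(node)
            node //= 2
        return path

    def access(self, block, value=None):
        # Every access looks identical: read one full path, write it back.
        leaf = self.pos.get(block, random.randrange(self.leaves))
        self.pos[block] = random.randrange(self.leaves)  # remap immediately
        # 1. Read the entire path into the stash.
        for node in self._path(leaf):
            for b, v in self.tree[node]:
                self.stash[b] = v
            self.tree[node] = []
        if value is not None:
            self.stash[block] = value
        result = self.stash.get(block)
        # 2. Write back: place each stash block as deep as possible on
        #    this path while staying on the path to its assigned leaf.
        for node in self._path(leaf):  # leaf-to-root, deepest first
            fits = [b for b in self.stash
                    if node in self._path(self.pos[b])]
            for b in fits[: self.Z]:
                self.tree[node].append((b, self.stash.pop(b)))
        return result
```

The adversary only ever sees uniformly random paths being read and rewritten, so the sequence of physical accesses is independent of which logical blocks the client touches.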

Peer-Reviewed Publications

Technical Reports



Here is a list of my teaching/mentoring experiences: