This is an outline of database papers which I collected as part of my reading and studying for 6.830 this semester. Every paper that was assigned in class is in this list; in addition, I've added a few of the other papers that I tracked down, read, and found interesting over the course of the semester. (If the paper name is listed in bold then it was assigned in class, otherwise it's my own addition.)

I've tried to reorganize the papers into categories based on some rough notion of common topic -- but these topics weren't an explicit part of the way the course was structured, and the ordering of the papers on this page isn't exactly the order in which the papers were presented in class.

I've also added a few comments to each paper describing how it was covered in class, some idea of its contents, and a few words about how it might relate to other work. These comments are woefully incomplete, as they're mostly derived from my own hand-written notes made while studying for the class exams (or from my spotty memory while writing this list). At any rate, any mistakes in this webpage are my own.

At some point fairly soon, I'm going to replace the links to local copies of these papers with links that point to more permanent repositories on the web.

Relational Pre-History

Stonebraker & Hellerstein, "What Goes Around Comes Around"

The paper assigned on the first day of class. "A summary of 35 years of data model proposals," this review gives Stonebraker's views on IMS, CODASYL, and the relational model, and why he feels the relational model "won out" in the end. Parts of this paper were given verbatim (we later realized) as a lecture in class.

Charles Bachman, "The Programmer as Navigator" [local]

Bachman's 1973 Turing Award Lecture -- Bachman was the proponent of an early competitor to the relational data model, a graph-structured model (and access-language) embodied in a system known as CODASYL. Stonebraker called CODASYL "the machine language of database access," and one of his points from the beginning of the class was the "battle" which was fought in the '70s over whether database access languages should be low-level and procedural (in the style of CODASYL and IMS), or higher-level and declarative (the relational algebra). Eventually, the higher-level declarative languages carried the day.

"Classical" Relational Databases: System R and Its Successors

Codd, "A Relational Model of Data for Large Shared Data Banks"

Ted Codd pioneered the "relational data model," which is the data model that underlies all modern relational databases. In this paper, he lays out the data model as well as the "relational algebra," a language for deriving new relations from previously defined ones using a grammar of relational operators (select, project, join, aggregate, etc.). The relational algebra is the (theoretical) basis for declarative query languages over relational data, such as SQL, although Codd (and others) would later argue that SQL was a poor implementation of the relational algebra.
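
To make the algebra concrete, here's a toy Python sketch of three of those operators (select, project, and a natural join) applied to a couple of made-up relations; a real engine of course evaluates these over disk-resident data rather than Python lists:

    # Relations modeled as lists of dicts; the employees/departments data is invented.
    def select(relation, predicate):
        """sigma: keep only tuples satisfying the predicate."""
        return [t for t in relation if predicate(t)]

    def project(relation, attributes):
        """pi: keep only the named attributes, with duplicate elimination."""
        seen, out = set(), []
        for t in relation:
            r = tuple((a, t[a]) for a in attributes)
            if r not in seen:
                seen.add(r)
                out.append(dict(r))
        return out

    def join(r, s, attr):
        """natural join on a single shared attribute."""
        return [{**t, **u} for t in r for u in s if t[attr] == u[attr]]

    employees = [{"name": "alice", "dept_id": 1}, {"name": "bob", "dept_id": 2}]
    departments = [{"dept_id": 1, "dept": "sales"}, {"dept_id": 2, "dept": "eng"}]

    # "Names of employees in the sales department," composed from the operators:
    result = project(select(join(employees, departments, "dept_id"),
                            lambda t: t["dept"] == "sales"),
                     ["name"])
    print(result)   # [{'name': 'alice'}]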

Hellerstein & Stonebraker, "Anatomy of a Database System" [local]

High-level, modern overview of "standard" relational database management systems.

Astrahan et al., "System R: Relational Approach to Database Management" [local]

System R, a research system from IBM, was one of the first complete implementations of a working RDBMS. It formed the basis of IBM's commercial DB2 product, and many of the features of modern RDBMSs were pioneered in versions of System R.

Selinger, Astrahan, Chamberlin, Lorie, & Price, "Access Path Selection in a Relational Database Management System" [local]

One of the original papers in query optimization -- points out that query plans should often be formulated as "left-deep" tree structures, and then presents a dynamic-programming algorithm for choosing the cheapest join ordering among those left-deep plans, based on rudimentary statistics about the data in the database.
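
Here's a stripped-down Python sketch of the dynamic-programming idea, restricted to left-deep plans and using made-up cardinalities, made-up join selectivities, and a crude nested-loops cost model (the real algorithm also tracks access paths and "interesting orders"):

    from itertools import combinations

    card = {"A": 1000, "B": 100, "C": 10}       # base-table cardinalities (invented)
    sel = {frozenset("AB"): 0.01,               # join-predicate selectivities (invented)
           frozenset("BC"): 0.05}

    def join_estimates(rest, rest_card, t):
        """Cost of joining the intermediate result 'rest' with base table t,
        plus the estimated cardinality of the output."""
        cost = rest_card * card[t]              # crude nested-loops cost model
        s = 1.0
        for r in rest:                          # apply every predicate linking t to rest
            s *= sel.get(frozenset((r, t)), 1.0)
        return cost, rest_card * card[t] * s

    # best[subset] = (cumulative cost, estimated cardinality, left-deep join order)
    best = {frozenset([t]): (0.0, float(card[t]), [t]) for t in card}

    for size in range(2, len(card) + 1):
        for subset in map(frozenset, combinations(card, size)):
            for t in subset:                    # try t as the newest (rightmost) table
                rest = subset - {t}
                c_rest, n_rest, order = best[rest]
                step_cost, n_out = join_estimates(rest, n_rest, t)
                total = c_rest + step_cost
                if subset not in best or total < best[subset][0]:
                    best[subset] = (total, n_out, order + [t])

    cost, _, order = best[frozenset(card)]
    print(order, cost)                          # cheapest left-deep order found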

Mannino, Chu, & Sager, "Statistical Profile Estimation in Database Systems" [local]

The statistics that were considered in Selinger et al. were pretty rudimentary. This paper outlines how to use much more extensive statistics about the relations in your RDBMS to do a better job at query optimization. Statistics like these, or something very similar, form the core of the query optimizer in a database system like Postgres.
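
As a toy example of what one of these statistical profiles buys you, here's a Python sketch of an equi-width histogram being used to estimate the selectivity of a range predicate; the column data and bucket count are invented:

    import random

    values = [random.gauss(50, 15) for _ in range(10_000)]   # pretend column data

    NBUCKETS = 10
    lo, hi = min(values), max(values)
    width = (hi - lo) / NBUCKETS
    counts = [0] * NBUCKETS
    for v in values:
        b = min(int((v - lo) / width), NBUCKETS - 1)
        counts[b] += 1

    def estimate_selectivity(pred_lo, pred_hi):
        """Estimated fraction of rows with pred_lo <= value <= pred_hi,
        assuming values are spread uniformly within each bucket."""
        total = 0.0
        for b, c in enumerate(counts):
            b_lo, b_hi = lo + b * width, lo + (b + 1) * width
            overlap = max(0.0, min(b_hi, pred_hi) - max(b_lo, pred_lo))
            total += c * (overlap / width)
        return total / len(values)

    est = estimate_selectivity(40, 60)
    actual = sum(40 <= v <= 60 for v in values) / len(values)
    print(f"estimated {est:.3f} vs actual {actual:.3f}")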

Chou & DeWitt, "An Evaluation of Buffer Management Strategies for Relational Database Systems"

A classic "Wisconsin-style" database paper, with conclusions backed up by extensive simulation results. A database uses a buffer pool to cache fetched disk pages in memory, speeding up access to pages whose references show certain patterns of "locality." Chou and DeWitt start from the observation that buffer pool page replacement policies are not one-size-fits-all, and show that buffer pool sizes and replacement policies can be tuned to certain well-defined patterns of access to the underlying relations.
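
A tiny simulation makes their point nicely: for a repeated sequential scan of a file slightly larger than the buffer pool, LRU gets no hits at all while MRU keeps most of the file resident. The pool and file sizes below are arbitrary:

    from collections import OrderedDict

    def simulate(policy, pool_size, pages):
        pool, hits = OrderedDict(), 0
        for p in pages:
            if p in pool:
                hits += 1
                pool.move_to_end(p)              # track recency on every reference
            else:
                if len(pool) >= pool_size:
                    if policy == "LRU":
                        pool.popitem(last=False) # evict least recently used
                    else:                        # "MRU"
                        pool.popitem(last=True)  # evict most recently used
                pool[p] = True
        return hits

    scan = list(range(12)) * 5                   # scan a 12-page file five times
    for policy in ("LRU", "MRU"):
        print(policy, simulate(policy, 10, scan), "hits out of", len(scan))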

Shapiro, "Join Processing in Database Systems with Large Main Memories" [local]

One of the most expensive operations in a database system is the calculation of "joins" between two or more different relations. But even for a particular join in a query plan, there can be several different options for how the join is to be computed, depending on how large the underlying relations are (or are expected to be), whether or not there exist indices on the relations (and whether a relation is clustered according to one of those indices), and how much main memory is available. Shapiro shows that the optimal join algorithm for each case can be determined from basic information about these factors -- memory, relation size, and the status of indices. (Note: as a fun exercise, compare the contents of this paper to that of the paper referenced within as #6.)
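
For reference, here's a Python sketch of the simplest case the paper considers: an in-memory hash join, usable when the smaller ("build") relation fits in main memory. The GRACE and hybrid hash joins discussed in the paper extend this by first partitioning both inputs to disk. The relation contents here are made up:

    from collections import defaultdict

    def hash_join(build, probe, key):
        table = defaultdict(list)
        for t in build:                      # build phase: hash the smaller relation
            table[t[key]].append(t)
        out = []
        for u in probe:                      # probe phase: stream the larger relation
            for t in table.get(u[key], []):
                out.append({**t, **u})
        return out

    depts = [{"dept_id": 1, "dept": "sales"}, {"dept_id": 2, "dept": "eng"}]
    emps = [{"name": "alice", "dept_id": 1}, {"name": "bob", "dept_id": 2},
            {"name": "carol", "dept_id": 1}]
    print(hash_join(depts, emps, "dept_id"))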

Mohan, Haderle, Lindsay, Pirahesh, & Schwarz, "ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging" [local]

This paper is so complicated and hard to read -- and take a look at the last page, on how long it took between submission, acceptance, and publication (apparently it was hard to review, too!). But the ARIES algorithm turns out to be a pretty comprehensive method for using UNDO and REDO logs to carefully handle failures and recovery. It even handles recovery from failures during an earlier recovery process! Also included are a discussion of the differences between physical and logical logging, and a careful discussion of checkpointing in the logs themselves.
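
To fix the basic idea (though this is nowhere near ARIES itself -- no page LSNs, no compensation log records, no fuzzy checkpoints), here's a toy Python sketch of redo-then-undo recovery from a write-ahead log, over an invented log and database:

    # Each record: (lsn, txn, kind, key, before, after); kind is "update" or "commit".
    log = [
        (1, "T1", "update", "x", 0, 10),
        (2, "T2", "update", "y", 5, 7),
        (3, "T1", "commit", None, None, None),
        (4, "T2", "update", "z", 1, 2),
        # crash: T2 never committed
    ]

    def recover(log):
        db = {}                              # pretend on-disk state, rebuilt from the log
        committed = {r[1] for r in log if r[2] == "commit"}

        # Redo pass: repeat history by reapplying every update, winners and losers alike.
        for lsn, txn, kind, key, before, after in log:
            if kind == "update":
                db[key] = after

        # Undo pass: roll back the loser transactions in reverse log order.
        for lsn, txn, kind, key, before, after in reversed(log):
            if kind == "update" and txn not in committed:
                db[key] = before

        return db

    print(recover(log))    # {'x': 10, 'y': 5, 'z': 1}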

Franklin, "Concurrency Control and Recovery" [local]

An overview of concurrency control techniques, including locking.

Kung & Robinson, "On Optimistic Methods for Concurrency Control" [local]

The paper that introduced "Optimistic Concurrency Control," which is a way of "handling" concurrency by (basically) not handling it. That is, it assumes that most transactions won't conflict, and so it (in general) doesn't maintain locks. When transactions do conflict, they are aborted completely. It does all this by dividing transactions into Read, Validate, and Write phases, and using monotonically increasing transaction numbers to detect when the phases of different transactions overlap. In some of these cases, serializability of the underlying transactions can't be guaranteed, and so they have to be rolled back and restarted. There are two forms of OCC presented in this paper, "serial validation" and "parallel validation."
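
Here's a deliberately naive Python sketch of the serial-validation test: a committing transaction is checked against every transaction that committed after it started, and if any of their write sets intersects its read set, it has to be restarted. All the bookkeeping below is made up for illustration:

    committed = []          # list of (commit_tn, write_set) for finished transactions
    next_tn = 0             # monotonically increasing transaction numbers

    class Transaction:
        def __init__(self):
            self.start_tn = next_tn          # highest transaction number at start
            self.read_set = set()
            self.write_set = set()
            self.local = {}                  # writes are buffered until the write phase

        def read(self, db, key):
            self.read_set.add(key)
            return self.local.get(key, db.get(key))

        def write(self, key, value):
            self.write_set.add(key)
            self.local[key] = value

    def validate_and_commit(db, txn):
        global next_tn
        for tn, wset in committed:
            if tn > txn.start_tn and wset & txn.read_set:
                return False                 # conflict: caller must restart the txn
        db.update(txn.local)                 # write phase
        next_tn += 1
        committed.append((next_tn, set(txn.write_set)))
        return True

    db = {"x": 1}
    t1, t2 = Transaction(), Transaction()
    t1.write("x", t1.read(db, "x") + 1)
    t2.write("x", t2.read(db, "x") + 1)
    print(validate_and_commit(db, t1))   # True
    print(validate_and_commit(db, t2))   # False: t1's write set hit t2's read set
    print(db)                            # {'x': 2}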

Gray, Lorie, Putzolu, & Traiger, "Granularity of Locks and Degrees of Consistency in a Shared Database"
Gray, "The Transaction Concept: Virtues and Limitations"
Eswaran et al., "The Notions of Consistency and Predicate Locks in a Database System"

Jim Gray was one of the pioneers in thinking about concurrency and distributed systems. The first paper was the only one of the three that was assigned as reading in class. That paper reviews the notion of lock granularity (table level, row level, item level), and covers the idea of hierarchies of locks. It also discusses the "phantom problem," and talks about different kinds of consistency between transactions.
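
For reference, the multi-granularity lock modes and their compatibility matrix from the first paper can be written down in a few lines of Python; the lookup helper below is just my own convenience wrapper:

    # IS/IX = intention shared/exclusive, SIX = shared with intention exclusive.
    # True means the two modes can be held on the same node (e.g. a table) by
    # different transactions at the same time.
    COMPATIBLE = {
        ("IS", "IS"): True,  ("IS", "IX"): True,  ("IS", "S"): True,
        ("IS", "SIX"): True, ("IS", "X"): False,
        ("IX", "IX"): True,  ("IX", "S"): False,  ("IX", "SIX"): False,
        ("IX", "X"): False,
        ("S", "S"): True,    ("S", "SIX"): False, ("S", "X"): False,
        ("SIX", "SIX"): False, ("SIX", "X"): False,
        ("X", "X"): False,
    }

    def compatible(held, requested):
        """Symmetric lookup: can 'requested' be granted while 'held' is held?"""
        return COMPATIBLE.get((held, requested), COMPATIBLE.get((requested, held)))

    # A transaction scanning a whole table while another updates single rows:
    print(compatible("S", "IX"))    # False: the table-level S lock blocks row updates
    print(compatible("IS", "IX"))   # True: both can descend to row-level locks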

Beckmann, Kriegel, Schneider, & Seeger, "The R*-tree: An Efficient and Robust Access Method for Points and Rectangles"
Guttman, "R-Trees: A Dynamic Index Structure for Spatial Searching"

The "standard" indices in relational systems are hash indices and B-trees; but, if you know that your database is storing specialized data which can be indexed in a specific way, building that specific index type in your database system can improve query performance.

Harizopoulos, Abadi, Madden, & Stonebraker, "OLTP Through the Looking Glass, and What We Found There"

A recent look at where modern databases spend their computation time -- the result is the observation that much of the work of a modern RDBMS goes into the "overhead" of buffer management, logging, and locking, with less than 5% of the processor time going to "useful work." In class, Stonebraker presented these conclusions as, "if we're going to make any progress, we can't look for fractional improvements; we need to completely drop some or all of these requirements." This led directly into a discussion of shared-nothing distributed systems, and of storage schemes like the one in C-Store.

Search Engines and Data Systems

Brewer, "Combining Systems and Databases: A Search Engine Retrospective"

Brewer was the guy behind Inktomi, the search engine used (for a while) by Yahoo. In this paper, he points out that a web-crawler/indexer/search-engine combination can be thought of as populating and scanning a particular relational data model. However, the particular features of this model (and the query workload of a search engine) mean that actually implementing it on top of a relational database is sub-optimal; Brewer even claims that real-world implementations using distributed databases ran an order of magnitude slower than later custom implementations. This paper was therefore used in class as an entrée to the notion that certain kinds of data systems (models and query workloads) are not optimally served by existing database technology. This led to the topic of the second half of the course, which was how we might go about designing such custom data systems.

Stonebraker et al., "C-Store: A Column-Oriented DBMS"

We spent a lot of time in class thinking about the C-Store paper and its relatives. For relational datasets which are "wide" (in the sense of having lots of columns) and "sparse" (in the sense that every data item has a lot of NULLs among its fields), it makes sense to completely rethink how the data is stored on disk. Classical RDBMSs are "row stores," in that they store each row (or "tuple") in a single location, and a relation (or "table") is a set of tuples that are stored together and indexed in some way. For analytic processing loads that only touch a few columns of a wide, sparse dataset, this is far from optimal; this leads to the notion of a "column store," which stores each column of a table together, along with additional data structures that let those columns be joined back together into the original tuples. C-Store is a paper about a particular implementation of one of these systems. It has since been commercialized as the "Vertica" database product. Vertica's VP of technology actually came to our class to give the lecture on this particular paper (and on Vertica, specifically).
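
Here's a toy Python illustration of the row-store/column-store distinction; the table is invented, and the point is simply that a query touching a few columns of a wide table only has to scan those columns' storage in a column store:

    rows = [
        {"id": 1, "price": 9.99, "qty": 3, "region": "east", "note": None},
        {"id": 2, "price": 4.50, "qty": 1, "region": "west", "note": None},
        {"id": 3, "price": 2.00, "qty": 7, "region": "east", "note": "rush"},
    ]

    # Row store: each tuple stored contiguously.
    row_store = rows

    # Column store: one sequence per attribute; positions line up across columns.
    column_store = {col: [r[col] for r in rows] for col in rows[0]}

    # "Total revenue by region" only needs three of the five columns:
    revenue = {}
    for region, price, qty in zip(column_store["region"],
                                  column_store["price"],
                                  column_store["qty"]):
        revenue[region] = revenue.get(region, 0) + price * qty
    print(revenue)    # {'east': 43.97, 'west': 4.5}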

Marcus, "BlendDB: Blending Table Layouts to Support Efficient Browsing of Relational Databases"
Abadi, Marcus, Madden & Hollenbach, "Scalable Semantic Web Data Management Using Vertical Partitioning"
Weiss, Karras, & Bernstein, "Hexastore: Sextuple Indexing for Semantic Web Data Management"

Adam Marcus was one of the TAs for our class -- the first paper was his Master's thesis. It was extended and became the vertical partitioning paper; both of these papers cover particular ways of storing graph-structured data in a relational system, so that common queries for related items can be answered quickly. The Hexastore paper, from a different group (not at MIT), extended the vertical partitioning approach to its (they claim) logical conclusion. The Hexastore approach stores graph data in a relational structure and then indexes it in six different ways, such that any path pattern through the data can be efficiently queried using the indices. These three papers should probably be read as a unit.
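
Here's a toy Python sketch of the Hexastore idea: index every (subject, predicate, object) triple under all six attribute orderings, so that any triple pattern with some components bound can be answered by a prefix lookup in the matching index. The triples are made up, and a real system stores sorted, compressed vectors rather than nested dicts:

    from itertools import permutations

    ORDERINGS = list(permutations("spo"))     # spo, sop, pso, pos, osp, ops

    def build(triples):
        indexes = {order: {} for order in ORDERINGS}
        for t in triples:
            fields = dict(zip("spo", t))
            for order in ORDERINGS:
                node = indexes[order]
                for part in order[:-1]:
                    node = node.setdefault(fields[part], {})
                node.setdefault(fields[order[-1]], True)
        return indexes

    triples = [("alice", "knows", "bob"),
               ("alice", "works_at", "mit"),
               ("bob", "knows", "carol")]
    idx = build(triples)

    # Pattern (?s, "knows", ?o): use the index ordered predicate-first.
    print(idx[("p", "s", "o")]["knows"])
    # {'alice': {'bob': True}, 'bob': {'carol': True}}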

He & Singh, "Graphs-at-a-time: Query Languages and Access Methods for Graph Databases"

Not assigned in class, but suggested by Prof. Madden for our class project. It outlines a graph query language (more powerful than SPARQL, but maybe comparable to something like SerQL or GraphQL), and discusses the corresponding notions of query plans and query optimization over a graph store.

Parallel Databases

Mohan, Lindsay, & Obermarck, "Transaction Management in the R* Distributed Database Management System"

This is a paper that covers "classical" parallel database design, with sets (possibly hierarchical sets) of distributed processors running distributed transactions with shared resources. Given the distributed nature of the transactions, the processors need to be able to either uniformly commit or uniformly abort certain transactions. This paper covers the two-phase commit (2P) protocol for ensuring that transactions are uniformly committed (or aborted) over distributed systems of processors. It then continues to discuss the Presumed Commit (PC) and Presumed Abort (PA) variants of the 2P protocol. (I found this paper confusing, especially the distinction between the normal and "recovery" processes running at each node. Ted's tip for reading this was, "Different nodes can be the coordinators for different transactions. When a node comes back up from failure, it doesn't know whether it was the coordinator for a particular transaction or not, and you need to keep that in mind when reading about the states that a particular node could be in." That made a lot of sense to me.)
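
Here's a bare-bones Python sketch of plain two-phase commit from the coordinator's point of view -- phase one collects votes, phase two broadcasts the decision; the Presumed Abort and Presumed Commit variants in the paper reduce the logging and messaging around this core. The "participants" here are just in-process objects rather than remote nodes:

    import logging
    logging.basicConfig(level=logging.INFO, format="%(message)s")

    class Participant:
        def __init__(self, name, will_vote_yes=True):
            self.name, self.will_vote_yes = name, will_vote_yes

        def prepare(self):
            # A real participant force-writes a prepare record before voting yes,
            # so that it can honor the vote even if it crashes.
            return self.will_vote_yes

        def commit(self):
            logging.info("%s: commit", self.name)

        def abort(self):
            logging.info("%s: abort", self.name)

    def two_phase_commit(participants):
        # Phase 1: ask every participant to prepare and collect the votes.
        votes = [p.prepare() for p in participants]
        decision = all(votes)
        # (The coordinator force-writes its commit/abort decision at this point.)
        # Phase 2: tell everyone the outcome.
        for p in participants:
            p.commit() if decision else p.abort()
        return decision

    print(two_phase_commit([Participant("node1"), Participant("node2")]))
    print(two_phase_commit([Participant("node1"),
                            Participant("node2", will_vote_yes=False)]))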

DeWitt & Gray, "Parallel Database Systems: The Future of High Performance Database Processing"

Mostly a high-level review of different kinds of parallel architectures.

Stonebraker, Madden, Abadi, Harizopoulos, Hachem, & Helland, "The End of an Architectural Era (It's Time for a Complete Rewrite)"

Discusses the problem with "one-size-fits-all" OLTP architectures, and outlines an example architecture (the H-Store system) that tries to avoid many of these legacy problems with a complete re-write of some key database features.

Google and Shared-Nothing Systems

Ghemawat, Gobioff, & Leung, "The Google File System"

We didn't actually discuss this paper directly in class, but it came up indirectly several times. GFS provides a building block for several of the other services that Google uses, including BigTable (the SSTables and the commit log are stored on GFS) and MapReduce (job inputs and the final outputs of the reduce tasks are stored on GFS).

Chang et al., "Bigtable: A Distributed Storage System for Structured Data"

One of the Google systems we studied in class was "BigTable," a system for providing distributed storage of, and access to, tabular-formatted data ("sort of" relational). Depends (as does MapReduce, below) on the Google File System (GFS).

Dean & Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters"

The MapReduce paper, source of so much additional discussion!

Chandra, Griesemer, & Redstone, "Paxos Made Live - An Engineering Perspective"

Not an assignment for class -- but the BigTable system uses a service called "Chubby" to handle locks and coordination, and the Chubby server runs a version of the Paxos protocol. This paper is about the particular difficulties that the Google team encountered when they went to translate the research papers on Paxos into a workable implementation.

Dataflow Languages and Adaptive Query Processing

Abadi et al., "Aurora: a new model and architecture for data stream management"

The Aurora paper was probably the "stream language" paper on which we spent the most time in class. The paper outlines a system which streams data through a directed graph of stream operators, many of which resemble classical relational operators. Some of the operators (such as "aggregate") can have custom code inserted in them. Special attention was paid to the adaptive optimization abilities of the system and to the functioning of the "order statements." These last are a kind of "type" statement, which is carried along through the stream, and which is used to guarantee that streams of data entering certain operators have certain ordering and sorting characteristics. The system also has the ability to do dynamic "load shedding," in the case of temporarily poor performance. I also thought it was particularly interesting how the system uses (essentially) gradient descent on user-specified QoS graphs to figure out how to dynamically optimize parts of the stream computation.
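
Here's a toy boxes-and-arrows pipeline in Python, roughly in the spirit of Aurora's dataflow model (and nothing more): each operator is a generator that consumes an upstream tuple stream and emits a downstream one. The sensor data is invented, and there's no windowing by attribute, QoS handling, or load shedding here:

    def filter_box(stream, predicate):
        for t in stream:
            if predicate(t):
                yield t

    def tumbling_avg(stream, key, window_size):
        window = []
        for t in stream:
            window.append(t[key])
            if len(window) == window_size:
                yield sum(window) / window_size
                window = []

    # Made-up sensor readings flowing through filter -> aggregate:
    readings = [{"sensor": "a", "temp": 20}, {"sensor": "b", "temp": 99},
                {"sensor": "a", "temp": 22}, {"sensor": "a", "temp": 24},
                {"sensor": "a", "temp": 26}, {"sensor": "b", "temp": 98}]

    pipeline = tumbling_avg(filter_box(readings, lambda t: t["sensor"] == "a"),
                            "temp", window_size=2)
    print(list(pipeline))    # [21.0, 25.0]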

Babu & Bizarro, "Adaptive Query Processing in the Looking Glass"

An overview of different approaches to adaptive query processing -- that is, reconfiguring a running query plan's characteristics (or even its structure) based on up-to-date statistics about the data source and its execution. Special attention was paid, in class, to reference #28, by Kabra and DeWitt.

Avnur & Hellerstein, "Eddies: Continuously Adaptive Query Processing"

A system for doing automatic query optimization using "eddies," which are adaptive query routers. Each eddy receives a stream of tuples, and is responsible for routing tuples through a set of operations (where the operations themselves should be able to be freely reordered), and then outputting the remaining tuples which have made it through all the operations. The eddy works by keeping track of which tuples have seen which operations, as well as statistics on the speed and selectivity of particular operations. It then dynamically reorders the operations based on their performance characteristics. The ultimate conclusion in class on this paper was, "interesting, but too slow to be workable in practice."
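
Here's a toy Python version of the eddy's routing loop: each tuple carries a set of operators it has already visited, the eddy routes it to one it hasn't, and a simple ticket scheme (my own crude stand-in for the paper's lottery-style scheduling) steers more tuples toward operators that have been dropping a lot of them:

    import random

    class Op:
        def __init__(self, name, predicate):
            self.name, self.predicate = name, predicate
            self.tickets = 1                  # routing weight, grows when the op drops tuples

        def apply(self, t):
            ok = self.predicate(t)
            if not ok:
                self.tickets += 1             # reward selective operators
            return ok

    def eddy(tuples, ops):
        out = []
        for t in tuples:
            done, alive = set(), True
            while alive and len(done) < len(ops):
                candidates = [o for o in ops if o.name not in done]
                weights = [o.tickets for o in candidates]
                op = random.choices(candidates, weights=weights)[0]
                done.add(op.name)
                alive = op.apply(t)
            if alive:
                out.append(t)
        return out

    ops = [Op("cheap_but_unselective", lambda t: t % 2 == 0),
           Op("very_selective", lambda t: t % 100 == 0)]
    survivors = eddy(list(range(10_000)), ops)
    print(len(survivors), [o.tickets for o in ops])   # the selective op accumulates tickets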

Dobbins et al., "Morpheus 2.0: A Data Transformation Management System"

Not assigned in class -- one of Michael Stonebraker's latest projects.