I am a graduate student at the MIT Computer Science and Artificial Intelligence Lab (CSAIL), where I am part of the Database group and the Haystack group. My advisors are Sam Madden and David Karger.
Research Projects:
- BlendDB: A Relational Database that Supports Efficient Web Browsing Queries - Best practices in relational database schema design result in multiple tables, each describing a different real-world entity. For example, a movies database would contain Movie, Actor, and Director tables. When displaying the data to a user browsing such content, the resulting query workload tends to access small portions of several tables. Continuing the example, generating the page for a movie on IMDB would require pulling information on the movie, the actors in the movie, the directors of the movie, and perhaps some ratings data. Because traditional databases store these tables in different locations on disk, a user's browsing session results in many random disk I/Os, leading to poor throughput. BlendDB blends the underlying heapfiles of the various tables so that queries can find related records from each table near each other on disk, improving the throughput of browsing-oriented queries. This work was done with David Karger and Samuel Madden at MIT. Relevant Documents:
- Scalable Semantic Web Triple Stores - The semantic web is a concept that was proposed by the W3C and is being used in a growing number of domains, including biology and libraries. While the potential uses for semantic web technologies are readily apparent, an essential step in realizing the semantic web vision is building systems that store, index, and query semantic web data (RDF). We are researching and benchmarking methods for efficiently storing semantic web data in a relational database. This work was done with Daniel Abadi, Kate Hollenbach, and Samuel Madden at MIT. Relevant Documents:
- Collaborative Information Organization - Information workers such as scientists often generate data and findings that they wish to share with their collaborators. They then annotate their data to add useful metadata that describes their findings. As an undergraduate at RPI, I helped create an initial prototype of a system that allows metamorphic petrologists to collaborate and annotate their findings in this way. The project continues and has grown, but while I was at RPI, my collaborators were Sibel Adali, Boleslaw Szymanski, Frank Spear, and Bouchra Bouqata. Relevant Documents:
- Effective Web-Scale Crawling - Web crawlers traverse the web to find new or updated content to be indexed by search engines and other organizations. I spent a summer internship at IBM's Almaden Research Center working on making the WebFountain web crawler "intelligent" so that it could prioritize which websites to crawl and recrawl. Collaborators included Roberto Bayardo, David Blackman, Ian Bergman, Ivan Gonzalez, Daniel Meredith, and Linda Nguyen. Relevant Document: