Julian Shun   信哲文

Julian Shun   信哲文

Associate Professor
Electrical Engineering and Computer Science
Computer Science and Artificial Intelligence Laboratory
Massachusetts Institute of Technology

Office: 32-G736
Email: jshun at mit.edu

I am an associate professor at MIT in the EECS department and a principal investigator in CSAIL. Prior to that, I was a Miller Research Fellow at UC Berkeley working with Michael Mahoney. I obtained my Ph.D. from Carnegie Mellon University, and was advised by Guy Blelloch.

Research Interests

I am interested in the theory and practice of parallel and high-performance computing. I work on designing efficient algorithms and data structures with strong theoretical guarantees and good empirical performance. I also work on designing high-level programming frameworks that make it easier for programmers to write efficient parallel code. Recent domains of interest include graph analytics, computational geometry, and string processing. Below is a description of my recent projects.

Large-scale Graph Processing

I am very interested in developing algorithms for large-scale graph processing. Graph algorithms have many applications, ranging from analyzing social networks to finding patterns in biological networks. We have developed Ligra, a lightweight graph processing framework for shared memory. The project was motivated by the fact that the largest publicly available real-world graphs all fit in shared memory. When graphs fit in shared memory, processing them using Ligra can give performance improvements of up to orders of magnitude compared to distributed-memory graph processing systems. We have also developed Ligra+, an extension of Ligra that uses graph compression techniques to process large graphs with less memory. We have used Ligra/Ligra+ to design and evaluate parallel algorithms for graph eccentricity estimation as well as local graph clustering. In our framework Julienne, we extended Ligra to support algorithms that use bucketing to improve performance. We have built the Aspen framework for streaming graph processing that supports graph updates and queries concurrently with low latency. Recently, we extended Ligra to support efficient parallel hypergraph algorithms. We have designed the Sage graph processing system that supports efficient graph algorithms on non-volatile main memory (NVRAM). We introduce the Parallel Semi-Asymmetric Model (PSAM) for NVRAM graph algorithms, in which the graph is stored as a read-only data structure (in NVRAM), and the amount of mutable memory is kept proportional to the number of vertices. We have designed the GraphIt domain specific language (DSL) for graph computations where programmers specify the algorithm using an algorithm language, and performance optimizations are specified using a separate scheduling language. The algorithm language simplifies the expression of algorithms, while exposing opportunities for optimizations. The scheduling language enables programmers to easily search a large space of optimizations to find the combinations of optimizations that give the best performance. We have developed practical algorithms with strong theoretical guarantees for many fundamental graph algorithms, such as connected components, minimum spanning forest, triangle computations, maximum flow, maximal independent set, maximal matching, and butterfly computations. Many of these algorithms have been integrated into our Graph Based Benchmark Suite (GBBS). We have also designed the Kaskade system that automatically derives materialized graph views to speed up graph query evaluation.

Computational Geometry

I am interested in designing parallel algorithms for problems in computational geometry that are both theoretically-efficient and practical. We have developed state-of-the-art parallel algorithms for DBSCAN (density-based spatial clustering of applications with noise), hierarchical DBSCAN, and Euclidean minimum spanning tree, which are scalable and significantly faster than prior work. We have also designed the first parallel batch-dynamic data structure for maintain the closest pair, and presented the first experimental evaluation of parallel closest pair algorithms.

Parallel In-Place Algorithms

I am interested in designing parallel algorithms that are in-place, in that they use space sublinear in the input size. In-place algorithms not only reduce memory usage, which is important for processing large data sets, but can also improve locality and performance by reducing data movement.

Write-efficient Algorithms

I am interested in designing efficient parallel algorithms for memories with asymmetric costs of reading and writing (e.g., NVRAM). We have developed models for accounting for read-write asymmetry. Using the models, we have designed write-efficient algorithms for a number of primitives including sorting, filter, reduce, Fast Fourier transform, breadth-first search, list ranking, tree contraction, minimum spanning forest, and planar convex hull.

Transaction Processing

We have designed Chiller, a new approach to data partitioning and transaction execution, which aims to minimize data contention for both local and distributed transactions.

Deterministic Parallel Programming

I am interested in developing tools that many it easier for others to do parallel programming. In particular, we have developed algorithms, data structures and tools for deterministic parallel programming. Determinism is very important in parallel programming as it allows for ease of debugging, reasoning about correctness and performance.

Parallel String/Text Algorithms and Data Structures

We have developed practical parallel algorithms with theoretical guarantees for several important algorithms and data structures in string/text processing. These have important applications in bioinformatics, data compression, and information retrieval.

Parallel Semisorting

We have designed an efficient parallel algorithm for semisorting (where equal-valued keys are contiguous but different keys are not necessarily in sorted order). This is a useful primitive in many applications, for example in performing database joins and in performing the shuffle phase in the MapReduce paradigm.

Problem Based Benchmark Suite (PBBS)

We have developed a benchmark suite of fundamental problems along with parallel algorithms for solving them. The benchmarks are solely defined by the problem specification and input/output formats, without any reference to the algorithm, programming language or machine used. Please contribute your own implementations to the benchmark suite! The following paper contains descriptions of the benchmarks and experiments using them:

Ph.D. Thesis