

Julian Shun
Assistant Professor
Douglas T. Ross Career Development Professor of Software Technology
Electrical Engineering and Computer Science
Massachusetts Institute of Technology
Office: 32G736
Phone: (617) 2580669
Email: jshun at mit.edu
Address
C.V.


I am an assistant professor at MIT in
the
EECS department. Prior to
that, I was a
Miller Research Fellow at UC Berkeley working with Michael Mahoney. I obtained
my Ph.D. from Carnegie Mellon University, and was advised by Guy Blelloch.
Research Interests
I am interested in the theory and practice
of parallel computing, especially parallel graph processing
frameworks, algorithms, data structures, and tools for deterministic
parallel programming. Below is a description of my recent projects.
I am looking for students and postdocs interested in the theory and practice of parallel algorithms, data structures, and programming.
Ph.D. Thesis
Largescale Graph Processing
I am very interested in
developing algorithms for largescale graph processing. Graph
algorithms have many applications, ranging from analyzing social
networks to finding patterns in biological networks.
I have developed
Ligra, a lightweight graph
processing framework for shared memory. The project was motivated by
the fact that the largest publicly available realworld graphs all fit
in shared memory. When graphs fit in shared memory, processing them
using Ligra can give performance improvements of up to orders of
magnitude compared to distributedmemory graph processing systems. I
have also developed Ligra+, an extension of Ligra that uses graph
compression techniques to process large graphs with less
memory. Recently, I have used Ligra/Ligra+ to design and evaluate
parallel algorithms for graph eccentricity estimation as well as local graph clustering.
 Laxman Dhulipala, Guy Blelloch, and Julian Shun
Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable
Proceedings of the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pp. 393404, 2018.
Best Paper Award
 Laxman Dhulipala, Guy Blelloch, and Julian Shun
Julienne: A Framework for Parallel Graph Algorithms using
Workefficient Bucketing
Proceedings of the ACM Symposium on Parallelism in
Algorithms and Architectures (SPAA), pp. 293304, 2017.
 Julian Shun, Farbod RoostaKhorasani, Kimon Fountoulakis, and Michael Mahoney
Parallel Local Graph Clustering
Proceedings of the VLDB Endowment, 9(12), pp. 10411052, 2016.
 Julian Shun
An Evaluation of Parallel Eccentricity Estimation Algorithms on Undirected RealWorld Graphs
Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD), pp. 10951104, 2015.
 Julian Shun, Laxman Dhulipala,
and Guy
Blelloch
Smaller and Faster: Parallel Processing of Compressed Graphs with Ligra+
Proceedings of the IEEE Data Compression Conference (DCC), pp. 403412, 2015.
 Julian Shun and Guy Blelloch
Ligra: A Lightweight
Graph Processing Framework for Shared Memory
Proceedings of the ACM SIGPLAN Symposium on Principles and
Practice of Parallel Programming (PPoPP), pp. 135146, 2013.
Webpage for Ligra/Ligra+
Source code
I have also developed practical algorithms with strong theoretical
guarantees for many fundamental graph algorithms, such as connected
components, minimum spanning forest, triangle computations, maximum
flow, maximal independent set, and maximal matching. I have
implemented algorithms for the shared memory multicore setting and
also for external memory.

Niklas Baumstark, Guy Blelloch, and Julian Shun
Efficient Implementation of a Synchronous Parallel PushRelabel Algorithm
Proceedings of the European Symposium on Algorithms (ESA), pp. 106117, 2015.
 Julian Shun and Kanat Tangwongsan
Multicore Triangle Computations Without Tuning
Proceedings of the IEEE International Conference on Data Engineering (ICDE), pp. 149160, 2015.
Source code
 Julian Shun, Laxman Dhulipala, and Guy Blelloch
A Simple
and Practical LinearWork Parallel Algorithm for Connectivity
Proceedings of the ACM Symposium on Parallelism in Algorithms and
Architectures (SPAA), pp. 143153, 2014.
Source code
 Aapo Kyrola, Julian Shun, and Guy Blelloch
Beyond
Synchronous: New Techniques for External Memory Graph
Algorithms
Proceedings of the International Symposium on Experimental Algorithms (SEA),
pp. 123137, 2014.
Parallel String/Text Algorithms and Data Structures
I have
developed practical parallel algorithms with theoretical guarantees
for several important algorithms and data structures in string/text
processing. These have important applications in bioinformatics, data
compression, information retrieval among many others.
 Julian Shun
Improved Parallel Construction of Wavelet Trees and Rank/Select Structures
Proceedings of the IEEE Data Compression Conference (DCC), pp. 92101, 2017.
 Julian Labeit, Julian Shun,
and Guy
Blelloch
Parallel lightweight wavelet tree, suffix array and FMindex construction
Journal of Discrete Algorithms, 2017.
Special issue of DCC 2016
Source code
 Julian Shun
Parallel Wavelet Tree Construction
Proceedings of the IEEE Data Compression Conference (DCC), pp. 6372, 2015.
Awarded the Capocelli Prize for Best StudentAuthored Paper
Source code
 Julian Shun
Fast Parallel
Computation of Longest Common Prefixes
Proceedings of the ACM/IEEE International Conference for High
Performance Computing, Networking, Storage and Analysis (SC),
pp. 387398, 2014.
 Julian Shun and Guy Blelloch
A Simple Parallel
Cartesian Tree Algorithm and its Application to Parallel Suffix Tree
Construction
ACM Transactions on Parallel Computing (TOPC), Vol. 1 Issue 1,
Article No. 8, 2014.
Source code
 Julian Shun and Fuyao Zhao (joint first author)
Practical Parallel
LempelZiv Factorization
Proceedings of the IEEE Data Compression Conference (DCC),
pp. 123132, 2013.
Source code
Deterministic Parallel Programming
I am interested in
developing tools that many it easier for others to do parallel
programming. In particular, I have developed algorithms, data
structures and tools for deterministic parallel
programming. Determinism is very important in parallel programming as
it allows for ease of debugging, reasoning about correctness and
performance.
 Guy Blelloch, Yan Gu, Julian Shun, and Yihan Sun
Parallelism in Randomized Incremental Algorithms
Proceedings of the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pp. 467478, 2016.
 Julian Shun, Yan Gu, Guy Blelloch, Jeremy Fineman, and Phillip Gibbons
Sequential Random
Permutation, List Contraction and Tree Contraction are Highly
Parallel
Proceedings of the ACMSIAM Symposium on Discrete
Algorithms (SODA), pp. 431448, 2015.
 Julian Shun and Guy Blelloch
Phaseconcurrent Hash
Tables for Determinism
Proceedings of the ACM Symposium on Parallelism in Algorithms and
Architectures (SPAA), pp. 96107, 2014.
 Julian Shun, Guy Blelloch, Jeremy Fineman, and Phillip Gibbons
Reducing Contention
Through Priority Updates
Proceedings of the ACM Symposium on Parallelism in Algorithms and
Architectures (SPAA), pp. 152163, 2013.
 Guy Blelloch, Jeremy Fineman, and Julian
Shun
Greedy Sequential Maximal
Independent Set and Matching are Parallel on Average
Proceedings of the ACM Symposium on Parallelism in Algorithms and
Architectures (SPAA), pp. 308317, 2012.
MIS Source Code
Maximal Matching Source Code
 Guy Blelloch, Jeremy Fineman,
Phillip Gibbons, and Julian Shun
Internally
Deterministic Parallel Algorithms Can Be Fast
Proceedings of the ACM SIGPLAN Symposium on Principles and
Practice of Parallel Programming (PPoPP), pp. 181192, 2012.
Website
Writeefficient Algorithms
I am interested in designing
efficient parallel algorithms for memories with asymmetric costs of
reading and writing (e.g., NVRAM). I have developed models for
accounting for readwrite asymmetry. Using the models, I have
designed writeefficient algorithms for a number of primitives
including sorting, filter, reduce, Fast Fourier transform,
breadthfirst search, list ranking, tree contraction, minimum spanning
forest, and planar convex hull.
 Guy Blelloch, Phillip Gibbons, Yan Gu, Charles McGuffey, and Julian Shun
The Parallel Persistent Memory Model
Proceedings of the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pp. 247258, 2018.
 Guy Blelloch, Yan Gu, Julian Shun, and Yihan Sun
Parallel WriteEfficient Algorithms and Data Structures for Computational Geometry
Proceedings of the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pp. 235246, 2018.
 Naama BenDavid, Guy Blelloch, Jeremy Fineman, Phillip Gibbons, Yan Gu, Charles McGuffey, and Julian Shun
Implicit Decomposition for WriteEfficient Connectivity Algorithms
Proceedings of the IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 711722, 2018.
 Guy Blelloch, Jeremy Fineman, Phillip Gibbons, Yan Gu, and Julian Shun
Efficient Algorithms with Asymmetric Read and Write Costs
Proceedings of the European Symposium on Algorithms (ESA), pp. 14:114:18, 2016.
 Naama BenDavid, Guy Blelloch, Jeremy Fineman, Phillip Gibbons, Yan Gu, Charles McGuffey, and Julian Shun
Parallel Algorithms for Asymmetric ReadWrite Costs
Proceedings of the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pp. 145156, 2016.
 Guy Blelloch, Jeremy Fineman, Phillip Gibbons, Yan Gu, and Julian Shun
Sorting with Asymmetric Read and Write Costs
Proceedings of the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pp. 112, 2015.
Parallel Semisorting
I have designed an efficient parallel
algorithm for semisorting (where equalvalued keys are contiguous but
different keys are not necessarily in sorted order). This is a useful
primitive in many applications, for example in performing database
joins and in performing the shuffle phase in the MapReduce paradigm.
 Yan Gu,
Julian Shun, Yihan Sun,
and Guy
Blelloch
A TopDown Parallel Semisort
Proceedings of the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pp. 2434, 2015.
Problem Based Benchmark Suite (PBBS)
I have developed
a
benchmark
suite of fundamental problems along with parallel algorithms for
solving them. The benchmarks are solely defined by the problem
specification and input/output formats, without any reference to the
algorithm, programming language or machine used. Please contribute
your own implementations to the benchmark suite! The following paper
contains descriptions of the benchmarks and experiments using them: