Shoaib Kamil
(skamil AT csail DOT mit DOT edu)
Research Interests
Scientific computing, parallel programming languages, software synthesis, programming systems for parallel productive programming, software engineering, auto-tuning, embedded DSLs, power-efficient parallel computing, software as a service (SaaS)
I am currently a research scientist at MIT CSAIL, where I work on the D-TEC X-STACK project and collaborate with Prof. Saman Amarasinghe and Prof. Armando Solar-Lezama.
I completed my PhD in December 2012, co-advised by Prof. Armando Fox and Prof. Kathy Yelick, while working with the BeBOP Group in the Parallel Computing Laboratory. I was previously affiliated with the Future Technologies Group at LBNL.
Projects
Asp (Asp is SEJITS for Python) - an implementation of Selective Embedded Just-in-Time Specialization for Python, which bridges the gap between productivity and performance using domain-specific embedded compilers. Asp's goal is to simplify the creation of DSLs in Python, and enable expert programmers in a domain (who are not language experts) to write DSLs or auto-tuned libraries appropriate for their domain. Current results show non-expert programmers can utilize these DSLs and auto-tuned libraries to meet or beat state-of-the-art hand-tuned low-level code, while still writing in a high-level productive language.
Stanza Triad - a modified version of STREAM Triad that tests the effectiveness of prefetch engines. Version 0.4 is available for download.
Stencil Probe - a small, easily modifiable probe for simulating the behavior of stencil applications, used as a testbed for evaluating optimizations for stencil codes.
Teaching
6.005: Software Construction, Spring 2012. Co-lecturer with Saman Amarasinghe and Max Goldman.
CS169: Software Engineering, Fall 2010 (Instructor: Armando Fox)
CS267: Applications of Parallel Computers, Fall 2008 (Instructor: Horst Simon)
CS164: Compilers and Programming Languages, Fall 2002 (Instructor: Richard Fateman)
CS170: Efficient Algorithms and Intractable Problems, Spring 2001 (Instructors: James Demmel and Jonathan Shewchuk)
Publications
PhD Dissertation
Productive High Performance Parallel Programming with Auto-tuned Domain-Specific Embedded Languages
PhD Dissertation, EECS Dept, University of California, Berkeley (Tech Report EECS-2012-255), 2012
Peer-Reviewed Publications
Communication-Optimal Parallel Recursive Rectangular Matrix Multiplication
James Demmel, David Eliahu, Armando Fox, Shoaib Kamil, Benjamin Lipshitz, Oded Schwartz, Omer Spillinger
International Parallel and Distributed Processing Symposium (IPDPS), to appear, 2013
High-Productivity and High-Performance Analysis of Filtered Semantic Graphs
Aydin Buluc, Erika Duriakova, Armando Fox, John Gilbert, Shoaib Kamil, Adam Lugowski, Leonid Oliker, Samuel Williams
International Parallel and Distributed Processing Symposium (IPDPS), to appear, 2013
Auto-tuning the Matrix Powers Kernel with SEJITS
Jeffrey Morlan, Shoaib Kamil, Armando Fox
Seventh International Workshop on Automatic Performance Tuning (iWAPT), 2012
Parallel High Performance Statistical Bootstrapping in Python
Aakash Prasad, David Howard, Shoaib Kamil, Armando Fox
Scientific Computing with Python Conference, 2012
Portable Parallel Performance from Sequential, Productive, Embedded Domain Specific Languages
S. Kamil, D. Coetzee, S. Beamer, H. Cook, E. Gonina, J. Harper, J. Morlan, A. Fox
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), Extended Abstract, 2012
Bringing Parallel Performance to Python with Domain-Specific Selective Embedded Just-in-Time Specialization
Shoaib Kamil, Derrick Coetzee, Armando Fox
10th Python for Scientific Computing Conference, 2011
CUDA-level Performance with Python-level Productivity for Gaussian Mixture Model Applications
H. Cook, E. Gonina, S. Kamil, G. Friedland, D. Patterson, A. Fox
USENIX Workshop on Hot Topics in Parallelism (HotPar), 2011
Hardware/Software Co-design of Global Cloud System Resolving Models
M. F. Wehner, L. Oliker, J. Shalf, D. Donofrio, L. A. Drummond, R. Heikes, S. Kamil, C. Kono, N. Miller, H. Miura, M. Mohiyuddin, D. Randall, W.-S. Yang
Journal of Advances in Modeling Earth Systems, 2011
Silicon Nanophotonic Network-On-Chip Using TDM Arbitration
G. Hendry, J. Chan, S. Kamil, L. Oliker, J. Shalf, L. P. Carloni, K. Bergman
IEEE Symposium on High Performance Interconnects (HOTI), 2011
An Auto-tuning Framework for Parallel Multicore Stencil Computations
Shoaib Kamil, Cy Chan, Leonid Oliker, John Shalf, Samuel Williams
IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2010
SEJITS: Getting Productivity and Performance with Selective Embedded JIT Specialization
Bryan Catanzaro, Shoaib Kamil, Yunsup Lee, Krste Asanovic, James Demmel, Kurt Keutzer, John Shalf, Kathy Yelick, Armando Fox
Workshop on Programming Models for Emerging Architectures (PMEA), 2009
A Generalized Framework for Auto-tuning Stencil Computations
Shoaib Kamil, Cy Chan, Sam Williams, Leonid Oliker, John Shalf, Mark Howison, E. Wes Bethel, Prabhat
Cray User Group Conference, 2009
Best Paper Award
Analysis of Photonic Networks for a Chip Multiprocessor Using Scientific Applications
Gilbert Hendry, Shoaib Kamil, A. Biberman, J. Chan, B. Lee, M. Mohiyuddin, A. Jain, K. Bergman, L. Carloni, J. Kubiatowicz, L. Oliker, J. Shalf
International Symposium on Networks-on-Chip (NOCS), 2009
Communication Requirements and Interconnect Optimization for High-End Scientific Applications
Shoaib Kamil, Leonid Oliker, Ali Pinar, John Shalf
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2009
Optimization and Performance Modeling of Stencil Computations on Modern Microprocessors
Kaushik Datta, Shoaib Kamil, Sam Williams, Leonid Oliker, John Shalf, Katherine Yelick
SIAM Review, 2009
Power Efficiency in High Performance Computing
Shoaib Kamil, John Shalf, Erich Strohmaier
International Parallel and Distributed Processing Symposium (IPDPS), 2008
Performance and Energy Comparison of Electrical and Hybrid Photonic Networks for CMPs
Ankit Jain, Shoaib Kamil, Marghoob Mohiyuddin, John Shalf, John Kubiatowicz
High Performance Embedded Computing (HPEC), 2008
Reconfigurable Hybrid Interconnection for Static and Dynamic Scientific Applications
Shoaib Kamil, Ali Pinar, Daniel Gunter, Michael Lijewski, Leonid Oliker, John Shalf
ACM International Conference on Computing Frontiers, 2007
Scientific Application Performance on Candidate PetaScale Platforms
Leonid Oliker, Andrew Canning, Jonathan Carter, Costin Iancu, Michael Lijewski, Shoaib Kamil, John Shalf, H. Shan, Erich Strohmaier, Stephane Ethier, Tim Goodale
International Parallel and Distributed Processing Symposium (IPDPS), 2007
Best Paper Award
Scientific Computing Kernels on the Cell Processor
Samuel Williams, John Shalf, Leonid Oliker, Shoaib Kamil, Parry Husbands, Katherine Yelick
International Journal of Parallel Programming (IJPP), 2007
Implicit and Explicit Optimizations for Stencil Computations
Shoaib Kamil, Kaushik Datta, Samuel Williams, Leonid Oliker, John Shalf, Katherine Yelick
Memory Systems Performance and Correctness (MSPC), 2006
The Potential of the Cell Processor for Scientific Computing
Sam Williams, John Shalf, Parry Husbands, Shoaib Kamil, Leonid Oliker, Katherine Yelick
ACM International Conference on Computing Frontiers, 2006
Analyzing Ultra-Scale Application Communication Requirements for a Reconfigurable Hybrid Interconnect
John Shalf, Shoaib Kamil, Leonid Oliker, David Skinner
IEEE/ACM Supercomputing (SC), 2005
Understanding Ultra-Scale Application Communication Requirements
Shoaib Kamil, Leonid Oliker, John Shalf, David Skinner
IEEE International Symposium on Workload Characterization (IISWC), 2005
Impact of Modern Memory Subsystems on Cache Optimizations for Stencil Computations
Shoaib Kamil, Parry Husbands, Leonid Oliker, John Shalf, Katherine Yelick
ACM SIGPLAN Workshop on Memory Systems Performance (MSP), 2005
Performance Optimizations and Bounds for Sparse Matrix-Vector Multiply
Richard Vuduc, James W. Demmel, Katherine A. Yelick, Shoaib Kamil, Rajesh Nishtala, Benjamin Lee
IEEE/ACM Supercomputing (SC), 2002
Finalist, Best Student Paper
Automatic Performance Tuning and Analysis of Sparse Triangular Solve
Richard Vuduc, Shoaib Kamil, Jen Hsu, Rajesh Nishtala, James W. Demmel, Katherine A. Yelick
Workshop on Performance Optimization of High-level Languages and Libraries (POHLL), 2002
Best Student Paper, Best Presentation
Other Publications
Ubiquitous Dynamic Code Generation and Compilation on Future Computing Devices
Shoaib Kamil and Armando Fox
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Provocative Ideas Session, 2012
Energy-Efficient Computing for Extreme Scale Science
David Donofrio, Leonid Oliker, John Shalf, Michael Wehner, Chris Rowen, Jens Krueger, Shoaib Kamil, Marghoob Mohiyuddin
IEEE Computer Magazine, 2009
Invited Talks
Recent Results, Insights, and Lessons from Auto-tuning Three Motifs
Center for Scalable Application Development Software (CScADS), 2008
Bridging the Productivity-Performance Gap with Selective Embedded Just-in-Time Specialization
IEEE International Symposium on Embedded Multicore SoCs, 2012
SEJITS - Bridging the Productivity-Performance Gap
Workshop on Domain Specific Multicore Computing (DSMC) at ICCAD, 2012