
Contact Information
MIT CSAIL
32 Vassar St, 32-G864
Cambridge, MA 02139
E-mail:
Profile
Education
Ph.D., Electrical Engineering, University of Illinois Urbana-Champaign, 1979
M.S., Electrical Engineering, Purdue University, 1975
B.S., Electrical Engineering, Purdue University, 1974
Biography
Joel Emer is a Professor of the Practice at MIT's Electrical Engineering and Computer Science Department (EECS) and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL). He is also a Senior Distinguished Research Scientist at NVIDIA in Westford, MA, where he is responsible for exploration of future architectures as well as modeling and analysis methodologies. Prior to joining NVIDIA, he worked at Intel where he was an Intel Fellow and Director of Microarchitecture Research. Previously he worked at Compaq and Digital Equipment Corporation (DEC).
Dr. Emer has held various research and advanced development positions investigating processor micro-architecture and developing performance modeling and evaluation techniques. He has made architectural contributions to a number of VAX, Alpha and X86 processors and is recognized as one of the developers of the widely employed quantitative approach to processor performance evaluation. He has also been recognized for his contributions in the advancement of simultaneous multi-threading technology, analysis of the architectural impact of soft errors, memory dependence prediction, pipeline and cache organization, performance modeling methodologies and spatial architectures.
Dr. Emer received a bachelor's degree with highest honors in electrical engineering in 1974, and his master's degree in 1975, both from Purdue University. He earned a doctorate in electrical engineering from the University of Illinois in 1979 under the direction of Professor Edward Davidson.
Dr. Emer holds over 25 patents and has published more than 60 papers.
For more background on my career and some of my research philosophy, there is a talk I gave at the 2021 Young Architects Workshop.
Research Interests
- Accelerator Architectures for Sparse Computation
- Accelerator Architectures for Deep Learning
- Spatial Processing Architectures
- Parallel Processor Architectures
- Computer Memory Hierarchy Design
- Architecture and Analysis of Processor Reliability
- Performance Modeling Methodologies
- Programming Environments for FPGA-based Applications
Research Honors
- Paper "A Systematic Methodology to Compute the Architectural Vulnerability Factors for a High-Performance Microprocessor" selected for SIGMICRO Test of Time Award, 2022.
- Paper "Sparseloop: An Analytical Approach To Sparse Tensor Accelerator Modeling" selected for SIGMICRO Distinguished Artifact Award, 2022.
- Paper "Casa: End-to-end quantitative security analysis of randomly mapped caches" selected as an honorable mention in the Top Picks in Hardware and Embedded Security, 2022.
- Received the IASED Lifetime Achievement Award for 2022.
- Paper "A 0.32–128 TOPS, Scalable Multi-Chip-Module-Based Deep Neural Network Inference Accelerator With Ground-Referenced Signaling in 16nm" selected as JSSC best paper, 2021.
- Elected a member of the National Academy of Engineering "For quantitative analysis of computer architecture and its application to architectural innovation in commercial microprocessors", 2020.
- Paper "Simba: Scaling Deep-Learning Inference with Multi-Chip-Module-Based Architecture" selected for best paper award at Micro 2019 and as an ACM "Research Highlight" for 2020.
- Paper "Buffets: An Efficient and Composable Storage Idiom for Explicit Decoupled Data Orchestration" selected as an IEEE Micro "Top Picks in Computer Architecture" honorable mention for 2020.
- Paper "ExTensor: An Accelerator for Sparse Tensor Algebra" selected as an IEEE Micro "Top Picks in Computer Architecture" honorable mention for 2020.
- Paper "Hardware for Machine Learning: Challenges and Opportunities" selected as best invited paper at CICC for 2017.
- Paper "Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks" selected for IEEE Micro "Top Picks in Computer Architecture" for 2016.
- Paper "Data-Centric Execution of Speculative Parallel Programs" selected for IEEE Micro "Top Picks in Computer Architecture" honorable mention for 2016.
- Paper "A Scalable Architecture for Ordered Parallelism" selected for IEEE Micro "Top Picks in Computer Architecture" for 2015.
- Named to the Micro Hall of Fame, 2015.
- Paper "Using In-flight Chains to Build a Scalable Cache Coherence Protocol" selected for ACM Computing Reviews: Notable computing books and articles award for 2013.
- Paper "Triggered Instructions: A Control Paradigm for Spatially-Programmed Architectures" selected for IEEE Micro "Top Picks in Computer Architecture" for 2013.
- Recipient of University of Illinois Electrical and Computer Engineering Distinguished Alumni Award "For advancing the art of performance modeling and measurement of microarchitectures and for contributions to the design of leading-edge microprocessors", 2011.
- Paper "Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor" selected for ACM/SIGARCH - IEEE-CS/TCCA: Most Influential Paper Award, 2011.
- Named to the HPCA Hall of Fame, 2011.
- Recipient of Purdue University Outstanding Electrical and Computer Engineer Alumni Award, 2010.
- Recipient of ACM/IEEE-CS Eckert-Mauchly Award "For pioneering contributions to performance analysis and modeling methodologies; for design innovations in several significant industry microprocessors; and for deftly bridging research and development, academia and industry", 2009.
- Paper "Adaptive Insertion Policies for High Performance Caching" selected for IEEE Micro "Top Picks in Computer Architecture" for 2008.
- Named to the ISCA Hall of Fame, 2005.
- Named Fellow of the Association for Computing Machinery "For contributions to computer architecture and performance analysis.", 2004.
- Paper "Techniques to Reduce the Soft Error Rate of a High-Performance Microprocessor" selected for IEEE Micro "Top Picks in Computer Architecture" for 2004.
- Named Fellow of the Institute of Electrical and Electronics Engineers "For contributions to computer architecture and quantitative analysis of processor performance", 2004.
- Paper "A Systematic Methodology to Compute the Architectural Vulnerability Factors for a High-Performance Microprocessor" selected for IEEE Micro "Top Picks in Computer Architecture" for 2003.
- Paper "A Characterization of Processor Performance in the VAX-11/780" selected for reprint in "25 Years of the International Symposium on Computer Architecture", 1999.
Courses Taught
- 6.888 (now 6.812/6.825) - Hardware Architecture for Deep Learning
- 6.823 - Computer Systems Architecture
- 6.888 - Parallel and Heterogeneous Computer Architecture
- 6.S078 (now 6.175) - Constructive Computer Architecture
- 6.004 - Computation Structures
Students
- Tanner Andrulis (S.M./Ph.D., co-advised with Vivienne Sze)
- Fares Elsabbagh (S.M./Ph.D., co-advised with Daniel Sanchez)
- Xingran (Maggie) Du (S.M./Ph.D., co-advised with Daniel Sanchez)
- Andrew Feldman (MEng., co-advised with Vivienne Sze)
- Michael Gilbert (MEng., co-advised with Vivienne Sze)
- Jaeyeon Won (S.M./Ph.D., co-advised with Saman Amarasinghe)
- Yannan (Nellie) Wu (S.M./Ph.D., co-advised with Vivienne Sze)
- Zi Yu (Fisher) Xue (S.M./Ph.D., co-advised with Vivienne Sze)
- Yifan Yang (S.M./Ph.D., co-advised with Daniel Sanchez)
Alumni
- Yu-Hsin Chen (Ph.D. 2018, co-advised with Vivienne Sze)
- Elliott Fleming (Ph.D. 2013, co-advised with Arvind)
- Matt Fox (MEng. 2015, co-advised with Vivienne Sze)
- Michael Pellauer (Ph.D. 2010, co-advised with Arvind)
- Hsin-Jung Yang (Ph.D. 2017, co-advised with Srini Devadas)
Publications
Books/Book Chapters
-
Efficient processing of deep neural networks.
Vivienne Sze, Yu-Hsin Chen, Tien-Ju Yang, and Joel S. Emer.
Synthesis Lectures on Computer Architecture, 15(2):1--341, 2020.
Journal Publications
-
There's plenty of room at the top: What will drive computer performance after Moore's law?
Charles E. Leiserson, Neil C. Thompson, Joel S. Emer, Bradley C. Kuszmaul, Butler W. Lampson, Daniel Sanchez, and Tao B. Schardl.
Science, 368(6495), 2020. [web] -
How to evaluate deep neural network processors: TOPS/W (alone) considered harmful,
V. Sze, Y. H. Chen, T. J. Yang, and J. S. Emer.
IEEE Solid-State Circuits Magazine, 12(3):28--41, 2020. [paper] -
A 0.32–128 TOPS, Scalable Multi-Chip-Module-Based Deep Neural Network Inference Accelerator With Ground-Referenced Signaling in 16 nm.
B. Zimmer, R. Venkatesan, Y. S. Shao, J. Clemons, M. Fojtik, N. Jiang, B. Keller, A. Klinefelter, N. Pinckney, P. Raina, S. G. Tell, Y. Zhang, W. J. Dally, J. S. Emer, C. T. Gray, S. W. Keckler, and B. Khailany.
IEEE Journal of Solid-State Circuits, 55(4):920--932, 2020.
(JSSC "Best Paper" of 2021) [paper] -
Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices
Y. Chen, T. Yang, J. Emer, and V. Sze.
IEEE Journal on Emerging and Selected Topics in Circuits and Systems, 9(2):292-308, June 2019. [paper] -
Efficient Processing of Deep Neural Networks: A Tutorial and Survey.
Vivienne Sze; Yu-Hsin Chen; Tien-Ju Yang; Joel S. Emer,
Proceedings of the IEEE, Volume: 105, Issue: 12, December 2017. [paper] -
Using Dataflow to Optimize Energy Efficiency of Deep Neural Network Accelerators,
Yu-Hsin Chen, Joel Emer, Vivienne Sze,
IEEE Micro's Top Picks from the Computer Architecture Conferences, May/June 2017. [paper] -
Scavenger: Automating the Construction of Application-Optimized Memory Hierarchies,
Hsin-Jung Yang, Kermin Fleming, Felix Winterstein, Michael Adler, Joel Emer,
ACM Transactions on Reconfigurable Technology and Systems (TRETS) March 2017. [paper] -
Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks,
Yu-Hsin Chen, Tushar Krishna, Joel Emer, Vivienne Sze,
IEEE Journal of Solid State Circuits (JSSC), ISSCC Special Issue, Vol. 52, No. 1, pp. 127-138, January 2017. [paper] -
Unlocking Ordered Parallelism with the Swarm Architecture,
Mark C. Jeffrey, Suvinay Subramanian, Cong Yan, Joel Emer, Daniel Sanchez,
in IEEE Micro's Top Picks from the Computer Architecture Conferences, May/June 2016. [paper] -
Efficient Control and Communication Paradigms for Coarse-Grained Spatial Architectures,
Michael Pellauer, Angshuman Parashar, Michael Adler, Bushra Ahsan, Randy Allmon, Neal Crago, Kermin Fleming, Mohit Gambhir, Aamer Jaleel, Tushar Krishna, Daniel Lustig, Stephen Maresh, Vladimir Pavlov, Rachid Rayess, Antonia Zhai and Joel Emer,
ACM Transactions on Computer Systems, May 2015. [paper] -
Efficient Spatial Processing Element Control via Triggered Instructions,
Angshuman Parashar, Michael Pellauer, Michael Adler, Bushra Ahsan, Neal Crago, Daniel Lustig, Vladimir Pavlov, Antonia Zhai, Mohit Gambhir, Aamer Jaleel, Randy Allmon, Rachid Rayess, Stephen Maresh, Joel Emer,
in IEEE Micro's Top Picks from the Computer Architecture Conferences, May 2014. [paper] -
Using In-flight Chains to Build a Scalable Cache Coherence Protocol,
Samantika Subramaniam, Simon C. Steely Jr., William Hasenplaugh, Aamer Jaleel, Carl Beckmann, Tryggve Fossum, Joel Emer,
ACM Transactions on Architecture and Code Optimization (TACO). Presented at International Conference on High Performance and Embedded Architectures and Compilers (HiPEAC), Vienna, Austria, January 2014. [paper] -
The Gradient-based Cache Replacement Algorithm,
William Hasenplaugh, Pritpal Ahuja, Aamer Jaleel, Simon Steely, Joel Emer,
ACM Transactions on Architecture and Code Optimization (TACO), Presented at International Conference on High Performance and Embedded Architectures and Compilers (HiPEAC), January 2012. [paper] -
A-Port Networks: Preserving the Timed Behavior of Synchronous Systems for Modeling on FPGAs,
Michael Pellauer, Muralidaran Vijayaraghavan, Michael Adler, Arvind, Joel Emer,
Transactions on Reconfigurable Technology and Systems (TRETS), Volume 2 Issue 3, September 2009. [paper] -
Set-Dueling-Controlled Adaptive Insertion for High-Performance Caching,
Moinuddin K. Qureshi, Aamer Jaleel, Yale N. Patt, Simon C. Steely Jr., Joel Emer,
in IEEE Micro's Top Picks from the Computer Architecture Conferences, January 2008. [paper] -
Reducing the Soft-Error Rate of a High-Performance Microprocessor,
Christopher Weaver, Joel Emer, Shubhendu Mukherjee, Steven Reinhardt,
in IEEE Micro's Top Picks from the Computer Architecture Conferences, November 2004. [paper] -
Measuring Architectural Vulnerability Factors
Shubhendu Mukherjee, Christopher Weaver, Joel Emer, Steven Reinhardt, Todd Austin,
in IEEE Micro's Top Picks from the Computer Architecture Conferences, November 2003. [paper] -
Simultaneous Multithreading: A Platform for Next-generation Processors,
Susan Eggers, Joel Emer, Henry Levy, Jack Lo, Rebecca Stamm, and Dean Tullsen,
IEEE Micro, September/October 1997. [paper] -
Converting Thread-Level Parallelism Into Instruction-Level Parallelism via Simultaneous Multithreading,
Jack Lo, Susan Eggers, Joel Emer, Henry Levy, Rebecca Stamm, Dean Tullsen,
ACM Transactions on Computer Systems, August 1997. [paper] -
Performance Analysis of Mass Storage Service Alternatives for Distributed Systems,
K.K. Ramakrishnan, J.S. Emer,
IEEE Transactions on Software Engineering, February 1989. -
Design and Implementation of the VAX Distributed File Service,
W.G. Nichols, J.S. Emer,
Digital Technical Journal, June 1989. -
Performance of the VAX-11/780 Translation Buffer: Simulation and Measurement,
J. Emer and D.W. Clark,
ACM Transactions on Computer Systems, February 1985.
(Reprinted in Readings in Computer Architecture, 2000.)
Conference Publications
-
Accelerating Sparse Data Orchestration via Dynamic Reflexive Tiling
T. Odemuyiwa, H. Asghari-Moghaddam, M. Pellauer, K. Hegde, P. Tsai, N. Crago, A. Jaleel, J. Owens, E. Solomonik, J. S. Emer, and C. Fletcher
In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), March 2023. (to appear) -
The Sparse Abstract Machine
Olivia Hsu, Maxwell Strange, Ritvik Sharma, Jaeyeon Won, Kunle Olukotun, Joel S Emer, Mark Horowitz, Fredrik Kjølstad
In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), March 2023. (to appear) -
WACO: Learning Workload-Aware Co-optimization of the Format and Schedule of a Sparse Tensor Program
Jaeyeon Won, Charith Mendis, Joel S. Emer, Saman Amarasinghe
In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), March 2023. (to appear) -
ISOSceles: Accelerating Sparse CNNs through Inter-Layer Pipelining
Yifan Yang, Joel S. Emer, Daniel Sanchez
In Proceedings of the 29th international symposium on High Performance Computer Architecture (HPCA-29), February 2023. (to appear) -
Sparseloop: An Analytical Approach To Sparse Tensor Accelerator Modeling
Yannan Nellie Wu, Po-An Tsai, Angshuman Parashar, Vivienne Sze, Joel Emer
In 55th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), October 2022.
(SIGMICRO Distinguished Artifact Award) [paper] -
Ruby: Improving Hardware Efficiency for Tensor Algebra Accelerators Through Imperfect Factorization
Mark Horeni, Pooria Taheri, Po-An Tsai, Angshuman Parashar, Joel Emer, Siddharth Joshi
In 2022 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), May 2022. [paper] -
DAGguise: Mitigating Memory Timing Side Channels
Peter W. Deutsch*, Yuheng Yang*, Thomas Bourgeat, Jules Drean, Joel Emer, Mengjia Yan
In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), April 2022. [paper] -
Mentoring Opportunities in Computer Architecture: Analyzing the Past to Develop the Future
Elba Garza, Gururaj Saileshwar, Tianyi Liu, Abdulrahman Mahmoud, Saugata Ghose, Joel Emer
In Workshop on Computer Architecture Education (WCAE) at the International Symposium on Computer Architecture (ISCA), June 2021 [paper] [slides] [lightning talk] [talk+panel] -
SpZip: Architectural Support for Effective Data Compression In Irregular Applications.
Yifan Yang, Joel Emer, Daniel Sanchez
Proceedings of the 48th Annual International Symposium on Computer Architecture (ISCA-48), June 2021. [paper] -
Gamma: Exploiting Gustavson's Algorithm to Accelerate Sparse Matrix Multiply.
Guowei Zhang, Nithya Attaluri, Joel Emer, Daniel Sanchez
In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), April 2021. [paper] -
Casa: End-to-end quantitative security analysis of randomly mapped caches.
T. Bourgeat, J. Drean, Y. Yang, L. Tsai, J. Emer, and M. Yan.
In 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pages 1110--1123, 2020. [paper] -
An architecture-level energy and area estimator for processing-in-memory accelerator designs.
Y. N. Wu, V. Sze, and J. S. Emer.
In 2020 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pages 116--118, 2020. -
Digital optical neural networks for large-scale machine learning.
L. Bernstein, A. Sludds, R. Hamerly, V. Sze, J. Emer, and D. Englund.
In 2020 Conference on Lasers and Electro-Optics (CLEO), pages 1--2, 2020. -
MAGNet: A Modular Accelerator Generator for Neural Networks.
Rangharajan Venkatesan, Yakun Sophia Shao, Miaorong Wang, Jason Clemons, Steve Dai, Matthew Fojtik, Ben Keller, Alicia Klinefelter, Nathaniel Pinckney, Priyanka Raina, Yanqing Zhang, Brian Zimmer, William J. Dally, Joel Emer, Stephen W. Keckler, and Brucek Khailany.
In Proceedings of the International Conference on Computer-Aided Design (ICCAD), November 2019. [paper] -
Accelergy: An Architecture-Level Energy Estimation Methodology for Accelerator Designs.
Yannan Nellie Wu, Joel Emer, and Vivienne Sze.
In Proceedings of the International Conference on Computer-Aided Design (ICCAD), November 2019. [paper] -
Simba: Scaling Deep-Learning Inference with Multi-Chip-Module-Based Architecture.
Yakun Sophia Shao, Jason Clemons, Rangharajan Venkatesan, Brian Zimmer, Matthew Fojtik, Nan Jiang, Ben Keller, Alicia Klinefelter, Nathaniel Pinckney, Priyanka Raina, Stephen G. Tell, Yanqing Zhang, William J. Dally, Joel Emer, C. Thomas Gray, Brucek Khailany, and Stephen W. Keckler.
In 2019 52nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), October 2019.
(Micro 2019 "Best Paper" and ACM 2020 "Research Highlight") [paper] -
ExTensor: An Accelerator for Sparse Tensor Algebra.
Kartik Hegde, Hadi Asghari-Moghaddam, Michael Pellauer, Neal Crago, Aamer Jaleel, Edgar Solomonik, Joel Emer, and Christopher W. Fletcher.
In 2019 52nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), October 2019.
(IEEE Micro’s Top Picks 2019 Honorable Mention) [paper] -
A 0.11 pJ/Op, 0.32-128 TOPS, Scalable Multi-Chip-Module-based Deep Neural Network Accelerator Designed with a High-Productivity VLSI Methodology.
R. Venkatesan, Y. S. Shao, B. Zimmer, J. Clemons, M. Fojtik, N. Jiang, B. Keller, A. Klinefelter, N. Pinckney, P. Raina, S. G. Tell, Y. Zhang, W. J. Dally, J. S. Emer, C. T. Gray, S. W. Keckler, and B. Khailany.
In Hot Chips, August 2019. -
A 0.11 pJ/Op, 0.32-128 TOPS, Scalable Multi-Chip-Module-based Deep Neural Network Accelerator with Ground-Reference Signaling in 16nm.
B. Zimmer, R. Venkatesan, Y. S. Shao, J. Clemons, M. Fojtik, N. Jiang, B. Keller, A. Klinefelter, N. Pinckney, P. Raina, S. G. Tell, Y. Zhang, W. J. Dally, J. S. Emer, C. T. Gray, S. W. Keckler, and B. Khailany.
In 2019 Symposium on VLSI Circuits, June 2019. [paper] -
Buffets: An Efficient and Composable Storage Idiom for Explicit Decoupled Data Orchestration.
Michael Pellauer, Yakun Sophia Shao, Jason Clemons, Neal Crago, Kartik Hegde, Rangharajan Venkatesan, Stephen W. Keckler, Christopher W. Fletcher, and Joel Emer.
In Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, April 2019.
(IEEE Micro’s Top Picks 2019 Honorable Mention) [paper] -
Timeloop: A Systematic Approach to DNN Accelerator Evaluation.
A. Parashar, P. Raina, Y. S. Shao, Y. Chen, V. A. Ying, A. Mukkara, R. Venkatesan, B. Khailany, S. W. Keckler, and J. Emer.
In 2019 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), March 2019. [paper] -
Harmonizing Speculative and Non-Speculative Execution in Architectures for Ordered Parallelism.
M. C. Jeffrey, V. A. Ying, S. Subramanian, H. R. Lee, J. Emer, and D. Sanchez.
In 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), October 2018. [paper] -
DAWG: A Defense Against Cache Timing Attacks in Speculative Execution Processors.
V. Kiriansky, I. Lebedev, S. Amarasinghe, S. Devadas, and J. Emer.
In 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), October 2018. [paper] -
A Modular Digital VLSI Flow for High-Productivity SoC Design.
B. Khailany, E. Krimer, R. Venkatesan, J. Clemons, J. S. Emer, M. Fojtik, A. Klinefelter, M. Pellauer, N. Pinckney, Y. S. Shao, S. Srinath, C. Torng, S. L. Xi, Y. Zhang, and B. Zimmer.
In 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC), June 2018. [paper] -
Understanding the Limitations of Existing Energy-Efficient Design Approaches for Deep Neural Networks.
Y. Chen, T. Yang, J. Emer, and V. Sze.
In SysML Conference, February 2018. [paper] -
Stitch-X: An Accelerator Architecture for Exploiting Unstructured Sparsity in Deep Neural Networks.
C. Lee, Y. Shao, J. Zhang, A. Parashar, J. Emer, S. Keckler, and Z. Zhang.
In SysML Conference, February 2018. [paper] -
Efficient Processing of Deep Neural Networks: A Tutorial and Survey.
V. Sze, Y. Chen, T. Yang, and J. S. Emer.
Proceedings of the IEEE, 105(12):2295-2329, December 2017. [paper] -
Understanding Error Propagation in Deep Learning Neural Network (DNN) Accelerators and Applications.
Guanpeng Li, Siva Kumar Sastry Hari, Michael Sullivan, Timothy Tsai, Karthik Pattabiraman, Joel Emer, and Stephen W. Keckler.
In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, November 2017. [paper] -
A method to estimate the energy consumption of deep neural networks.
T. Yang, Y. Chen, J. Emer, and V. Sze.
In 2017 51st Asilomar Conference on Signals, Systems, and Computers, October 2017. [paper] -
Understanding Error Propagation in Deep Learning Neural Network (DNN) Accelerators and Applications,
Guanpeng Li, Siva Hari, Michael Sullivan, Timothy Tsai, Karthik Pattabiraman, Joel Emer, Stephen W. Keckler,
Supercomputing 2017, November 2017. [paper] -
SAM: Optimizing Multithreaded Cores for Speculative Parallelism,
Maleen Abeydeera, Suvinay Subramanian, Mark C. Jeffrey, Joel Emer, Daniel Sanchez,
in Proceedings of the 26th international conference on Parallel Architectures and Compilation Techniques (PACT-26), September 2017. [paper]
-
SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks,
Angshuman Parashar, Minsoo Rhu, Anurag Mukkara, Antonio Puglielli, Rangharajan Venkatesan, Brucek Khailany, Joel Emer, Stephen W Keckler, William J Dally
Proceedings of the 44th Annual International Symposium on Computer Architecture (ISCA-44) June 2017. [paper] -
Fractal: An Execution Model for Fine-Grain Nested Speculative Parallelism,
Suvinay Subramanian, Mark C. Jeffrey, Maleen Abeydeera, Hyun Ryong Lee, Victor A. Ying, Joel Emer, Daniel Sanchez,
Proceedings of the 44th International Symposium on Computer Architecture (ISCA-44), June 2017. [paper]
-
Towards Closing the Energy Gap Between HOG and CNN Features for Embedded Vision,
Amr Suleiman, Yu-Hsin Chen, Joel Emer, Vivienne Sze,
IEEE International Symposium on Circuits and Systems (ISCAS), Invited Paper, May 2017. [paper] -
Hardware for Machine Learning: Challenges and Opportunities,
Vivienne Sze, Yu-Hsin Chen, Joel Emer, Amr Suleiman, Zhengdong Zhang,
IEEE Custom Integrated Circuits Conference (CICC), Invited Paper, April 2017.
(Received award for best invited paper). [paper] -
SASSIFI: An architecture-level fault injection tool for GPU application resilience evaluation
Siva Kumar Sastry Hari, Timothy Tsai, Mark Stephenson, Stephen W Keckler, Joel Emer,
2017 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), April 2017. [paper] -
Automatic Construction of Program-Optimized FPGA Memory Networks,
Hsin-Jung Yang, Kermin Fleming, Felix Winterstein, Annie I Chen, Michael Adler, Joel Emer,
Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, February 2017. [paper] -
Data-Centric Execution of Speculative Parallel Programs,
Mark C. Jeffrey, Suvinay Subramanian, Maleen Abeydeera, Joel Emer, Daniel Sanchez,
in Proceedings of the 49th annual IEEE/ACM international symposium on Microarchitecture (MICRO-49), October 2016.
(IEEE Micro’s Top Picks 2016 Honorable Mention) [paper]
-
CLARA: Circular Linked-List Auto and Self Refresh Architecture,
Aditya Agrawal, Mike O'Connor, Evgeny Bolotin, Niladrish Chatterjee, Joel Emer, Stephen Keckler,
Proceedings of the Second International Symposium on Memory Systems October 2016. [paper] -
Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks,
Yu-Hsin Chen, Joel Emer, Vivienne Sze,
International Symposium on Computer Architecture (ISCA), June 2016.
(IEEE Micro's Top Picks 2016) [paper] -
LMC: Automatic Resource-Aware Program-Optimized Memory Partitioning,
Hsin-Jung Yang, Kermin Fleming, Michael Adler, Felix Winterstein, Joel Emer,
24th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (ISFPGA), February 2016. [paper] -
Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks,
Yu-Hsin Chen, Tushar Krishna, Joel Emer, Vivienne Sze,
IEEE International Solid-State Circuits Conference (ISSCC), February 2016. [paper] [slides] (Project website) -
A Fast and Accurate Analytical Technique to Compute the AVF of Sequential Bits in a Processor
Steven Raasch, Arijit Biswas, Jon Stephan, Paul Racunas, Joel Emer,
International Symposium on Microarchitecture (MICRO), December 2015. [paper] -
A Scalable Architecture for Ordered Parallelism
Mark C. Jeffrey, Suvinay Subramanian, Cong Yan, Joel Emer, Daniel Sanchez,
International Symposium on Microarchitecture (MICRO), December 2015.
(IEEE Micro's Top Picks 2015). [paper] -
Scavenger: Automating the Construction of Application-Optimized Memory Hierarchies
Hsin-Jung Yang, Kermin Fleming, Michael Adler, Felix Winterstein, Joel Emer,
International Conference on Field Programmable Logic and Applications (FPL), London, England, September 2015. [paper] -
SASSIFI: Evaluating Resilience of GPU Applications
Siva Kumar Sastry Hari, Timothy Tsai, Mark Stephenson, Steve Keckler and Joel Emer,
SELSE, March 2015. -
High Performing Cache Hierarchies for Server Workloads -- Relaxing Inclusion to Capture the Latency Benefits of Exclusive Caches,
Aamer Jaleel, Joseph Nuzman, Adrian Moga, Simon Steely, and Joel Emer,
Industry Session of International Symposium on High Performance Computer Architecture (HPCA), San Francisco, CA, February 2015. [paper] -
The LEAP FPGA Operating System,
Kermin Fleming, Hsin-Jung Yang, Michael Adler, Joel Emer,
International Conference on Field Programmable Logic and Applications (FPL), Munich, Germany, September 2014. [paper] -
LEAP Shared Memories: Automating the Construction of FPGA Coherent Memories,
Hsin-Jung Yang, Kermin Fleming, Michael Adler, Joel Emer,
IEEE International Symposium on Field-Programmable Custom Computing Machines, Boston, Massachusetts, May 2014. [paper] -
Exploiting Spatial Architectures for Edit Distance Algorithms,
Jesmin Tithi, Neal Crago, Joel Emer,
ISPASS 2014, March 2014. Best paper nomination. [paper] -
Triggered Instructions: A Control Paradigm for Spatially-Programmed Architectures,
Angshuman Parashar, Michael Pellauer, Michael Adler, Bushra Ahsan, Neal Crago, Daniel Lustig, Vladimir Pavlov, Antonia Zhai, Mohit Gambhir, Aamer Jaleel, Randy Allmon, Rachid Rayess, Stephen Maresh, Joel Emer,
In International Symposium on Computer Architecture (ISCA), June 2013.
Selected for (IEEE Micro's Top Picks 2014). [paper] -
A Hierarchical Architectural Framework for Reconfigurable Logic Computing,
Peng Li, A. Parashar, M. Pellauer, Tao Wang, J. Emer,
IEEE 27th International Parallel and Distributed Processing Symposium (IPDPSW), January 2013. -
Optimizing under abstraction: Using prefetching to improve FPGA performance,
Hsin-Jung Yang, K. Fleming, M. Adler, J. Emer,
23rd International Conference on Field Programmable Logic and Applications (FPL), January 2013. [paper] -
Scheduling Heterogeneous Multi-Cores through Performance Impact Estimation (PIE),
Kenzo Van Craeynest, Aamer Jaleel, Lieven Eeckhout, Paolo Narvaez, Joel Emer,
In International Symposium on Computer Architecture (ISCA), June 2012. [paper] -
CRUISE: Cache Replacement and Utility-aware Scheduling,
Aamer Jaleel, Hashem H. Najaf-abadi, Samantika Subramaniam, Simon C. Steely, Joel Emer,
Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), March 2012. [paper] -
Leveraging Latency-insensitivity to Ease Multiple FPGA Design,
Kermin Elliott Fleming, Michael Adler, Michael Pellauer, Angshuman Parashar, Arvind, and Joel S. Emer,
Proceedings of the ACM/SIGDA 20th International Symposium on Field Programmable Gate Arrays (FPGA 2012), February 2012. [paper] -
ZIP-IO: Architecture for application-specific compression of Big Data,
Sang Woo Jun, K.E. Fleming, M. Adler, J. Emer,
2012 International Conference on Field-Programmable Technology (FPT), January 2012. [paper] -
SHiP: Signature-based Hit Predictor for High Performance Caching,
Carole-Jean Wu, Aamer Jaleel, William Hasenplaugh, Margaret Martonosi, Simon C. Steely Jr, Joel Emer,
International Symposium on Microarchitecture (MICRO), December 2011. [paper] -
PACMan: Prefetch-Aware Cache Management for High Performance Caching,
Carole-Jean Wu, Aamer Jaleel, Margaret Martonosi, Simon C. Steely Jr, Joel Emer,
International Symposium on Microarchitecture (MICRO), December 2011. [paper] -
Leap Scratchpads: Automatic Memory and Cache Management for Reconfigurable Logic,
Michael Adler, Kermin Fleming, Angshuman Parashar, Michael Pellauer, Joel S. Emer,
Proceedings of the ACM/SIGDA 19th International Symposium on Field Programmable Gate Arrays, FPGA 2011, February, 2011. [paper] -
HAsim: FPGA-based High-detail Multicore Simulation Using Time-division Multiplexing,
Michael Pellauer, Michael Adler, Michel A. Kinsy, Angshuman Parashar, and Joel S. Emer,
Proceedings of the 17th International Conference on High-Performance Computer Architecture (HPCA-17 2011), February, 2011. [paper] -
Achieving Non-Inclusive Cache Performance With Inclusive Caches -- Temporal Locality Aware (TLA) Cache Management Policies,
Aamer Jaleel, Eric Borch, Malini Bhandaru, Simon C. Steely Jr, Joel Emer,
International Symposium on Microarchitecture (MICRO), Atlanta, Georgia, December 2010. [paper] -
High Performance Cache Replacement Using Re-Reference Interval Prediction (RRIP),
Aamer Jaleel, Kevin Theobald, Simon Steely Jr., and Joel Emer,
Proceedings of the 37th Annual Symposium on Computer Architecture, June 2010. [paper] -
CAMP: A Technique to Estimate Per-Structure Power at Run-time using a Few Simple Parameters,
Michael D. Powell, Arijit Biswas, Joel Emer, Shubhendu S. Mukherjee, Basit R. Sheikh, Shrirang Yardi,
HPCA-15, February 2009. [paper] -
Quick Performance Models Quickly: Closely-Coupled Partitioned Simulation on FPGAs,
Michael Pellauer, Muralidaran Vijayaraghavan, Michael Adler, Arvind, Joel Emer,
Proceedings of ISPASS-2008, April 2008. [paper] -
Soft connections: Addressing the Hardware-design Modularity Problem,
Michael Pellauer, Michael Adler, Derek Chiou, Joel S. Emer,
Proceedings of the 46th Design Automation Conference, DAC 2009, July, 2009. [paper] -
Adaptive Insertion Policies for Managing Shared Caches on CMPs,
Aamer Jaleel, William Hasenplaugh, Moinuddin Qureshi, Julien Sebot, Simon C. Steely Jr, and Joel Emer,
In the International Conference on Parallel Architectures and Compiler Techniques (PACT), Toronto, Canada, October 2008. [paper] -
A-Ports: An Efficient Abstraction for Cycle-Accurate Models on FPGAs,
Michael Pellauer, Muralidaran Vijayaraghavan, Michael Adler, Arvind, Joel Emer,
Proceedings of FPGA-08, February 2008. [paper] -
Adaptive Insertion Policies for High Performance Caching,
Moinuddin Qureshi, Aamer Jaleel, Yale Patt, Simon Steely, Joel Emer,
Proceedings of the 34th Annual Symposium on Computer Architecture, June 2007.
(IEEE Micro's Top Picks 2008). [paper] -
Late-Binding: Enabling Unordered Load-Store Queues,
Simha Sethumadhavan, Franzi Roesner, Joel Emer, Doug Burger, Steve Keckler,
Proceedings of the 34th Annual Symposium on Computer Architecture, June 2007. [paper] -
Computing the Architectural Vulnerability Factor for Address-Based Structures,
Arijit Biswas, Paul Racunas, Raz Cheveresan, Joel Emer, Shubhendu S. Mukherjee, Ram Rangan,
Proceedings of the 32nd Annual Symposium on Computer Architecture, June 2005. [paper] -
The Soft Error Problem: an Architectural Perspective,
Shubhendu S. Mukherjee, Joel Emer, Steven K. Reinhardt,
HPCA, Feb. 2005. [paper] -
Techniques to Reduce the Soft Error Rate of a High-Performance Microprocessor,
Christopher Weaver, Joel Emer, Shubhendu S. Mukherjee, Steven K. Reinhardt,
Proceedings of the 31st Annual Symposium on Computer Architecture, June 2004.
(IEEE Micro's Top Picks 2004). [paper] -
Cache Scrubbing in Microprocessors: Myth or Necessity?,
Shubhendu S. Mukherjee, Joel Emer, Tryggve Fossum, and Steven K. Reinhardt,
10th IEEE PRDC, March 2004. [paper] -
A Systematic Methodology to Compute the Architectural Vulnerability Factors for a High-Performance Microprocessor,
Shubhendu Mukherjee, Christopher Weaver, Joel Emer, Steve Reinhardt, and Todd Austin,
MICRO, December 2003.
(IEEE Micro's Top Picks 2003). [paper] -
A Comparative Study of Arbitration Algorithms for the Alpha 21364 Router Pipeline,
Shubhendu Mukherjee, Federico Silla, Peter Bannon, Joel Emer, Steve Lang, Dave Webb,
Tenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), October 2002. [paper] -
Tarantula: A Vector Extension to the Alpha Architecture,
Roger Espasa, Federico Ardanaz, Joel Emer, Stephen Felix, Julio Gago, Roger Gramunt, Isaac Hernandez, Toni Juan, Geoff Lowney, Matthew Mattina, and Andre Seznec,
Proceedings of the 29th Annual Symposium on Computer Architecture, June 2002. [paper] -
ASIM: A performance model framework,
Joel Emer, Pritpal Ahuja, Eric Borch, Artur Klauser, Chi-Keung Luk, Srilatha Manne, Shubhendu S. Mukherjee, Harish Patil, Steven Wallace, Nathan Binkert, Roger Espasa, Toni Juan,
IEEE Computer, Feb. 2002. [paper] -
Loose Loops Sink Chips,
Eric Borch, Eric Tune, Srilatha Manne, Joel Emer,
HPCA8, Feb. 2002. [paper] -
Combining Static and Dynamic Branch Prediction to Reduce Destructive Aliasing,
Harish Patil, Joel Emer,
HPCA6, Jan. 2000. [paper] -
The Use of Multithreading for Exception Handling,
Craig B. Zilles, Joel S. Emer, Gurindar S. Sohi,
Micro-32, Nov. 1999. [paper] -
Memory Dependence Prediction using Store Sets,
George Chrysos, Joel Emer,
Proceedings of the 25th Annual Symposium on Computer Architecture, June 1998. [paper] -
A Language for Describing Predictors and its Application to Automatic Synthesis,
Joel Emer, Nick Gloy,
Proceedings of the 24th Annual Symposium on Computer Architecture, June 1997. [paper] -
Architecture of a Flexible Real-time Video Encoder/Decoder: The DECchip 21230,
Matthew Adiletta, Debra Bernstein, Samuel Ho, Joel Emer, Bill Wheeler,
Proceedings of SPIE/IS&T Electronic Imaging Symposium, October 1996. -
Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor,
Dean Tullsen, Susan Eggers, Joel Emer, Hank Levy, Jack Lo and Rebecca Stamm,
Proceedings of the 23rd Annual Symposium on Computer Architecture, May 1996.
(Reprinted in "Readings in Computer Architecture" by Hill, Jouppi and Sohi, 2000. Winner, ISCA Test of Time award 2011). [paper] -
Predictive Sequential Associative Cache,
Brad Calder, Dirk Grunwald, Joel Emer,
Proceedings of HPCA-2, February 1996. [paper] -
A System Level Perspective on Branch Architecture Performance,
Brad Calder, Dirk Grunwald, Joel Emer,
Proceedings of Micro-28, November 1995. [paper] -
Instruction Fetching: Coping with Code Bloat,
Rich Uhlig, David Nagle, Trevor Mudge, Stuart Sechrest, Joel Emer,
Proceedings of the 22nd Annual Symposium on Computer Architecture, June 1995. [paper] -
Integrated Access to Heterogeneous Distributed Services,
J.S. Emer, W.E. Weihl,
Proceedings of Winter USENIX, January 1990. -
Performance Considerations for Distributing Services - A Case Study: Mass Storage,
J.S. Emer, K.K. Ramakrishnan,
Proceedings of the 8th International Conference on Distributed Computing Systems, June 1988. -
A Model of File Server Performance for a Heterogeneous Distributed System,
K.K. Ramakrishnan, J.S. Emer,
Proceedings of the ACM SIGCOMM '86 Symposium, August 1986. -
A Programmable Interface Language for Heterogeneous Distributed Systems,
J.S. Emer, J.R. Falcone,
DEC Technical Report, DEC-TR-371, August 1985. -
A Characterization of Processor Performance in the VAX-11/780,
J.S. Emer, D. W. Clark,
Proceedings of the 11th International Symposium on Computer Architecture, May 1984.
(Reprinted in 25 Years of the International Symposium on Computer Architecture, 1999, and Readings in Computer Architecture, 2000). [paper] -
Control Store Organization for Multiple Stream Pipelined Processors,
J.S. Emer, E. S. Davidson,
Proceedings of the 1978 International Conference on Parallel Processing, August 1978.
Other Publications
-
Freely scalable and reconfigurable optical hardware for deep learning,
Liane Bernstein, Alexander Sludds, Ryan Hamerly, Vivienne Sze, Joel Emer, and Dirk Englund.
arXiv, 2020. [arXiv] -
Estimating Silent Data Corruption Rates Using a Two-Level Model,
Siva Kumar Sastry Hari, Paolo Rech, Timothy Tsai, Mark Stephenson, Arslan Zulfiqar, Michael Sullivan, Philip Shirvani, Paul Racunas, Joel Emer, Stephen W. Keckler.
arXiv, 2020. [arXiv] -
Efficient Processing of Deep Neural Networks: A Tutorial and Survey,
Vivienne Sze, Tien-Ju Yang, Yu-Hsin Chen, Joel Emer.
arXiv, August 2017. [paper]