TBD
GPU Computing
Location, Time
Instructor: Xuhao Chen
Office Hours: Time, Location
Course Description
This course is an introduction to parallel computing using graphics processing units (GPUs). We will be focusing on CUDA programming, but the concepts taught will apply to other GPU frameworks as well. The course will start by covering CUDA syntax extensions and the CUDA runtime API, then move on to more advanced topics such as bandwidth optimization, memory access performance, and floating-point considerations. We will learn about common parallel computing patterns such as scans and reductions, and study use cases for GPU acceleration such as matrix multiplication and convolution.
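To give a flavor of the CUDA syntax extensions mentioned above, here is a minimal "vector add" example, a sketch for illustration only (it assumes unified memory via cudaMallocManaged to keep the host code short; explicit cudaMalloc/cudaMemcpy transfers are covered in the course):

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

// __global__ marks a kernel that runs on the GPU; each thread
// computes one output element, indexed by its block and thread IDs.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];   // guard against out-of-range threads
}

int main(void) {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);
    float *a, *b, *c;
    // Unified memory is accessible from both host and device.
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; i++) { a[i] = 1.0f; b[i] = 2.0f; }

    // Launch enough 256-thread blocks to cover all n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();         // wait for the kernel to finish

    printf("c[0] = %f\n", c[0]);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

The `<<<blocks, threads>>>` launch syntax and the `__global__` qualifier are the kind of C-language extensions the first lectures will introduce.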
Prerequisites
As CUDA is an extension of the C language, students taking this course should be familiar with C programming.
Prior knowledge of computer architecture concepts such as data locality will be useful but not required.
Grading
Grades for this course will be based on a series of 3-5 programming assignments in which students apply the GPU programming skills taught in the lectures.
Textbook (Optional)
Programming Massively Parallel Processors, Third Edition: A Hands-on Approach
David B. Kirk and Wen-mei W. Hwu.
The second edition is available online here
Computing Resources
For the programming assignments, students will need access to a computer with a CUDA-compatible GPU. I can help arrange access to a remote CUDA-capable machine for students without local access.
The NVIDIA Deep Learning Institute (DLI) Teaching Kit Program
WebGPU.com: A System for Online GPU Development
Fundamentals of Accelerated Computing with CUDA Python
Teach GPU Accelerated Computing: Hands-on with the NVIDIA Teaching Kit for Educators