18.408 Theoretical Foundations for Deep Learning

Spring 2021




Deep learning has sparked a revolution across machine learning. It has led to major advances in vision, speech, strategic game playing, and the sciences. And yet it remains largely a mystery: we do not understand why the algorithms we use work so well in practice.

In this class we will explore theoretical foundations for deep learning, emphasizing the following themes: (1) Approximation: What sorts of functions can be represented by deep networks, and does depth provably increase expressive power? (2) Optimization: Essentially all optimization problems we want to solve in practice are non-convex. What frameworks can be used to analyze such problems? (3) Beyond-Worst-Case Analysis: Deep networks can memorize worst-case data, so why do they generalize well on real-world data? For this and related questions, our starting point will often be natural data-generative models. The theory of deep learning is still very much a work in progress. Our goal in this course is merely to explain some of the key questions that drive this area and to take a critical look at where the existing theory falls short.

We will cover topics such as: Barron's theorem, depth separations, landscape analysis, implicit regularization, neural tangent kernels, generalization bounds, data poisoning attacks, and frameworks for proving lower bounds against deep learning.

Announcement: You should join the course Slack, where we will post all lectures, notes, homeworks, etc.

Course Information

Course Outline

Here is a tentative outline for the course:

Instructor Notes

Materials and Links

Here are links to some other resources you might find helpful: