6.896: Probability and Computation
Recent years have seen a remarkable interplay between Computer Science and Probability. The ability of computers to toss coins has led to algorithms that are more efficient, conceptually simpler, and more elegant than their best known deterministic counterparts. At the same time, research in computation has influenced the questions and techniques of probability theory.
This class aims to study advanced probabilistic techniques and their applications in computation, drawing examples from algorithms, random structures, and combinatorics. The class will focus on Markov chain Monte Carlo theory and applications; Statistical Physics and its applications to Phylogenetics; and, if time permits, Harmonic Analysis of Boolean functions with applications to Computational Learning Theory, and Smoothed Analysis. A tentative syllabus is presented below.
- Instructor: Constantinos Daskalakis
Office Hours: Send me an email.
- TA: Ankur Moitra
Office Hours: Send Ankur an email.
- Lecture: Mondays and Wednesdays 1:00-2:30, ROOM: 2-105.
- Textbook: Lecture notes and/or presentations will be provided.
- Prerequisites: A course in algorithms (6.046 or equivalent), probability (6.041 or equivalent), and discrete mathematics (6.042 or equivalent).
- Assessment: Students will be required to scribe one or two lectures, do a small number of homework problems, and complete a research-oriented final project. The project could be either a survey or original research; students will be encouraged to relate it to their own research interests.
There is a significant dynamic component to the course, as topics drop
in and out, or get longer or shorter treatment, depending on audience
interest/reaction/resistance. We consider this a feature. Given this,
here is a rough outline of the course material:
- Markov Chain Monte Carlo
- basics of Markov chains: ergodicity, stationary distributions, reversibility, mixing time, cover time;
- bounds on mixing: conductance, multicommodity flows and canonical paths, coupling, advanced coupling techniques (path coupling, coupling from the past), eigenvalues, Cheeger's Inequality, average conductance;
- applications of MCMC: card shuffling, independent sets and colorings, approximate counting, volume and integration, approximating the permanent, biased sampling, network reliability.
- Statistical Physics and Phylogenetics
- Statistical Physics: Gibbs measures, the uniqueness problem, phase transitions, correlation decay, the Ising model;
- Phylogenetics: introduction, the quartet method, parsimony, maximum likelihood, connection to the Ising model, Steel's conjecture.
- Computational Learning Theory (if time permits)
- PAC Learning model, hardness of proper learning, VC-dimension and applications to learning, inherent unpredictability;
- harmonic analysis, noise sensitivity, influence, applications to learning intersections of halfspaces and learning juntas.
- Smoothed Analysis (if time permits)
- smoothed analysis model, applications to TSP and the perceptron algorithm;
- discrete smoothed analysis, smoothed complexity of knapsack, equivalence between smoothed polynomial time and pseudopolynomial time, Pareto optimal solutions.
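As a small taste of the MCMC paradigm in the outline above, here is a minimal sketch (in Python, on a toy state space with weights chosen purely for illustration, not an example from the course) of the Metropolis chain for sampling from a distribution proportional to given positive weights:

```python
import random

def metropolis_step(x, weights, rng):
    """One step of the Metropolis chain: propose a uniformly random
    state y, and accept it with probability min(1, w(y)/w(x))."""
    y = rng.randrange(len(weights))
    if rng.random() < min(1.0, weights[y] / weights[x]):
        return y
    return x

def run_chain(weights, steps, seed=0):
    """Run the chain from state 0 and return visit counts; for long
    runs the empirical frequencies approach w(x) / sum(w)."""
    rng = random.Random(seed)
    x, counts = 0, [0] * len(weights)
    for _ in range(steps):
        x = metropolis_step(x, weights, rng)
        counts[x] += 1
    return counts
```

The acceptance rule makes the chain reversible with respect to the target distribution (detailed balance), which is the standard argument that the target is stationary.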
Please use the scribe template to produce your notes. The template contains commands for Theorem, Lemma, Corollary, Proposition, Claim, Definition, Construction, Remark, etc. An example file where this template is used is here.
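For illustration, notes written with such a template typically look like the following hypothetical sketch (the actual command names and numbering scheme are defined in the template file itself; the statements below are standard definitions from the course material):

```latex
\documentclass{article}
\usepackage{amsmath, amsthm}

% The scribe template predefines environments along these lines:
\newtheorem{theorem}{Theorem}
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{definition}[theorem]{Definition}

\begin{document}

\begin{definition}
A Markov chain with transition matrix $P$ and stationary distribution
$\pi$ is \emph{reversible} if $\pi(x)P(x,y) = \pi(y)P(y,x)$ for all
states $x,y$.
\end{definition}

\begin{theorem}[Fundamental Theorem of Markov Chains]
A finite, irreducible, aperiodic Markov chain converges to its unique
stationary distribution from any starting state.
\end{theorem}

\end{document}
```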
- Lecture 1 (February 2nd): Introduction to the material. Card Shuffling; the Markov Chain Monte Carlo paradigm; mixing time; sampling to counting; Gibbs distributions; the Ising Model; critical temperatures and phase transitions; Glauber dynamics; phylogenetic reconstruction and the Ising model on the tree. Slides; Slides in no-animation pdf file.
- Lecture 2 (February 7th): Basic Definitions: Irreducibility; Aperiodicity; Stationarity; Reversibility; the Fundamental Theorem of Markov Chains; Lazy Markov Chains. Slides; Slides in no-animation pdf file; scribe notes.
- Lecture 3 (February 9th): Examples of Markov Chains; the Metropolis Process; (algebra-free) proof of the Fundamental Theorem of Markov Chains. Slides (part a of lecture); Slides (part a) in no-animation pdf file; notes (part b of lecture); scribe notes.
- Lecture 4 (February 14th): Bounding Mixing via Coupling; Mixing Time and Rates of Convergence; Transposition Shuffle. Notes; scribe notes.
- Lecture 5 (February 16th): Top-in-at-Random Shuffle; Sharp Cutoffs; Bounding Mixing via Strong Stationary Times; Riffle Shuffle. Notes; scribe notes.
- Lecture 6 (February 22nd): Sampling Graph Colorings; Path Coupling. Notes; scribe notes.
- Lecture 7 (February 24th): Applications of Path Coupling; Sampling Graph Colorings; Sampling Linear Extensions. Notes; scribe notes.
- Lecture 8 (February 28th): Coupling from the past. Notes; scribe notes.
- Lecture 9 (March 2nd): Coupling from the past: applications to monotone settings; the Ising model and the Hardcore model. Notes; scribe notes.
- Lecture 10 (March 7th): Bounding mixing time using multicommodity flows. Notes.
- Lecture 11 (March 9th): Examples of using multicommodity flows to bound mixing. Notes.
- Lecture 12 (March 14th): Flow Encodings for bounding the mixing time; application to sampling matchings. Notes; scribe notes.
- Lecture 13 (March 16th): Analysis of the Jerrum-Sinclair Markov Chain for Sampling Matchings; computing the matching polynomial. Notes; scribe notes.
- Lecture 14 (March 28th): Using Jerrum-Sinclair to count the number of matchings. Notes.
- Lecture 15 (March 30th): The class #P; reducing sampling to counting colorings. Scribes.
- Lecture 16 (April 4th): Self-reducible NP search problems; Equivalence of Counting and Sampling; All-or-nothing Approximability. Notes.
- Lecture 17 (April 6th): Ising model on the 2-dimensional lattice; Spatial to temporal mixing I. Notes.
- Lecture 18 (April 11th): Ising model on the 2-dimensional lattice; Spatial to temporal mixing II. Notes.
- Lecture 19 (April 13th): Introduction to Phylogenetics: Evolution, Phylogenetic trees, Information, Quartet methods. Slides; Supplementary Notes.
- Lecture 20 (April 20th): Tree metrics; trees from tree metrics; Markov Chains on trees; the tree reconstruction problem. Notes.
- Lecture 21 (April 25th): Solving the Tree Reconstruction Problem from Polynomial Sequences; Steel's Conjecture. Notes.
- Lecture 22 (April 27th): Towards Logarithmic Sequences; the Ancestral Reconstruction Problem; Phase Transition of the Ancestral Reconstruction Problem. Notes.
- Lecture 23 (May 4th): Using Ancestral Reconstruction for Tree Reconstruction from short sequences; Steel's Conjecture. Slides.
- Lecture 24 (May 9th): Project Presentations.
- Lecture 25 (May 11th): Project Presentations.
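Several of the lectures above (1, 9, and 17-18) revolve around Glauber dynamics for the Ising model. A minimal simulation sketch in Python might look as follows; the cycle graph and the parameter values are illustrative choices, not taken from the course:

```python
import math
import random

def glauber_step(spins, neighbors, beta, rng):
    """One step of Glauber (heat-bath) dynamics for the Ising model:
    pick a uniformly random site and resample its +/-1 spin from the
    Gibbs conditional distribution given its neighbors' spins."""
    i = rng.randrange(len(spins))
    field = sum(spins[j] for j in neighbors[i])
    # P(spin = +1 | neighbors) under pi(s) ~ exp(beta * sum s_i s_j)
    p_plus = 1.0 / (1.0 + math.exp(-2.0 * beta * field))
    spins[i] = 1 if rng.random() < p_plus else -1

def simulate_cycle(n, beta, steps, seed=0):
    """Run the dynamics on an n-cycle, starting from the all-plus state."""
    rng = random.Random(seed)
    neighbors = [((i - 1) % n, (i + 1) % n) for i in range(n)]
    spins = [1] * n
    for _ in range(steps):
        glauber_step(spins, neighbors, beta, rng)
    return spins
```

At low temperature (large beta) the all-plus configuration is extremely sticky, one face of the slow-mixing and phase-transition phenomena discussed in the statistical-physics lectures; at beta = 0 the chain resamples spins uniformly.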
List of Readings
D. Weitz: Counting independent sets up to the tree threshold.
M. Dyer: Approximate counting by dynamic programming.
F. Martinelli, A. Sinclair, D. Weitz: Glauber dynamics on trees: boundary conditions and mixing time.
E. Mossel, A. Sly: Rapid mixing of Gibbs sampling on graphs that are sparse on average.
Y. Peres, P. Winkler: Censoring Inequality, see chapter 16.4 here.
F. Chen, L. Lovasz, I. Pak: Lifting Markov Chains to Speed up Mixing. (taken by Brian)
L. Trevisan: Approximation algorithms for unique games.
L. Lovasz, M. Simonovits: The mixing rate of Markov chains, an isoperimetric inequality, and computing the volume.
D. Spielman and S. H. Teng: A local clustering algorithm for massive graphs and its applications to nearly-linear time graph partitioning.
R. Khandekar, S. Rao, U. Vazirani: Graph partitioning using single commodity flows.
F. Chierichetti, S. Lattanzi, A. Panconesi: Almost tight bounds for rumor spreading with conductance.
E. Mossel: Phase transitions in phylogeny.
E. Mossel, M. Steel: How much can evolved characters tell us about the tree that generated them?
C. Daskalakis, S. Roch: Alignment-Free Phylogenetic Reconstruction.
C. Daskalakis, E. Mossel, S. Roch: Optimal Phylogenetic Reconstruction. (taken by Maciej)