Speaker | Anima Anandkumar (University of California, Irvine, CA, USA) |
Title | High-Dimensional Graphical Model Selection: Tractable Graph Families
and Regimes |
Abstract |
Capturing complex interactions among a large set of variables is a
challenging task. Probabilistic graphical models or Markov random fields
provide a graph-based framework for capturing such dependencies. Graph
estimation is an important task, since it reveals important relationships
among the variables. I will present a unified view of graph estimation and
propose a simple local algorithm for graph estimation using only low-order
statistics of the data. We establish that the algorithm has consistent graph
estimation with low sample complexity for a class of graphical models
satisfying certain structural and parameter criteria. We explicitly
characterize these model classes and point out interesting relationships
between the graph structure and the parameter regimes, required for
tractable learning. Many graph families such as the classical
Erd{\H o}s--R{\'e}nyi
random graphs, random regular graphs, and the small-world graphs can be
learnt efficiently under our framework.
|
Slides | [.pdf] |
Reading list |
A. Anandkumar, V.Y.F Tan, and A.S. Willsky
High-dimensional structure learning of Ising models: tractable graph families [.pdf] preprint, June 2011. D. Weitz Counting independent sets up to the tree threshold [.pdf] Proc. STOC 2006. A. Anandkumar, V.Y.F Tan, and A.S. Willsky High-dimensional Gaussian graphical model selection: tractable graph families [.pdf] preprint, 2011. |
Speaker | Alexander Barvinok (University of Michigan, Ann Arbor, MI, USA) |
Title | Counting Integer Points in Polyhedra |
Abstract | I'll attempt to review some general approaches to efficient (polynomial time) counting of integer points in polyhedra. In fixed dimension, there are exact methods based on "short rational generating functions." If the dimension is allowed to vary, exact counting is almost always out of the question, and we have to settle for approximate counting, which is often based on probabilistic ideas, although the resulting methods can be deterministic. |
Slides | [.pdf] |
Reading list |
A. Barvinok
The complexity of generating functions for integer points in polyhedra and beyond [.pdf] Proceedings of the International Congress of Mathematicians, 2006. A. Barvinok and J.A. Hartigan Maximum entropy Gaussian approximations for the number of integer points and volumes of polytopes [.pdf] Advances in Applied Mathematics, 2010. |
Speaker | Jin-Yi Cai (University of Wisconsin, Madison, WI, USA) |
Title | Complexity Dichotomy Theorems for Counting Problems |
Abstract |
I will describe several complexity dichotomy theorems for counting
problems. Lov\'asz (1967) defined graph homomorphism to encompass a very
general class of graph properties. The computational problem can be stated
as follows: Given an $m \times m$ symmetric matrix $A$ over the complex
field, compute the function $Z_A(\cdot)$, where for an arbitrary input
graph~$G$,
\[ Z_A(G) = \sum_{\xi:V(G)\rightarrow [m]} \ \prod_{(u,v)\in E(G)}
A_{\xi(u),\xi(v)}.\] Recently a complete dichotomy theorem has been
achieved for this problem. I will give an introduction to this proof.
I also plan to discuss a dichotomy theorem which characterizes all solvable planar Holant and CSP problems over the Boolean domain. This assumes arbitrary symmetric real-valued local constraint functions. The interesting characterization is that, within the stated framework, Valiant's holographic algorithms with matchgates provide a universal methodology to all counting problems which are #P-hard on general graphs, but solvable on planar graphs in polynomial time. (Based on joint work with Xi Chen, Pinyan Lu, and Mingji Xia.) |
Slides | [.pdf] |
Reading list |
J.-Y. Cai, X. Chen, and P. Lu
Graph Homomorphisms with Complex Values: A Dichotomy Theorem [arxiv] arxiv, 2009. J.-Y. Cai, P. Lu, M. Xia Holographic Algorithms with Matchgates Capture Precisely Tractable Planar #CSP [arxiv] arxiv, 2010. |
Speaker | Xi Chen (Columbia University, New York, NY, USA) |
Title | Complexity of Counting CSP with Complex Weights |
Abstract |
We prove a complexity dichotomy theorem for all Counting Constraint
Satisfaction Problems (#CSP) with complex weights. We give three conditions
for tractability. Let $F$ be any finite set of complex functions,
then we show
that the #CSP defined by $F$ is solvable in polynomial time if all three
conditions are satisfied; and is #P-hard otherwise.
Our dichotomy generalizes a long series of important results on counting problems: (a) the problem of counting graph homomorphisms is the special case when there is a single symmetric binary function in $F$, for which a dichotomy is known for all complex functions; (b) the problem of counting directed graph homomorphisms is the special case when there is a single not necessarily symmetric binary function in $F$, for which a dichotomy is only known for non-negative functions before; and (c) the standard form of #CSP is when all functions in $F$ are 0--1 valued, for which a dichotomy is given by Bulatov. Dyer and Richerby gives a simplified proof which also proves the decidability of the dichotomy criterion. In this talk, we will focus on some of the most interesting ingredients of our algorithm, including a new polynomial-time operation over relations that share a common Mal'tsev polymorphism. We will then describe the framework of utilizing this operation to solve any #CSP efficiently when all three tractability conditions are satisfied. We will also discuss the open problem on whether the tractability conditions here are decidable. Notably all the dichotomies in (a--c) are decidable, with (a) being actually polynomial-time decidable and (c) being decidable in NP. (Based on joint work with Jin-Yi Cai.) |
Slides | [.pdf] |
Reading list |
J.-Y. Cai, X. Chen, and P. Lu
Non-negative Weighted #CSPs: An Effective Complexity Dichotomy [arxiv] arxiv, 2010. |
Speaker | Vladimir Chernyak (Wayne State University, Detroit, MI, USA) |
Title | Planar and Surface Graphical Models Which Are Easy |
Abstract |
We describe a family of statistical mechanics models with binary variables
on planar graphs and demonstrate their equivalence to Gaussian Grassmann
Graphical models (free fermions). Calculation of the partition function
(weighted counting) in the models is easy (of polynomial complexity) as
reduced to evaluation of Pfaffians of matrixes linear in the number of
variables. In particular, this family of models covers Holographic
Algorithms of Valiant and extends our previous work on the Gauge
Transformations. We further extend our approach to the general case of
surface graphs and demonstrate that the partition function is given by an
alternating sum of Pfaffians, labeled by the spinor structures on the
embedding Riemann surface. This is achieved by considering the
self-intersection invariant of immersions, and relating the spinor
structures to the equivalence classes of Kasteleyn orientations on the
so-called extended graph, associated with the original graph.
(Based on joint work with Michael Chertkov.) |
Slides | [.ppt] |
Reading list |
V.Y. Chernyak and M. Chertkov
Planar Graphical Models which are Easy [.pdf] J. Stat. Mech., 2010. L.G. Valiant Holographic algorithms [.pdf] SIAM Journal of Computing, 2008. M. Chertkov and V.Y. Chernyak Loop series for discrete statistical models on graphs [.pdf] J. Stat. Mech., 2006. |
Speaker | Michael Chertkov (Los Alamos National Laboratory, Los Alamos, NM, USA) |
Title | Gauge Transformations and Loop Calculus:
General Theory and Applications to Permanents |
Abstract |
In this talk I first review the loop series/calculus approach of exploring
invariance of the partition function defined for a general graphical model
on a finite graph. Gauge transformations will be introduced, and a variety
of proper choices of gauges, in particular these leading to belief
propagation (of Bethe Peierls -- BP) gauges resulting in the loop
series/calculus, will be discussed.
In the second part of the talk, application of the gauge transformation machinery to the problem of computing permanents of a positive matrix will be first motivated, by a particle image velocimetry technique of the experimental fluid mechanics, and then discussed from the theory stand point. Afterwards, I describe an alternative derivation of the loop series/calculus expression for permanents, and related lower and upper bounds associated with BP, mean-field, and interpolating between the two (generalized BP) approximations. |
Slides | [.pdf] |
Reading list |
M. Chertkov and V.Y. Chernyak
Loop series for discrete statistical models on graphs [.pdf] J. Stat. Mech., 2010. M. Chertkov, L. Kroc, F. Krzakala, M. Vergassola, and L. Zdeborova Inference in particle tracking experiments by passing messages between images [.pdf] PNAS, 2010 Y. Watanabe and M. Chertkov Belief propagation and loop calculus for the permanent of a non-negative matrix [.pdf] J. Physics A, 2010. A.B. Yedidia and M. Chertkov Computing the permanent with belief propagation [arxiv] submitted to the Journal of Machine Learning Research, 2011. |
Speaker | Predrag Cvitanovic (Georgia Institute of Technology, Atlanta, GA, USA) |
Title | Dynamical Zeta Functions: What, Why and What
are They Good For? |
Abstract |
Recent advances in the periodic orbit theory are to equal degree inspired by
a century of separate development of three disparate subjects;
|
Slides | [.pdf] |
Reading list |
P. Cvitanovic et al.
Chaos: Classical and Quantum [.html] In particular the chapters:
P. Cvitanovic Tracks, Lie's, and Exceptional Magic [.pdf] Diagrammatic methods in group theory, at a cyclist pace, in 50 overheads. |
Speaker | Martin Dyer (University of Leeds, Leeds, UK) |
Title | On the Complexity of #CSP |
Abstract |
In the counting constraint satisfaction problem (#CSP), we wish to know how
many ways there are to satisfy a given system of constraints, where the
constraints are defined by a fixed set of relations. Andrei Bulatov has shown
that this class of problems has a dichotomy, depending on the form of the
relations. Each problem is either solvable in polynomial time or is
#P-complete, with no intermediate cases. However, his proof is long, and
requires a good understanding of universal algebra. We will describe a
simpler proof of the result, based on succinct representations for relations
in a class which has a polynomial-time decision algorithm. This leads to a new
criterion for the dichotomy, which is different from Bulatov's, but can be
shown to be equivalent to it.
The talk will explain the significance of the result and outline the proof. It will use no universal algebra, and will include a brief introduction to the counting constraint satisfaction problem. We will conclude the talk by sketching a proof that the dichotomy is decidable, answering the major question left open by Bulatov. (Based on joint work with David Richerby.) |
Slides | [.pdf] |
Reading list |
M. Dyer, L.A. Goldberg, and M. Paterson
On counting homomorphisms to directed acyclic graphs [acm] J. ACM 2007. A. Bulatov The complexity of the Counting Constraint Satisfaction Problem [springerlink] ECCC, 2007. M. Dyer and D. Richerby An Effective Dichotomy for the Counting Constraint Satisfaction Problem [arxiv] arxiv, 2011. |
Speaker | G. David Forney, Jr. (Massachusetts Institute of Technology,
Cambridge, MA, USA) |
Title | Codes on Graphs, Normal Realizations,
and Partition Functions |
Abstract |
The subject of "codes on graphs" has arisen from the dramatic success of
capacity-approaching error-correcting codes such as LDPC codes and turbo
codes, which are typically described by graphical models. Graphical models
for powerful codes necessarily contain cycles. Very few general results are
known for codes on graphs with cycles.
The normal graph duality theorem [Forney 2001] is a powerful general duality result for linear (or finite abelian group) codes described by a "normal" graphical model. The normal assumption is actually not at all restrictive. Recently [Al-Bashabsheh and Mao 2011; Forney 2011] this result has been rederived as a corollary of an even more powerful and general result, the normal factor graph duality theorem, which applies to normal realizations of partition functions (functions in sum-of-products form). The proof involves a generalization of the "holographic transformations" of Valiant et al. Further applications of such transformations are suggested in [Forney and Vontobel, 2011]. It is fruitful to view a code described by a graphical model as a realization of a "system" in the sense of behavioral system theory, in which the usual sequential time axis is replaced by a general graph. Recent results [Forney and Gluesing-Luerssen, 2011] on such system-theoretic properties as trimness, properness, minimality, observability, controllablity and local irreducibility will be reported. (Based on joint work with Heide Gluesing-Luerssen, Yongyi Mao, and Pascal O.~Vontobel.) |
Slides | [.pdf] |
Reading list |
G. D. Forney, Jr.
Codes on graphs: Normal realizations, [ieee] IEEE Trans. Inf. Theory, 2001. G. D. Forney, Jr. Codes on graphs: Duality and MacWilliams identities [ieee] IEEE Trans. Inf. Theory, 2011. G. D. Forney, Jr. and P. O. Vontobel Partition functions of normal factor graphs [arxiv] Proc. Information Theory and Applications Workshop, 2011. |
Speaker | David Gamarnik (Massachusetts Institute of Technology,
Cambridge, MA, USA) |
Title | Interpolation Method and Scaling Limits
in Sparse Random Graphs |
Abstract | We will discuss existence of scaling limits in random graphs of the following flavor: given a sparse random graph, what is the limiting value of the largest cut as the size of the graph diverges to infinity? How many independent sets does it have in the limit? We will discuss the interpolation method as a powerful approach for proving the existence of such scaling limits and connections of these questions with the emerging field of graph limits. Finally, we will discuss applications of the interpolation method to Talagrand's conjecture regarding the limiting volume of a set generated by intersecting random subsets with a unit ball in the euclidian space. Using the interpolation method we give a partial resolution of this conjecture. |
Slides | [.ppt] |
Reading list |
M. Bayati, D. Gamarnik, and P. Tetali
Combinatorial approach to the interpolation method and scaling limits in sparse random graphs [arxiv] arxiv, 2011. S. Franz and M. Leone Replica Bounds for optimization problems and diluted spin systems [springerlink] Journal of Statistical Physics, 2003 E. Abbe and A. Montanari On the concentration of the number of solutions of random satisfiability formulas [arxiv] arxiv, 2010. |
Speaker | Leslie Ann Goldberg (University of Liverpool, Liverpool, UK) |
Title | Approximating the Tutte Polynomial
(and the Potts Partition Function) |
Abstract |
The Tutte polynomial of a graph is a two-variable polynomial that encodes
many interesting properties of the graph, including the partition function
of the Potts model. In this talk, I'll describe joint work with Mark
Jerrum, in which we investigate the complexity of approximately evaluating
the polynomial. If I have time, I'll mention some of our more recent work
about approximating the Tutte polynomial of a binary matroid, which has
consequences for the complexity of approximating the weight enumerator of a
linear code.
(Based on joint work with Mark Jerrum.) |
Slides | [.pdf] |
Reading list |
F. Jaeger, D.L. Vertigan, and D.J.A. Welsh
On the computational complexity of the Jones and Tutte polynomials [cambridge] Math. Proc. Cambridge Philosophical Society, 1990. A.D. Sokal The multivariate Tutte polynomial (alias Potts model) for graphs and matroids [arxiv] arxiv, 2005. L.A. Goldberg, M. Jerrum Approximating the partition function of the ferromagnetic Potts model [arxiv] arxiv, 2010 |
Speaker | Leonid Gurvits (Los Alamos National Laboratory, Los Alamos, NM, USA) |
Title | A New, Entries Dependent, Lower Bound on the
Permanent of Doubly Stochastic Matrices |
Abstract |
The best current deterministic poly-time algorithm, due to [Linial,
Samorodnitsky, Wigderson-1998], approximates the permanent
$\operatorname{per}(P)$ of $n
\times n$ non-negative matrix $P$ with relative factor $e^n$. This result had
been generalized to various quantities besides the permanent, notably to the
Mixed Volume. The algorithm and its generalizations rely on "hard"
lower and
"easy" upper bounds on the permanent of doubly stochastic matrices $A \in
\Omega_n$ (in more general setting on the mixed derivative of so called
doubly-stochastic \emph{(H-stable, strongly log-concave, etc.)}
polynomials). In
the permanental case all the currently known lower bounds
(i.e.\ Falikman-Egorychev--Van Der Waerden (full support); Schrijver
(regular
support), L.G.\ (general sparsity pattern)) depend only on the support of
doubly
stochastic matrix $A$.
The following "entries dependent" bound was very recently proved by L.G.: \begin{equation} \label{raz} \frac{\operatorname{per}(A)}{F(A)} \geq 1, \end{equation} where \begin{equation} \label{raz} F(A) := \prod_{1 \leq i,j \leq n} \big( 1- A(i,j) \big)^{1- A(i,j)}, \ A \in \Omega_n. \end{equation} Inequalty (\ref{raz}) is a corollary of a more general entropic bound which was stated without a proof by P.~Vontobel in 2010: \begin{equation} \log\big( \operatorname{per}(A) \big) \geq \!\!\!\! \sum_{1 \leq i,j \leq n} \big( 1- B(i,j) \big) \log\big(1 - B(i,j) \big) - \!\!\!\! \sum_{1 \leq i,j \leq n} B(i,j) \left(\frac{\log(B(i,j)}{\log(A(i,j)} \right); \ A,B \in \Omega_n. \label{tri} \end{equation} Define $\operatorname{WR}(n)$ as maximum of $\frac{\operatorname{per}(A)}{F(A)}$ over doubly stochastic matrices. Clearly, there is a deterministic poly-time algorithm to approximate the permanent $\operatorname{per}(P)$ of $n \times n$ non-negative matrix $P$ with relative factor $\operatorname{WR}(n)$. We conjecture that $\operatorname{WR}(n) \approx (\sqrt{2})^n$. The main evidence for this conjecture is that it holds for adjacency matrices of regular bipartite graphs. Actually, a weaker conjecture, stated in terms of (\ref{tri}) with the same factor, would also give a deterministic poly-time to approximate the permanent $\operatorname{per}(P)$ with the relative factor $(\sqrt{2})^n$. Our proof of (\ref{tri}) is based on the following A.~Schrijver's inequality, which stayed asleep for 13 years: \begin{equation} \label{dva} \operatorname{per} \Big( \big[ A(i,j) \big(1 -A(i,j) \big) \big]_{1 \leq i,j \leq n} \Big) \geq \prod_{1 \leq i,j \leq n} \big( 1- A(i,j) \big). \end{equation} The inequalities (\ref{tri}) and (\ref{dva}) are actually equivalent. The current proof of (\ref{dva}) is extremely complicated. On the other hand, its much more famous corollary, the Schrijver's sharp lower bound on the number of perfect matchings in regular bipartite graphs, has now very short conceptual proof [L.G.,2005--2006--2008], which gives even a sharper bound and goes far beyond the permanent. Yet, it is not clear at all whether the "hyperbolic polynomials approach" can handle (\ref{raz}) ({\it which is in some sense weaker than (\ref{dva}})): elegant, easy understandable proofs of \emph{deep} permanental results are rare. |
Slides | [.pdf] (slides of a related talk) |
Reading list |
L. Gurvits
Hyperbolic polynomials approach to Van der Waerden/Schrijver-Valiant like conjectures: sharper bounds, simpler proofs and algorithmic applications [acm] Proc. STOC 2006. M. Laurent and A. Schrijver On Leonid Gurvits's proof for permanents [jstor] The American Mathematical Monthly, 2010. A. Schrijver Counting 1-factors in regular bipartite graphs [sciencedirect] J. Comb. Theory, Series B, 1998. L. Gurvits Unharnessing the power of Schrijver's permanental inequality [arixv] arxiv, June 2011. |
Speaker | Alexander Ihler (University of California, Irvine, CA, USA) |
Title | Variational Algorithms for Marginal MAP |
Abstract | Marginal MAP is a notoriously difficult "mixed" inference task that includes aspects of both counting (summation) and optimization. We derive a general variational framework for solving marginal MAP problems, in which we apply analogues of the Bethe, tree-reweighted, and mean field approximations. We then derive a "mixed" message passing algorithm and a convergent CCCP alternative to solve the BP-type approximations. Theoretically, we give conditions under which the decoded solution is a global or local optimum, and obtain novel upper bounds on solutions. Experimentally we demonstrate that our algorithms outperform related approaches. We also show that EM and variational EM comprise a special case of our framework. |
Slides | [.pdf] |
Reading list |
Q. Liu and A. Ihler
Variational algorithms for marginal MAP [.pdf] Proc. Uncertainty in Artificial Intelligence, 2011. |
Speaker | Mark Jerrum (Queen Mary, University of London, London, UK) |
Title | Approximating the Partition Function of the
Ferromagnetic Ising Model |
Abstract |
This talk is concerned with the computational complexity of approximate
counting and, more generally, of estimating partition functions and other
weighted sums. The ferromagnetic Ising model will be used as a running
example to illustrate various aspects of the topic, and give a flavour of
the techniques involved. In the positive direction, I'll recall an old
approximation algorithm for the partition function of the ferromagnetic
Ising model with a consistent applied field (i.e., with the same spin
favoured at every site). In contrast, when the field is allowed to act in
opposite senses at different sites, the partition function appears to become
harder to estimate. I'll describe a complexity-theoretic setting in which it
is possible to provide evidence that the Ising model with an inconsistent
field is hard to approximate.
The Ising model is the special case of the Potts model in which $q$, the number of spins, is $2$. In her talk, Leslie Goldberg will extend the investigation described here to the case $q > 2$. (Based on joint work with Leslie Ann Goldberg and Alistair Sinclair.) |
Slides | [.pdf] |
Reading list |
M. Dyer, L.A. Goldberg, C. Greenhill, and M. Jerrum
On the relative complexity of approximate counting problems [springerlink] Algorithmica, 2004. L.A. Goldberg and M. Jerrum The complexity of ferromagnetic Ising with local fields [cambridge] Combinatorics, Probability, and Computing, 2007. |
Speaker | Navin Kashyap (Indian Institute of Science, Bangalore, India) |
Title | The Treewidth of a Linear Code |
Abstract |
The treewidth of a linear code provides a useful parameterization of the
complexity of optimal (maximum-likelihood) decoding for the code. Formally,
the treewidth of a code is the least constraint complexity among any of its
cycle-free realizations, where the constraint complexity of a realization is
defined to be the maximum dimension of the local constraint codes in the
realization. In this talk, we provide the motivation for defining treewidth,
and discuss its connections with closely-related notions of treewidth defined
for graphs and matroids. These connections allow us to infer, for example,
that computing the treewidth of a linear code is NP-hard. Nevertheless,
treewidth can be explicitly computed for specific families of linear codes,
and we give expressions for the treewidth of maximum-distance separable codes
and Reed-Muller codes.
(Based partly on joint work with Andrew Thangaraj.) |
Slides | [.pdf] |
Reading list |
G.D. Forney, Jr.
Codes on graphs: constraint complexity of cycle-free realizations of linear codes [ieee] IEEE Trans. Inf. Theory, July 2003. N. Kashyap On Minimal Tree Realizations of Linear Codes [ieee] IEEE Trans. Inf. Theory, 2009. N. Kashyap and A. Thangaraj The treewidth of MDS and Reed-Muller codes [arxiv] submitted to IEEE Trans. Inf. Theory, 2011. |
Speaker | Martin Loebl (Charles University, Prague, Czech Republic) |
Title | Complexity of Graph Polynomials |
Abstract | I plan first to survey basic results about calculations of the Ising and dimer partition functions of graphs embedded in a 2-dimensional surface. The complexity grows exponentially with the genus of the surface; this can be proven for the Ising partition function and must be true for the dimer partition function (but it is still open). Is there a hope for an interesting result in a higher dimension? Another operation on graphs relevant to the complexity of these partition functions is gluing: it is inspired by the theories of free fermion. Study of complexity of gluing may bring some new insights. |
Slides | [.pdf] |
Reading list |
M. Loebl
Discrete Mathematics in Statistical Physics: Introductory Lectures [googlebooks] Vieweg and Teubner, 2010. P. Rytir Geometric representations of linear codes [arxiv] arxiv, 2010. M. Loebl and G. Masbaum On the optimality of the Arf invariant formula for graph polynomials [arxiv] Adv. Math, 2010. M. Loebl and P. Somberg Discrete Dirac operators, critical embeddings and Ihara-Selberg functions [arxiv] arxiv, 2009. |
Speaker | Pinyan Lu (Microsoft Research, Shanghai, China) |
Title | Approximate Counting via Correlation Decay
in Spin Systems |
Abstract |
We give the first deterministic fully polynomial-time approximation scheme
(FPTAS) for computing the partition function of a two-state spin system on an
arbitrary graph, when the parameters of the system satisfy the uniqueness
condition on infinite regular trees. This condition is of physical
significance and is believed to be the right boundary between approximable and
inapproximable. The FPTAS is based on the correlation decay technique
introduced by Bandyopadhyay and Gamarnik [SODA 06] and Weitz [STOC 06]. The
classic correlation decay is defined with respect to graph distance. Although
this definition has natural physical meanings, it does not directly support an
FPTAS for systems on arbitrary graphs, because for graphs with unbounded
degrees, the local computation that provides a desirable precision by
correlation decay may take super-polynomial time. We introduce a notion of
\emph{computationally efficient correlation decay}, in which the
correlation decay is measured in a refined metric instead of graph
distance. We use a potential method to analyze the amortized behavior of this
correlation decay and establish a correlation decay that guarantees an
inverse-polynomial precision by polynomial-time local computation. This gives
us an FPTAS for spin systems on arbitrary graphs. This new notion of
correlation decay properly reflects the algorithmic aspect of the spin
systems, and may be used for designing FPTAS for other counting problems.
(Based on joint work with Liang Li and Yitong Yin.) |
Slides | [.pptx] |
Reading list |
D. Weitz
Counting independent sets up to the tree threshold [.pdf] Proc. STOC 2006. H. Guo, P. Lu, and L.G. Valiant The Complexity of Symmetric Boolean Parity Holant Problems [.pdf] Proc. ICALP 2006. A. Sly Computational Transition at the Uniqueness Threshold [arxiv] arxiv, 2010. L. Li, P. Lu, and Y. Yin Approximate counting via correlation decay in spin systems [arxiv] arxiv, 2011. |
Speaker | Yongyi Mao (University of Ottawa, Ottawa, ON, Canada) |
Title | Normal Factor Graphs, Linear Algebra and
Probabilistic Modeling |
Abstract | This talk introduces the notion of normal factor graphs under a new defining semantics, namely, the "exterior function" (or "partition function") semantics. I will show that such a graphical representation, generalizing the notion of Trace Diagrams, can be used as an algebraic tool for establishing various linear-algebraic results diagrammatically. In particular, Valiant's Holant theorem for holographic reduction and Forney's duality theorem for codes defined on graphs can be unified in this framework via the notion of holographic transformations. I will also present the use of normal factor graphs as probabilistic models, where existing graphical models such as factor graphs, convolutional factor graphs and cumulative distribution networks can be shown to reduce from normal factor graph models as special cases. |
Slides | [.pdf] |
Reading list |
Y. Mao and F.R. Kschischang
On factor graphs the Fourier transform [ieeexplore] IEEE Trans. Inf. Theory, 2005. J.C. Huang and B.J. Frey Cumulative distribution networks and the derivative-sum-product algorithm [.pdf] Proc. Conference on Uncertainty in Artificial Intelligence, 2008. E. Peterson Unshackling linear algebra from linear notations [arxiv] arxiv, 2009. |
Speaker | Marc Mezard (LPTMS, CNRS & Universite Paris Sud, France) |
Title | Statistical Physics-Based Reconstruction in
Compressed Sensing |
Abstract |
This talk will present a new class of "seeded" compressed sensing
measurement matrices for which a message passing algorithm achieves exact
reconstruction of the signal even at measurement rates very close to the
optimal one. This construction is motivated by a statistical physics study
of the compressed sensing problem and by the theory of crystal nucleation.
(Based on joint work with Florent Krzakala, Francois Sausset, Yi-Fan Sun, and Lenka Zdeborova.) |
Slides | [.pdf] |
Reading list |
F. Krzakala, M. Mezard, F. Sausset, Y. Sun, L. Zdeborova
Statistical physics-based reconstruction in compressed sensing [arxiv] arxiv, 2011. Y. Kabashima, T. Wadayama, and T. Tanaka Statistical mechanical analysis of a typical reconstruction limit of compressed sensing [arxiv] Proc. ISIT 2010. |
Speaker | Mehdi Molkaraie (ETH Zurich, Zurich, Switzerland) |
Title | Monte Carlo Algorithms for the Partition Function of
Two-Dimensional Channels |
Abstract |
We propose Monte Carlo algorithms to estimate the capacity of noiseless
constrained two-dimensional channels and to estimate the information rates
of noisy constrained two-dimensional source/channel models. Both problems
can be reduced to computing a Monte Carlo estimate of the partition function
of graphical models with cycles. The proposed algorithms use tree-based
Gibbs sampling and multilayer (multitemperature) importance sampling. The
viability of the algorithms is demonstrated by numerical results.
(Based on joint work with Hans-Andrea Loeliger.) |
Slides | [.pdf] |
Reading list |
M. Molkaraie and H.-A. Loeliger
Monte Carlo algorithms for the partition function and information rates of two-dimensional channels [arxiv] arxiv, 2011. G. Sabato and M. Molkaraie Generalized belief propagation for the noiseless capacity and information rates of run-length limited constraints [arxiv] arxiv, 2011. |
Speaker | Andrea Montanari (Stanford University, Stanford, CA, USA) |
Title | Sharp Thresholds in Statistical Learning |
Abstract |
Sharp thresholds are ubiquitous high-dimensional combinatorial
structures. The oldest example is probably the sudden emergence of the giant
component in random graphs, first discovered by Erd{\H o}s and R{\'e}nyi.
More
recently, threshold phenomena have started to play an important role in some
statistical learning and statistical signal processing problems, in part
because of the interest in "compressed sensing." The basic setting is one
in which a large number of noisy observations of a high-dimensional object
are made. As the ratio of the number of observations to the number of
"hidden dimensions" crosses a threshold, our ability to reconstruct the
object increases dramatically. I will discuss several examples of this
phenomenon, and some algorithmic and mathematical ideas that allow to
characterize these threshold phenomena.
(Based on joint work with Mohsen Bayati, David Donoho, Iain Johnstone, and Arian Maleki.) |
Slides | [.pdf] |
Reading list | [not available] |
Speaker | Farzad Parvaresh (Hewlett-Packard Laboratories, Palo Alto, CA, USA) |
Title | Asymptotic Enumeration of Binary Matrices with
Bounded Row and Column Weights |
Abstract |
Consider the set $A_n$ of all $n \times n$ binary matrices in which the
number of $1$’s in each row and column is at most $n/2$. We show that the
redundancy, $n^2 - \log_2 |A_n|$, of this set equals $\rho n - \gamma
\sqrt{n} + O(n^{\epsilon})$, where $\epsilon$ is a small constant greater
than zero, $\rho = 1.42515\ldots$, and $\gamma$ is equal to zero for odd $n$
and is equal to $1.46016\ldots$ for even $n$.
(Based on joint work with Erik Ordentlich and Ron M. Roth.) |
Slides | [.pptx] |
Reading list | [not available] |
Speaker | David Poulin (Universite de Sherbrooke, Sherbrooke, QC, Canada) |
Title | Belief propagation in the Quantum World |
Abstract | This talk will present a brief overview of belief propagation in quantum mechanical setting. I will begin with a short review of the quantum formalism. Then, I will present a selection of problems that can be solved by appropriate generalizations of belief propagation. These include the problem of decoding quantum error correction codes and the estimation of partition functions of a many-body quantum system. I will also present variational duals to quantum belief propagation, and time permitting describe a quantum version of the Hammersley-Clifford theorem. |
Slides | [.pdf] |
Reading list |
D. Poulin and M.B. Hastings
Markov entropy decomposition: a variational dual for quantum belief propagation [prl] Phys. Rev. Lett, 2011. M.S. Leifer and D. Poulin Quantum graphical models and belief propagation [sciencedirect] Annals of Physics, 2008. E. Bilgin and D. Poulin Coarse-grained belief propagation for simulation of interacting quantum systems at all temperatures [aps] Phys. Rev. B, 2010. |
Speaker | Moshe Schwartz (Ben-Gurion University, Beer Sheva, Israel) |
Title | Networks of Relations in the Service of
Constrained Coding |
Abstract |
Constrained systems are widely used today in communication and storage
systems. A two-dimensional constrained system is the set of all
two-dimensional binary arrays which obey a certain constraint. A well-known
example of such constraints is the (0,1)-RLL constraint, i.e., no two adjacent
zeros (horizontally or vertically) in the array. This is equivalent to
counting the number of independent sets in the grid graph. When the diagonal
directions are also included this is equivalent to counting arrays of
non-attacking kings. The logarithm of the number of such arrays, divided by
the area of the array as its sides go to infinity, is called the capacity of
the constraint.
We address the well-known problem of determining the capacity of two-dimensional constrained systems. While the one-dimensional case is well understood to the extent that there are techniques for rigorously deriving the exact capacity, in contrast, computing the exact capacity of a two-dimensional constrained system is still an elusive research goal. We present a rigorous technique that yields an exact capacity of a two-dimensional constrained coding system. In addition, we devise an efficient (polynomial time) algorithm for counting the exact number of constrained arrays of any given size. Our approach is a composition of a number of ideas and techniques: describing the capacity problem as a solution to a counting problem in networks of relations, graph-theoretic tools originally developed in the field of statistical mechanics, techniques for efficiently simulating quantum circuits, as well as ideas from the theory related to the spectral distribution of Toeplitz matrices. Using our technique we derive a closed form solution to the capacity related to the Path-Cover constraint in a two-dimensional triangular array (assigning 1's and 0's to the edges of the triangular grid such that every vertex is incident to either one or two 1's). Path-Cover is a generalization of the well known one-dimensional (0,1)-RLL constraint. The talk is self-contained, and a brief survey of constrained coding will be given, as well as a few open problems motivated by storage systems. (Based on joint work with Jehoshua Bruck.) |
Slides | [.pdf] |
Reading list |
P.W. Kasteleyn
The statistics of dimers on a lattice. I. The number of dimer arrangements on a quadratic lattice [sciencedirect] Physica, 1961. L.G. Valiant Holographic algorithms [ieeecomputersociety] Proc. FOCS 2004. |
Speaker | Jinwoo Shin (Georgia Institute of Technology, Atlanta, GA, USA) |
Title | Improved Mixing Condition on the Grid for
Counting and Sampling Independent Sets |
Abstract |
We study the hard-core model defined on independent sets, where each
independent set $I$ in a graph $G$ is weighted proportionally to
$\lambda^{|I|}$, for
a positive real parameter $\lambda$. For large $\lambda$,
computing the partition
function (namely, the normalizing constant which makes the weighting a
probability distribution on a finite graph) on graphs of maximum degree $D
>2$,
is a well known computationally challenging problem. More concretely, let
$\lambda_c(T_D)$ denote the critical value for the so-called uniqueness
threshold of the hard-core model on the infinite $D$-regular tree; recent
breakthrough results of Dror Weitz (2006) and Allan Sly (2010) have identified
$\lambda_c(T_D)$ as a threshold where the hardness of estimating the above
partition function undergoes a computational transition. We focus on the
well-studied particular case of the square lattice, and provide a new lower
bound for the uniqueness threshold, in particular taking it well above
$\lambda_c(T_4)$. Our technique refines and builds on the tree of
self-avoiding
walks approach of Weitz, resulting in a new technical sufficient criterion
(of
wider applicability) for establishing strong spatial mixing (and hence
uniqueness) for the hard-core model. Our new criterion achieves better bounds
on strong spatial mixing when the graph has extra structure, improving upon
what can be achieved by just using the maximum degree. Applying our technique
to the square lattice, we prove that strong spatial mixing holds for all
$\lambda < 2.3882$, improving upon the work of Weitz that held for
$\lambda < 1.6875$. Our results imply a fully-polynomial deterministic
approximation algorithm for estimating the partition function, as well as
rapid mixing of the associated Glauber dynamics to sample from the hard-core
distribution.
(Based on joint work with Ricardo Restrepo, Prasad Tetali, Eric Vigoda, and Linji Yang.) |
Slides | [.pdf] |
Reading list |
D. Weitz
Counting independent sets up to the tree threshold [.pdf] Proc. STOC 2006. R. Restrepo, J. Shin, P. Tetali, E. Vigoda, and L. Yang Improved mixing condition on the grid for counting and sampling independent sets [arxiv] arxiv, 2011. |
Speaker | Alistair Sinclair (University of California, Berkeley, CA, USA) |
Title | Permanents, Determinants, and Commutativity |
Abstract |
The permanent and determinant of a matrix, while syntactically very similar,
are computationally very different: as is well known, there are a number of
efficient algorithms for computing the determinant, but the permanent is
#P-hard and hence presumably intractable. It is a long-standing challenge
in theoretical computer science to explain this phenomenon. In this talk we
focus on one aspect of this question, namely the role played by
commutativity of the matrix entries. (Known algorithms for computing
determinants all apparently rely crucially on commutativity.)
On the one hand, we will show that the ability to compute the determinant over certain non-commutative algebras would lead to a very simple and efficient approximation algorithm for the permanent. This serves as motivation for the second part of the talk, in which we will show that computing the determinant over "most" non-commutative algebras is as hard as computing the permanent exactly. In fact, we will give a dichotomy theorem for efficient computation of the determinant over non-commutative algebras in terms of a structural property of the algebras. (Based on joint work with Steve Chien, Praladh Harsha, Lars Rasmussen, and Srikanth Srinivasan.) |
Slides | [not available yet] |
Reading list |
S. Chien, P. Harsha, A. Sinclair, and S. Srinivasan
Almost settling the hardness of noncommutative determinant [arxiv] Proc. STOC 2011. S. Chien, L. Rasmussen, and A. Sinclair Clifford algebras and approximating the permanent [sciencedirect] J. Computer and Systems Sciences, 2003 |
Speaker | Leslie G. Valiant (Harvard University, Cambridge, MA, USA) |
Title | Holographic Algorithms |
Abstract | We will describe the basic ideas behind holographic algorithms, and how these are used for deriving both positive and negative results about the computational complexity of counting problems. We shall also discuss some background in computational complexity theory that motivates this approach. |
Slides | [not available yet] |
Reading list |
L.G. Valiant
Holographic Algorithms [.pdf] Proc. STOC 2004. L.G. Valiant Accidental Algorithms [.pdf] Proc. FOCS 2006. J.-Y. Cai and P. Lu Holographic algorithms: from art to science [acm] Proc. STOC 2007. |
Speaker | Umesh Vazirani (University of California, Berkeley, CA, USA) |
Title | Quantum Description Complexity |
Abstract | The classical description of $n$-particle quantum systems scales exponentially in $n$. Are there general conditions under which such states can be succinctly described? Prime candidates for this are the ground states of gapped local Hamiltonians. The primary cause for optimism is the long conjectured area law, which bounds the amount of entanglement in such states. I will describe recent progress towards proving the area law, as well as tensor network based techniques for succinctly describing quantum states with limited entanglement. |
Slides | [not available yet] |
Reading list | [not available] |
Speaker | Pascal O. Vontobel (Hewlett-Packard Laboratories, Palo Alto, CA, USA) |
Title | Should We Believe in Numbers Computed by
Loopy Belief Propagation? |
Abstract |
Loopy belief propagation (LBP) has been very popular in the last fifteen years
in the area of coded data transmission (and beyond) because of its low
implementation complexity and its outstanding performance. In this talk, we
discuss an analysis technique that allows one to obtain a better understanding
of why LBP works so well for some problems, yet also has some limitations.
Although this LBP analysis technique is broadly applicable, to be concrete, we present it in a specific context. Namely, we focus on the use of LBP for estimating the sum of weighted perfect matchings in a complete bipartite graph (i.e., permanents), a setup that subsumes many interesting counting problems. Common wisdom would suggest that LBP does not work well in this context because the underlying graph is dense and has many short cycles. However, it turns out that LBP gives a very valuable estimate for this counting problem, and the above-mentioned analysis technique allows us to see why this is so by providing a combinatorial characterization of the LBP-based estimate: on the one hand, this combinatorial characterization gives insights why the LBP-based estimate is useful, on the other hand, it shows why the LBP-based estimate differs in general from the correct value. |
Slides | [.pdf] |
Reading list |
P.O. Vontobel
The Bethe permanent of a non-negative matrix [arxiv] arxiv, 2011. P.O. Vontobel Counting in graph covers: A combinatorial characterization of the Bethe entropy function [arxiv] arxiv, 2010. L. Gurvits Unharnessing the power of Schrijver's permanental inequality [arixv] arxiv, June 2011. |
Speaker | Martin J. Wainwright (University of California, Berkeley, CA, USA) |
Title | Learning in Graphical Models: Missing Data and Rigorous
Guarantees with Non-Convexity |
Abstract |
Given a black box that generates samples from a graphical model, how to
efficiently estimate the structure and parameters of the underlying graph?
This learning problem underlies many applications of graphical models, and
has attracted signficant attention in recent years. As we discuss, when
samples are fully observed without corruption, there are now a number of
neighborhood-based algorithms with attractive practical performance and
theoretical guarantees. We then turn to problems with noisy or partially
missing data, and propose a modified neighborhood-based approach that
involves solving non-convex programs. On the statistical side, we provide
non-asymptotic bounds that hold with high probability for the cases of
noisy, missing, and/or dependent data. On the computational side, we prove
that under the same types of conditions required for statistical
consistency, a simple projected gradient descent algorithm is guaranteed to
converge at a geometric rate to a near-global minimizer. This yields a
polynomial-time algorithm that matches information-theoretic lower
bounds up to constant factors.
(Based on joint work with Po-Ling Loh.) |
Slides | [.pdf] |
Reading list | [not available] |
Speaker | Yair Weiss (Hebrew University, Jerusalem, Israel) |
Title | Convexity: what is it good for? |
Abstract | Perhaps the most exciting recent development in message-passing algorithms on graphs is the discovery of algorithms such as tree-reweighted message passing (TRW) which are based on {\em convex} approximations to the free energy. Both the sum-product version and the max-product version of these algorithms have highly desirable theoretical properties and a fascinating connection to linear programming relaxations. Despite these properties, in many applications practitioners still prefer using the "classical" BP algorithm which uses a {\em non-convex} approximation to the free energy. I will illustrate this disconnect between theory and practice on a number of applications from my lab, and also discuss some recent theoretical work that tries to adapt some of the results on convex algorithms to the classical BP algorithm. |
Slides | [.pdf] |
Reading list | [not available] |
Speaker | Jonathan Yedidia (Disney Research, Cambridge, MA, USA) |
Title | The Alternating Direction Method of Multipliers
as a Message-Passing Algorithm |
Abstract | I will describe how the alternating direction method of multipliers (ADMM) can be used to construct a simple message-passing algorithm to search for the optimal configuration on any factor graph, where the factor graph encodes an optimization problem with continuous and/or discrete variables and with arbitrary cost functions. The algorithm provably converges to the correct answer for all problems where the factor graph has only convex cost functions, and is a powerful heuristic for factor graphs with non-convex costs. For the special case when the cost functions are all hard constraints, the algorithm reduces to the "Divide and Concur" (DC) algorithm. ADMM and DC algorithms use similar memory and data structures compared to belief propagation (BP) algorithms, but they have the important advantage compared to BP that they deal efficiently with continuous variables, and for DC algorithms, one can guarantee that all fixed points of the dynamics are correct solutions, even for non-convex problems. On the other hand ADMM and DC algorithms are not inherently scale-invariant, which is related to their slow convergence for some problems. |
Slides | [.pdf] |
Reading list |
J. Yedidia
Message-passing algorithms for inference and optimization: "Belief Propagation" and "Divide and Concur" [.pdf] subm. to Journal of Statistical Physics, 2011. V. Elser, I. Rankenburg, and P. Thibault Searching with iterated maps [pnas] PNAS 2007. S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein Distributed optimization and statistical learning via the alternating direction method of multipliers [html] Foundations and Trends in Machine Learning, 2011. |