# Supplementary Material for Workshop on «Counting, Inference, and Optimization on Graphs»

#### (Princeton University, November 2–5, 2011)

• Reading list entries are recommended by speakers and/or workshop co-organizers.
• Use tools like google/scholar to obtain alternative links to papers.
• Videos for some of the talks are available at the workshop's main web page.

Speakers:

 Speaker Anima Anandkumar (University of California, Irvine, CA, USA) Title High-Dimensional Graphical Model Selection: Tractable Graph Families and Regimes Abstract Capturing complex interactions among a large set of variables is a challenging task. Probabilistic graphical models or Markov random fields provide a graph-based framework for capturing such dependencies. Graph estimation is an important task, since it reveals important relationships among the variables. I will present a unified view of graph estimation and propose a simple local algorithm for graph estimation using only low-order statistics of the data. We establish that the algorithm has consistent graph estimation with low sample complexity for a class of graphical models satisfying certain structural and parameter criteria. We explicitly characterize these model classes and point out interesting relationships between the graph structure and the parameter regimes, required for tractable learning. Many graph families such as the classical Erd{\H o}s--R{\'e}nyi random graphs, random regular graphs, and the small-world graphs can be learnt efficiently under our framework. Slides [.pdf] Reading list A. Anandkumar, V.Y.F Tan, and A.S. Willsky High-dimensional structure learning of Ising models: tractable graph families [.pdf] preprint, June 2011. D. Weitz Counting independent sets up to the tree threshold [.pdf] Proc. STOC 2006. A. Anandkumar, V.Y.F Tan, and A.S. Willsky High-dimensional Gaussian graphical model selection: tractable graph families [.pdf] preprint, 2011.

 Speaker Alexander Barvinok (University of Michigan, Ann Arbor, MI, USA) Title Counting Integer Points in Polyhedra Abstract I'll attempt to review some general approaches to efficient (polynomial time) counting of integer points in polyhedra. In fixed dimension, there are exact methods based on "short rational generating functions." If the dimension is allowed to vary, exact counting is almost always out of the question, and we have to settle for approximate counting, which is often based on probabilistic ideas, although the resulting methods can be deterministic. Slides [.pdf] Reading list A. Barvinok The complexity of generating functions for integer points in polyhedra and beyond [.pdf] Proceedings of the International Congress of Mathematicians, 2006. A. Barvinok and J.A. Hartigan Maximum entropy Gaussian approximations for the number of integer points and volumes of polytopes [.pdf] Advances in Applied Mathematics, 2010.

 Speaker Jin-Yi Cai (University of Wisconsin, Madison, WI, USA) Title Complexity Dichotomy Theorems for Counting Problems Abstract I will describe several complexity dichotomy theorems for counting problems. Lov\'asz (1967) defined graph homomorphism to encompass a very general class of graph properties. The computational problem can be stated as follows: Given an $m \times m$ symmetric matrix $A$ over the complex field, compute the function $Z_A(\cdot)$, where for an arbitrary input graph~$G$, $Z_A(G) = \sum_{\xi:V(G)\rightarrow [m]} \ \prod_{(u,v)\in E(G)} A_{\xi(u),\xi(v)}.$ Recently a complete dichotomy theorem has been achieved for this problem. I will give an introduction to this proof. I also plan to discuss a dichotomy theorem which characterizes all solvable planar Holant and CSP problems over the Boolean domain. This assumes arbitrary symmetric real-valued local constraint functions. The interesting characterization is that, within the stated framework, Valiant's holographic algorithms with matchgates provide a universal methodology to all counting problems which are #P-hard on general graphs, but solvable on planar graphs in polynomial time. (Based on joint work with Xi Chen, Pinyan Lu, and Mingji Xia.) Slides [.pdf] Reading list J.-Y. Cai, X. Chen, and P. Lu Graph Homomorphisms with Complex Values: A Dichotomy Theorem [arxiv] arxiv, 2009. J.-Y. Cai, P. Lu, M. Xia Holographic Algorithms with Matchgates Capture Precisely Tractable Planar #CSP [arxiv] arxiv, 2010.

 Speaker Xi Chen (Columbia University, New York, NY, USA) Title Complexity of Counting CSP with Complex Weights Abstract We prove a complexity dichotomy theorem for all Counting Constraint Satisfaction Problems (#CSP) with complex weights. We give three conditions for tractability. Let $F$ be any finite set of complex functions, then we show that the #CSP defined by $F$ is solvable in polynomial time if all three conditions are satisfied; and is #P-hard otherwise. Our dichotomy generalizes a long series of important results on counting problems: (a) the problem of counting graph homomorphisms is the special case when there is a single symmetric binary function in $F$, for which a dichotomy is known for all complex functions; (b) the problem of counting directed graph homomorphisms is the special case when there is a single not necessarily symmetric binary function in $F$, for which a dichotomy is only known for non-negative functions before; and (c) the standard form of #CSP is when all functions in $F$ are 0--1 valued, for which a dichotomy is given by Bulatov. Dyer and Richerby gives a simplified proof which also proves the decidability of the dichotomy criterion. In this talk, we will focus on some of the most interesting ingredients of our algorithm, including a new polynomial-time operation over relations that share a common Mal'tsev polymorphism. We will then describe the framework of utilizing this operation to solve any #CSP efficiently when all three tractability conditions are satisfied. We will also discuss the open problem on whether the tractability conditions here are decidable. Notably all the dichotomies in (a--c) are decidable, with (a) being actually polynomial-time decidable and (c) being decidable in NP. (Based on joint work with Jin-Yi Cai.) Slides [.pdf] Reading list J.-Y. Cai, X. Chen, and P. Lu Non-negative Weighted #CSPs: An Effective Complexity Dichotomy [arxiv] arxiv, 2010.

 Speaker Vladimir Chernyak (Wayne State University, Detroit, MI, USA) Title Planar and Surface Graphical Models Which Are Easy Abstract We describe a family of statistical mechanics models with binary variables on planar graphs and demonstrate their equivalence to Gaussian Grassmann Graphical models (free fermions). Calculation of the partition function (weighted counting) in the models is easy (of polynomial complexity) as reduced to evaluation of Pfaffians of matrixes linear in the number of variables. In particular, this family of models covers Holographic Algorithms of Valiant and extends our previous work on the Gauge Transformations. We further extend our approach to the general case of surface graphs and demonstrate that the partition function is given by an alternating sum of Pfaffians, labeled by the spinor structures on the embedding Riemann surface. This is achieved by considering the self-intersection invariant of immersions, and relating the spinor structures to the equivalence classes of Kasteleyn orientations on the so-called extended graph, associated with the original graph. (Based on joint work with Michael Chertkov.) Slides [.ppt] Reading list V.Y. Chernyak and M. Chertkov Planar Graphical Models which are Easy [.pdf] J. Stat. Mech., 2010. L.G. Valiant Holographic algorithms [.pdf] SIAM Journal of Computing, 2008. M. Chertkov and V.Y. Chernyak Loop series for discrete statistical models on graphs [.pdf] J. Stat. Mech., 2006.

 Speaker Michael Chertkov (Los Alamos National Laboratory, Los Alamos, NM, USA) Title Gauge Transformations and Loop Calculus: General Theory and Applications to Permanents Abstract In this talk I first review the loop series/calculus approach of exploring invariance of the partition function defined for a general graphical model on a finite graph. Gauge transformations will be introduced, and a variety of proper choices of gauges, in particular these leading to belief propagation (of Bethe Peierls -- BP) gauges resulting in the loop series/calculus, will be discussed. In the second part of the talk, application of the gauge transformation machinery to the problem of computing permanents of a positive matrix will be first motivated, by a particle image velocimetry technique of the experimental fluid mechanics, and then discussed from the theory stand point. Afterwards, I describe an alternative derivation of the loop series/calculus expression for permanents, and related lower and upper bounds associated with BP, mean-field, and interpolating between the two (generalized BP) approximations. Slides [.pdf] Reading list M. Chertkov and V.Y. Chernyak Loop series for discrete statistical models on graphs [.pdf] J. Stat. Mech., 2010. M. Chertkov, L. Kroc, F. Krzakala, M. Vergassola, and L. Zdeborova Inference in particle tracking experiments by passing messages between images [.pdf] PNAS, 2010 Y. Watanabe and M. Chertkov Belief propagation and loop calculus for the permanent of a non-negative matrix [.pdf] J. Physics A, 2010. A.B. Yedidia and M. Chertkov Computing the permanent with belief propagation [arxiv] submitted to the Journal of Machine Learning Research, 2011.

 Speaker Predrag Cvitanovic (Georgia Institute of Technology, Atlanta, GA, USA) Title Dynamical Zeta Functions: What, Why and What are They Good For? Abstract Recent advances in the periodic orbit theory are to equal degree inspired by a century of separate development of three disparate subjects; classical chaotic dynamics, initiated by Poincare and put on its modern footing by Smale, Ruelle, and others; quantum theory initiated by Bohr, with the modern `chaotic' formulation by Gutzwiller; and analytic number theory initiated by Riemann and formulated as a spectral problem by Selberg. The three separate roads arrive at nearly identical trace formulas, zeta functions, and Fredholm determinants. The connection between dynamics and number theory arises from Selberg's observation that description of geodesic motion and wave mechanics on spaces of constant negative curvature is essentially a number-theoretic problem. In all of the above, the trace formulas play a role for nonlinear geometries analogous to that the Fourier transform plays for the circle: they relate the spectrum of lengths (local dynamics) to the spectrum of eigenvalues (global averages). We introduce the theory by counting distinct trajectories of a chaotic dynamical system (topological zeta function), and then show how to weigh them and sum them in order to predict long-time chaotic averages (dynamical zeta function). Slides [.pdf] Reading list P. Cvitanovic et al. Chaos: Classical and Quantum [.html] In particular the chapters: P. Cvitanovic Tracks, Lie's, and Exceptional Magic [.pdf] Diagrammatic methods in group theory, at a cyclist pace, in 50 overheads.

 Speaker Martin Dyer (University of Leeds, Leeds, UK) Title On the Complexity of #CSP Abstract In the counting constraint satisfaction problem (#CSP), we wish to know how many ways there are to satisfy a given system of constraints, where the constraints are defined by a fixed set of relations. Andrei Bulatov has shown that this class of problems has a dichotomy, depending on the form of the relations. Each problem is either solvable in polynomial time or is #P-complete, with no intermediate cases. However, his proof is long, and requires a good understanding of universal algebra. We will describe a simpler proof of the result, based on succinct representations for relations in a class which has a polynomial-time decision algorithm. This leads to a new criterion for the dichotomy, which is different from Bulatov's, but can be shown to be equivalent to it. The talk will explain the significance of the result and outline the proof. It will use no universal algebra, and will include a brief introduction to the counting constraint satisfaction problem. We will conclude the talk by sketching a proof that the dichotomy is decidable, answering the major question left open by Bulatov. (Based on joint work with David Richerby.) Slides [.pdf] Reading list M. Dyer, L.A. Goldberg, and M. Paterson On counting homomorphisms to directed acyclic graphs [acm] J. ACM 2007. A. Bulatov The complexity of the Counting Constraint Satisfaction Problem [springerlink] ECCC, 2007. M. Dyer and D. Richerby An Effective Dichotomy for the Counting Constraint Satisfaction Problem [arxiv] arxiv, 2011.

 Speaker G. David Forney, Jr. (Massachusetts Institute of Technology, Cambridge, MA, USA) Title Codes on Graphs, Normal Realizations, and Partition Functions Abstract The subject of "codes on graphs" has arisen from the dramatic success of capacity-approaching error-correcting codes such as LDPC codes and turbo codes, which are typically described by graphical models. Graphical models for powerful codes necessarily contain cycles. Very few general results are known for codes on graphs with cycles. The normal graph duality theorem [Forney 2001] is a powerful general duality result for linear (or finite abelian group) codes described by a "normal" graphical model. The normal assumption is actually not at all restrictive. Recently [Al-Bashabsheh and Mao 2011; Forney 2011] this result has been rederived as a corollary of an even more powerful and general result, the normal factor graph duality theorem, which applies to normal realizations of partition functions (functions in sum-of-products form). The proof involves a generalization of the "holographic transformations" of Valiant et al. Further applications of such transformations are suggested in [Forney and Vontobel, 2011]. It is fruitful to view a code described by a graphical model as a realization of a "system" in the sense of behavioral system theory, in which the usual sequential time axis is replaced by a general graph. Recent results [Forney and Gluesing-Luerssen, 2011] on such system-theoretic properties as trimness, properness, minimality, observability, controllablity and local irreducibility will be reported. (Based on joint work with Heide Gluesing-Luerssen, Yongyi Mao, and Pascal O.~Vontobel.) Slides [.pdf] Reading list G. D. Forney, Jr. Codes on graphs: Normal realizations, [ieee] IEEE Trans. Inf. Theory, 2001. G. D. Forney, Jr. Codes on graphs: Duality and MacWilliams identities [ieee] IEEE Trans. Inf. Theory, 2011. G. D. Forney, Jr. and P. O. Vontobel Partition functions of normal factor graphs [arxiv] Proc. Information Theory and Applications Workshop, 2011.

 Speaker David Gamarnik (Massachusetts Institute of Technology, Cambridge, MA, USA) Title Interpolation Method and Scaling Limits in Sparse Random Graphs Abstract We will discuss existence of scaling limits in random graphs of the following flavor: given a sparse random graph, what is the limiting value of the largest cut as the size of the graph diverges to infinity? How many independent sets does it have in the limit? We will discuss the interpolation method as a powerful approach for proving the existence of such scaling limits and connections of these questions with the emerging field of graph limits. Finally, we will discuss applications of the interpolation method to Talagrand's conjecture regarding the limiting volume of a set generated by intersecting random subsets with a unit ball in the euclidian space. Using the interpolation method we give a partial resolution of this conjecture. Slides [.ppt] Reading list M. Bayati, D. Gamarnik, and P. Tetali Combinatorial approach to the interpolation method and scaling limits in sparse random graphs [arxiv] arxiv, 2011. S. Franz and M. Leone Replica Bounds for optimization problems and diluted spin systems [springerlink] Journal of Statistical Physics, 2003 E. Abbe and A. Montanari On the concentration of the number of solutions of random satisfiability formulas [arxiv] arxiv, 2010.

 Speaker Leslie Ann Goldberg (University of Liverpool, Liverpool, UK) Title Approximating the Tutte Polynomial (and the Potts Partition Function) Abstract The Tutte polynomial of a graph is a two-variable polynomial that encodes many interesting properties of the graph, including the partition function of the Potts model. In this talk, I'll describe joint work with Mark Jerrum, in which we investigate the complexity of approximately evaluating the polynomial. If I have time, I'll mention some of our more recent work about approximating the Tutte polynomial of a binary matroid, which has consequences for the complexity of approximating the weight enumerator of a linear code. (Based on joint work with Mark Jerrum.) Slides [.pdf] Reading list F. Jaeger, D.L. Vertigan, and D.J.A. Welsh On the computational complexity of the Jones and Tutte polynomials [cambridge] Math. Proc. Cambridge Philosophical Society, 1990. A.D. Sokal The multivariate Tutte polynomial (alias Potts model) for graphs and matroids [arxiv] arxiv, 2005. L.A. Goldberg, M. Jerrum Approximating the partition function of the ferromagnetic Potts model [arxiv] arxiv, 2010

 Speaker Leonid Gurvits (Los Alamos National Laboratory, Los Alamos, NM, USA) Title A New, Entries Dependent, Lower Bound on the Permanent of Doubly Stochastic Matrices Abstract The best current deterministic poly-time algorithm, due to [Linial, Samorodnitsky, Wigderson-1998], approximates the permanent $\operatorname{per}(P)$ of $n \times n$ non-negative matrix $P$ with relative factor $e^n$. This result had been generalized to various quantities besides the permanent, notably to the Mixed Volume. The algorithm and its generalizations rely on "hard" lower and "easy" upper bounds on the permanent of doubly stochastic matrices $A \in \Omega_n$ (in more general setting on the mixed derivative of so called doubly-stochastic \emph{(H-stable, strongly log-concave, etc.)} polynomials). In the permanental case all the currently known lower bounds (i.e.\ Falikman-Egorychev--Van Der Waerden (full support); Schrijver (regular support), L.G.\ (general sparsity pattern)) depend only on the support of doubly stochastic matrix $A$. The following "entries dependent" bound was very recently proved by L.G.: $$\label{raz} \frac{\operatorname{per}(A)}{F(A)} \geq 1,$$ where $$\label{raz} F(A) := \prod_{1 \leq i,j \leq n} \big( 1- A(i,j) \big)^{1- A(i,j)}, \ A \in \Omega_n.$$ Inequalty (\ref{raz}) is a corollary of a more general entropic bound which was stated without a proof by P.~Vontobel in 2010: $$\log\big( \operatorname{per}(A) \big) \geq \!\!\!\! \sum_{1 \leq i,j \leq n} \big( 1- B(i,j) \big) \log\big(1 - B(i,j) \big) - \!\!\!\! \sum_{1 \leq i,j \leq n} B(i,j) \left(\frac{\log(B(i,j)}{\log(A(i,j)} \right); \ A,B \in \Omega_n. \label{tri}$$ Define $\operatorname{WR}(n)$ as maximum of $\frac{\operatorname{per}(A)}{F(A)}$ over doubly stochastic matrices. Clearly, there is a deterministic poly-time algorithm to approximate the permanent $\operatorname{per}(P)$ of $n \times n$ non-negative matrix $P$ with relative factor $\operatorname{WR}(n)$. We conjecture that $\operatorname{WR}(n) \approx (\sqrt{2})^n$. The main evidence for this conjecture is that it holds for adjacency matrices of regular bipartite graphs. Actually, a weaker conjecture, stated in terms of (\ref{tri}) with the same factor, would also give a deterministic poly-time to approximate the permanent $\operatorname{per}(P)$ with the relative factor $(\sqrt{2})^n$. Our proof of (\ref{tri}) is based on the following A.~Schrijver's inequality, which stayed asleep for 13 years: $$\label{dva} \operatorname{per} \Big( \big[ A(i,j) \big(1 -A(i,j) \big) \big]_{1 \leq i,j \leq n} \Big) \geq \prod_{1 \leq i,j \leq n} \big( 1- A(i,j) \big).$$ The inequalities (\ref{tri}) and (\ref{dva}) are actually equivalent. The current proof of (\ref{dva}) is extremely complicated. On the other hand, its much more famous corollary, the Schrijver's sharp lower bound on the number of perfect matchings in regular bipartite graphs, has now very short conceptual proof [L.G.,2005--2006--2008], which gives even a sharper bound and goes far beyond the permanent. Yet, it is not clear at all whether the "hyperbolic polynomials approach" can handle (\ref{raz}) ({\it which is in some sense weaker than (\ref{dva}})): elegant, easy understandable proofs of \emph{deep} permanental results are rare. Slides [.pdf] (slides of a related talk) Reading list L. Gurvits Hyperbolic polynomials approach to Van der Waerden/Schrijver-Valiant like conjectures: sharper bounds, simpler proofs and algorithmic applications [acm] Proc. STOC 2006. M. Laurent and A. Schrijver On Leonid Gurvits's proof for permanents [jstor] The American Mathematical Monthly, 2010. A. Schrijver Counting 1-factors in regular bipartite graphs [sciencedirect] J. Comb. Theory, Series B, 1998. L. Gurvits Unharnessing the power of Schrijver's permanental inequality [arixv] arxiv, June 2011.

 Speaker Alexander Ihler (University of California, Irvine, CA, USA) Title Variational Algorithms for Marginal MAP Abstract Marginal MAP is a notoriously difficult "mixed" inference task that includes aspects of both counting (summation) and optimization. We derive a general variational framework for solving marginal MAP problems, in which we apply analogues of the Bethe, tree-reweighted, and mean field approximations. We then derive a "mixed" message passing algorithm and a convergent CCCP alternative to solve the BP-type approximations. Theoretically, we give conditions under which the decoded solution is a global or local optimum, and obtain novel upper bounds on solutions. Experimentally we demonstrate that our algorithms outperform related approaches. We also show that EM and variational EM comprise a special case of our framework. Slides [.pdf] Reading list Q. Liu and A. Ihler Variational algorithms for marginal MAP [.pdf] Proc. Uncertainty in Artificial Intelligence, 2011.

 Speaker Mark Jerrum (Queen Mary, University of London, London, UK) Title Approximating the Partition Function of the Ferromagnetic Ising Model Abstract This talk is concerned with the computational complexity of approximate counting and, more generally, of estimating partition functions and other weighted sums. The ferromagnetic Ising model will be used as a running example to illustrate various aspects of the topic, and give a flavour of the techniques involved. In the positive direction, I'll recall an old approximation algorithm for the partition function of the ferromagnetic Ising model with a consistent applied field (i.e., with the same spin favoured at every site). In contrast, when the field is allowed to act in opposite senses at different sites, the partition function appears to become harder to estimate. I'll describe a complexity-theoretic setting in which it is possible to provide evidence that the Ising model with an inconsistent field is hard to approximate. The Ising model is the special case of the Potts model in which $q$, the number of spins, is $2$. In her talk, Leslie Goldberg will extend the investigation described here to the case $q > 2$. (Based on joint work with Leslie Ann Goldberg and Alistair Sinclair.) Slides [.pdf] Reading list M. Dyer, L.A. Goldberg, C. Greenhill, and M. Jerrum On the relative complexity of approximate counting problems [springerlink] Algorithmica, 2004. L.A. Goldberg and M. Jerrum The complexity of ferromagnetic Ising with local fields [cambridge] Combinatorics, Probability, and Computing, 2007.

 Speaker Navin Kashyap (Indian Institute of Science, Bangalore, India) Title The Treewidth of a Linear Code Abstract The treewidth of a linear code provides a useful parameterization of the complexity of optimal (maximum-likelihood) decoding for the code. Formally, the treewidth of a code is the least constraint complexity among any of its cycle-free realizations, where the constraint complexity of a realization is defined to be the maximum dimension of the local constraint codes in the realization. In this talk, we provide the motivation for defining treewidth, and discuss its connections with closely-related notions of treewidth defined for graphs and matroids. These connections allow us to infer, for example, that computing the treewidth of a linear code is NP-hard. Nevertheless, treewidth can be explicitly computed for specific families of linear codes, and we give expressions for the treewidth of maximum-distance separable codes and Reed-Muller codes. (Based partly on joint work with Andrew Thangaraj.) Slides [.pdf] Reading list G.D. Forney, Jr. Codes on graphs: constraint complexity of cycle-free realizations of linear codes [ieee] IEEE Trans. Inf. Theory, July 2003. N. Kashyap On Minimal Tree Realizations of Linear Codes [ieee] IEEE Trans. Inf. Theory, 2009. N. Kashyap and A. Thangaraj The treewidth of MDS and Reed-Muller codes [arxiv] submitted to IEEE Trans. Inf. Theory, 2011.

 Speaker Martin Loebl (Charles University, Prague, Czech Republic) Title Complexity of Graph Polynomials Abstract I plan first to survey basic results about calculations of the Ising and dimer partition functions of graphs embedded in a 2-dimensional surface. The complexity grows exponentially with the genus of the surface; this can be proven for the Ising partition function and must be true for the dimer partition function (but it is still open). Is there a hope for an interesting result in a higher dimension? Another operation on graphs relevant to the complexity of these partition functions is gluing: it is inspired by the theories of free fermion. Study of complexity of gluing may bring some new insights. Slides [.pdf] Reading list M. Loebl Discrete Mathematics in Statistical Physics: Introductory Lectures [googlebooks] Vieweg and Teubner, 2010. P. Rytir Geometric representations of linear codes [arxiv] arxiv, 2010. M. Loebl and G. Masbaum On the optimality of the Arf invariant formula for graph polynomials [arxiv] Adv. Math, 2010. M. Loebl and P. Somberg Discrete Dirac operators, critical embeddings and Ihara-Selberg functions [arxiv] arxiv, 2009.

 Speaker Pinyan Lu (Microsoft Research, Shanghai, China) Title Approximate Counting via Correlation Decay in Spin Systems Abstract We give the first deterministic fully polynomial-time approximation scheme (FPTAS) for computing the partition function of a two-state spin system on an arbitrary graph, when the parameters of the system satisfy the uniqueness condition on infinite regular trees. This condition is of physical significance and is believed to be the right boundary between approximable and inapproximable. The FPTAS is based on the correlation decay technique introduced by Bandyopadhyay and Gamarnik [SODA 06] and Weitz [STOC 06]. The classic correlation decay is defined with respect to graph distance. Although this definition has natural physical meanings, it does not directly support an FPTAS for systems on arbitrary graphs, because for graphs with unbounded degrees, the local computation that provides a desirable precision by correlation decay may take super-polynomial time. We introduce a notion of \emph{computationally efficient correlation decay}, in which the correlation decay is measured in a refined metric instead of graph distance. We use a potential method to analyze the amortized behavior of this correlation decay and establish a correlation decay that guarantees an inverse-polynomial precision by polynomial-time local computation. This gives us an FPTAS for spin systems on arbitrary graphs. This new notion of correlation decay properly reflects the algorithmic aspect of the spin systems, and may be used for designing FPTAS for other counting problems. (Based on joint work with Liang Li and Yitong Yin.) Slides [.pptx] Reading list D. Weitz Counting independent sets up to the tree threshold [.pdf] Proc. STOC 2006. H. Guo, P. Lu, and L.G. Valiant The Complexity of Symmetric Boolean Parity Holant Problems [.pdf] Proc. ICALP 2006. A. Sly Computational Transition at the Uniqueness Threshold [arxiv] arxiv, 2010. L. Li, P. Lu, and Y. Yin Approximate counting via correlation decay in spin systems [arxiv] arxiv, 2011.

 Speaker Yongyi Mao (University of Ottawa, Ottawa, ON, Canada) Title Normal Factor Graphs, Linear Algebra and Probabilistic Modeling Abstract This talk introduces the notion of normal factor graphs under a new defining semantics, namely, the "exterior function" (or "partition function") semantics. I will show that such a graphical representation, generalizing the notion of Trace Diagrams, can be used as an algebraic tool for establishing various linear-algebraic results diagrammatically. In particular, Valiant's Holant theorem for holographic reduction and Forney's duality theorem for codes defined on graphs can be unified in this framework via the notion of holographic transformations. I will also present the use of normal factor graphs as probabilistic models, where existing graphical models such as factor graphs, convolutional factor graphs and cumulative distribution networks can be shown to reduce from normal factor graph models as special cases. Slides [.pdf] Reading list Y. Mao and F.R. Kschischang On factor graphs the Fourier transform [ieeexplore] IEEE Trans. Inf. Theory, 2005. J.C. Huang and B.J. Frey Cumulative distribution networks and the derivative-sum-product algorithm [.pdf] Proc. Conference on Uncertainty in Artificial Intelligence, 2008. E. Peterson Unshackling linear algebra from linear notations [arxiv] arxiv, 2009.

 Speaker Marc Mezard (LPTMS, CNRS & Universite Paris Sud, France) Title Statistical Physics-Based Reconstruction in Compressed Sensing Abstract This talk will present a new class of "seeded" compressed sensing measurement matrices for which a message passing algorithm achieves exact reconstruction of the signal even at measurement rates very close to the optimal one. This construction is motivated by a statistical physics study of the compressed sensing problem and by the theory of crystal nucleation. (Based on joint work with Florent Krzakala, Francois Sausset, Yi-Fan Sun, and Lenka Zdeborova.) Slides [.pdf] Reading list F. Krzakala, M. Mezard, F. Sausset, Y. Sun, L. Zdeborova Statistical physics-based reconstruction in compressed sensing [arxiv] arxiv, 2011. Y. Kabashima, T. Wadayama, and T. Tanaka Statistical mechanical analysis of a typical reconstruction limit of compressed sensing [arxiv] Proc. ISIT 2010.

 Speaker Mehdi Molkaraie (ETH Zurich, Zurich, Switzerland) Title Monte Carlo Algorithms for the Partition Function of Two-Dimensional Channels Abstract We propose Monte Carlo algorithms to estimate the capacity of noiseless constrained two-dimensional channels and to estimate the information rates of noisy constrained two-dimensional source/channel models. Both problems can be reduced to computing a Monte Carlo estimate of the partition function of graphical models with cycles. The proposed algorithms use tree-based Gibbs sampling and multilayer (multitemperature) importance sampling. The viability of the algorithms is demonstrated by numerical results. (Based on joint work with Hans-Andrea Loeliger.) Slides [.pdf] Reading list M. Molkaraie and H.-A. Loeliger Monte Carlo algorithms for the partition function and information rates of two-dimensional channels [arxiv] arxiv, 2011. G. Sabato and M. Molkaraie Generalized belief propagation for the noiseless capacity and information rates of run-length limited constraints [arxiv] arxiv, 2011.

 Speaker Andrea Montanari (Stanford University, Stanford, CA, USA) Title Sharp Thresholds in Statistical Learning Abstract Sharp thresholds are ubiquitous high-dimensional combinatorial structures. The oldest example is probably the sudden emergence of the giant component in random graphs, first discovered by Erd{\H o}s and R{\'e}nyi. More recently, threshold phenomena have started to play an important role in some statistical learning and statistical signal processing problems, in part because of the interest in "compressed sensing." The basic setting is one in which a large number of noisy observations of a high-dimensional object are made. As the ratio of the number of observations to the number of "hidden dimensions" crosses a threshold, our ability to reconstruct the object increases dramatically. I will discuss several examples of this phenomenon, and some algorithmic and mathematical ideas that allow to characterize these threshold phenomena. (Based on joint work with Mohsen Bayati, David Donoho, Iain Johnstone, and Arian Maleki.) Slides [.pdf] Reading list [not available]

 Speaker Farzad Parvaresh (Hewlett-Packard Laboratories, Palo Alto, CA, USA) Title Asymptotic Enumeration of Binary Matrices with Bounded Row and Column Weights Abstract Consider the set $A_n$ of all $n \times n$ binary matrices in which the number of $1$’s in each row and column is at most $n/2$. We show that the redundancy, $n^2 - \log_2 |A_n|$, of this set equals $\rho n - \gamma \sqrt{n} + O(n^{\epsilon})$, where $\epsilon$ is a small constant greater than zero, $\rho = 1.42515\ldots$, and $\gamma$ is equal to zero for odd $n$ and is equal to $1.46016\ldots$ for even $n$. (Based on joint work with Erik Ordentlich and Ron M. Roth.) Slides [.pptx] Reading list [not available]

 Speaker David Poulin (Universite de Sherbrooke, Sherbrooke, QC, Canada) Title Belief propagation in the Quantum World Abstract This talk will present a brief overview of belief propagation in quantum mechanical setting. I will begin with a short review of the quantum formalism. Then, I will present a selection of problems that can be solved by appropriate generalizations of belief propagation. These include the problem of decoding quantum error correction codes and the estimation of partition functions of a many-body quantum system. I will also present variational duals to quantum belief propagation, and time permitting describe a quantum version of the Hammersley-Clifford theorem. Slides [.pdf] Reading list D. Poulin and M.B. Hastings Markov entropy decomposition: a variational dual for quantum belief propagation [prl] Phys. Rev. Lett, 2011. M.S. Leifer and D. Poulin Quantum graphical models and belief propagation [sciencedirect] Annals of Physics, 2008. E. Bilgin and D. Poulin Coarse-grained belief propagation for simulation of interacting quantum systems at all temperatures [aps] Phys. Rev. B, 2010.

 Speaker Moshe Schwartz (Ben-Gurion University, Beer Sheva, Israel) Title Networks of Relations in the Service of Constrained Coding Abstract Constrained systems are widely used today in communication and storage systems. A two-dimensional constrained system is the set of all two-dimensional binary arrays which obey a certain constraint. A well-known example of such constraints is the (0,1)-RLL constraint, i.e., no two adjacent zeros (horizontally or vertically) in the array. This is equivalent to counting the number of independent sets in the grid graph. When the diagonal directions are also included this is equivalent to counting arrays of non-attacking kings. The logarithm of the number of such arrays, divided by the area of the array as its sides go to infinity, is called the capacity of the constraint. We address the well-known problem of determining the capacity of two-dimensional constrained systems. While the one-dimensional case is well understood to the extent that there are techniques for rigorously deriving the exact capacity, in contrast, computing the exact capacity of a two-dimensional constrained system is still an elusive research goal. We present a rigorous technique that yields an exact capacity of a two-dimensional constrained coding system. In addition, we devise an efficient (polynomial time) algorithm for counting the exact number of constrained arrays of any given size. Our approach is a composition of a number of ideas and techniques: describing the capacity problem as a solution to a counting problem in networks of relations, graph-theoretic tools originally developed in the field of statistical mechanics, techniques for efficiently simulating quantum circuits, as well as ideas from the theory related to the spectral distribution of Toeplitz matrices. Using our technique we derive a closed form solution to the capacity related to the Path-Cover constraint in a two-dimensional triangular array (assigning 1's and 0's to the edges of the triangular grid such that every vertex is incident to either one or two 1's). Path-Cover is a generalization of the well known one-dimensional (0,1)-RLL constraint. The talk is self-contained, and a brief survey of constrained coding will be given, as well as a few open problems motivated by storage systems. (Based on joint work with Jehoshua Bruck.) Slides [.pdf] Reading list P.W. Kasteleyn The statistics of dimers on a lattice. I. The number of dimer arrangements on a quadratic lattice [sciencedirect] Physica, 1961. L.G. Valiant Holographic algorithms [ieeecomputersociety] Proc. FOCS 2004.

 Speaker Jinwoo Shin (Georgia Institute of Technology, Atlanta, GA, USA) Title Improved Mixing Condition on the Grid for Counting and Sampling Independent Sets Abstract We study the hard-core model defined on independent sets, where each independent set $I$ in a graph $G$ is weighted proportionally to $\lambda^{|I|}$, for a positive real parameter $\lambda$. For large $\lambda$, computing the partition function (namely, the normalizing constant which makes the weighting a probability distribution on a finite graph) on graphs of maximum degree $D >2$, is a well known computationally challenging problem. More concretely, let $\lambda_c(T_D)$ denote the critical value for the so-called uniqueness threshold of the hard-core model on the infinite $D$-regular tree; recent breakthrough results of Dror Weitz (2006) and Allan Sly (2010) have identified $\lambda_c(T_D)$ as a threshold where the hardness of estimating the above partition function undergoes a computational transition. We focus on the well-studied particular case of the square lattice, and provide a new lower bound for the uniqueness threshold, in particular taking it well above $\lambda_c(T_4)$. Our technique refines and builds on the tree of self-avoiding walks approach of Weitz, resulting in a new technical sufficient criterion (of wider applicability) for establishing strong spatial mixing (and hence uniqueness) for the hard-core model. Our new criterion achieves better bounds on strong spatial mixing when the graph has extra structure, improving upon what can be achieved by just using the maximum degree. Applying our technique to the square lattice, we prove that strong spatial mixing holds for all $\lambda < 2.3882$, improving upon the work of Weitz that held for $\lambda < 1.6875$. Our results imply a fully-polynomial deterministic approximation algorithm for estimating the partition function, as well as rapid mixing of the associated Glauber dynamics to sample from the hard-core distribution. (Based on joint work with Ricardo Restrepo, Prasad Tetali, Eric Vigoda, and Linji Yang.) Slides [.pdf] Reading list D. Weitz Counting independent sets up to the tree threshold [.pdf] Proc. STOC 2006. R. Restrepo, J. Shin, P. Tetali, E. Vigoda, and L. Yang Improved mixing condition on the grid for counting and sampling independent sets [arxiv] arxiv, 2011.

 Speaker Alistair Sinclair (University of California, Berkeley, CA, USA) Title Permanents, Determinants, and Commutativity Abstract The permanent and determinant of a matrix, while syntactically very similar, are computationally very different: as is well known, there are a number of efficient algorithms for computing the determinant, but the permanent is #P-hard and hence presumably intractable. It is a long-standing challenge in theoretical computer science to explain this phenomenon. In this talk we focus on one aspect of this question, namely the role played by commutativity of the matrix entries. (Known algorithms for computing determinants all apparently rely crucially on commutativity.) On the one hand, we will show that the ability to compute the determinant over certain non-commutative algebras would lead to a very simple and efficient approximation algorithm for the permanent. This serves as motivation for the second part of the talk, in which we will show that computing the determinant over "most" non-commutative algebras is as hard as computing the permanent exactly. In fact, we will give a dichotomy theorem for efficient computation of the determinant over non-commutative algebras in terms of a structural property of the algebras. (Based on joint work with Steve Chien, Praladh Harsha, Lars Rasmussen, and Srikanth Srinivasan.) Slides [not available yet] Reading list S. Chien, P. Harsha, A. Sinclair, and S. Srinivasan Almost settling the hardness of noncommutative determinant [arxiv] Proc. STOC 2011. S. Chien, L. Rasmussen, and A. Sinclair Clifford algebras and approximating the permanent [sciencedirect] J. Computer and Systems Sciences, 2003

 Speaker Leslie G. Valiant (Harvard University, Cambridge, MA, USA) Title Holographic Algorithms Abstract We will describe the basic ideas behind holographic algorithms, and how these are used for deriving both positive and negative results about the computational complexity of counting problems. We shall also discuss some background in computational complexity theory that motivates this approach. Slides [not available yet] Reading list L.G. Valiant Holographic Algorithms [.pdf] Proc. STOC 2004. L.G. Valiant Accidental Algorithms [.pdf] Proc. FOCS 2006. J.-Y. Cai and P. Lu Holographic algorithms: from art to science [acm] Proc. STOC 2007.

 Speaker Umesh Vazirani (University of California, Berkeley, CA, USA) Title Quantum Description Complexity Abstract The classical description of $n$-particle quantum systems scales exponentially in $n$. Are there general conditions under which such states can be succinctly described? Prime candidates for this are the ground states of gapped local Hamiltonians. The primary cause for optimism is the long conjectured area law, which bounds the amount of entanglement in such states. I will describe recent progress towards proving the area law, as well as tensor network based techniques for succinctly describing quantum states with limited entanglement. Slides [not available yet] Reading list [not available]

 Speaker Pascal O. Vontobel (Hewlett-Packard Laboratories, Palo Alto, CA, USA) Title Should We Believe in Numbers Computed by Loopy Belief Propagation? Abstract Loopy belief propagation (LBP) has been very popular in the last fifteen years in the area of coded data transmission (and beyond) because of its low implementation complexity and its outstanding performance. In this talk, we discuss an analysis technique that allows one to obtain a better understanding of why LBP works so well for some problems, yet also has some limitations. Although this LBP analysis technique is broadly applicable, to be concrete, we present it in a specific context. Namely, we focus on the use of LBP for estimating the sum of weighted perfect matchings in a complete bipartite graph (i.e., permanents), a setup that subsumes many interesting counting problems. Common wisdom would suggest that LBP does not work well in this context because the underlying graph is dense and has many short cycles. However, it turns out that LBP gives a very valuable estimate for this counting problem, and the above-mentioned analysis technique allows us to see why this is so by providing a combinatorial characterization of the LBP-based estimate: on the one hand, this combinatorial characterization gives insights why the LBP-based estimate is useful, on the other hand, it shows why the LBP-based estimate differs in general from the correct value. Slides [.pdf] Reading list P.O. Vontobel The Bethe permanent of a non-negative matrix [arxiv] arxiv, 2011. P.O. Vontobel Counting in graph covers: A combinatorial characterization of the Bethe entropy function [arxiv] arxiv, 2010. L. Gurvits Unharnessing the power of Schrijver's permanental inequality [arixv] arxiv, June 2011.

 Speaker Martin J. Wainwright (University of California, Berkeley, CA, USA) Title Learning in Graphical Models: Missing Data and Rigorous Guarantees with Non-Convexity Abstract Given a black box that generates samples from a graphical model, how to efficiently estimate the structure and parameters of the underlying graph? This learning problem underlies many applications of graphical models, and has attracted signficant attention in recent years. As we discuss, when samples are fully observed without corruption, there are now a number of neighborhood-based algorithms with attractive practical performance and theoretical guarantees. We then turn to problems with noisy or partially missing data, and propose a modified neighborhood-based approach that involves solving non-convex programs. On the statistical side, we provide non-asymptotic bounds that hold with high probability for the cases of noisy, missing, and/or dependent data. On the computational side, we prove that under the same types of conditions required for statistical consistency, a simple projected gradient descent algorithm is guaranteed to converge at a geometric rate to a near-global minimizer. This yields a polynomial-time algorithm that matches information-theoretic lower bounds up to constant factors. (Based on joint work with Po-Ling Loh.) Slides [.pdf] Reading list [not available]

 Speaker Yair Weiss (Hebrew University, Jerusalem, Israel) Title Convexity: what is it good for? Abstract Perhaps the most exciting recent development in message-passing algorithms on graphs is the discovery of algorithms such as tree-reweighted message passing (TRW) which are based on {\em convex} approximations to the free energy. Both the sum-product version and the max-product version of these algorithms have highly desirable theoretical properties and a fascinating connection to linear programming relaxations. Despite these properties, in many applications practitioners still prefer using the "classical" BP algorithm which uses a {\em non-convex} approximation to the free energy. I will illustrate this disconnect between theory and practice on a number of applications from my lab, and also discuss some recent theoretical work that tries to adapt some of the results on convex algorithms to the classical BP algorithm. Slides [.pdf] Reading list [not available]

 Speaker Jonathan Yedidia (Disney Research, Cambridge, MA, USA) Title The Alternating Direction Method of Multipliers as a Message-Passing Algorithm Abstract I will describe how the alternating direction method of multipliers (ADMM) can be used to construct a simple message-passing algorithm to search for the optimal configuration on any factor graph, where the factor graph encodes an optimization problem with continuous and/or discrete variables and with arbitrary cost functions. The algorithm provably converges to the correct answer for all problems where the factor graph has only convex cost functions, and is a powerful heuristic for factor graphs with non-convex costs. For the special case when the cost functions are all hard constraints, the algorithm reduces to the "Divide and Concur" (DC) algorithm. ADMM and DC algorithms use similar memory and data structures compared to belief propagation (BP) algorithms, but they have the important advantage compared to BP that they deal efficiently with continuous variables, and for DC algorithms, one can guarantee that all fixed points of the dynamics are correct solutions, even for non-convex problems. On the other hand ADMM and DC algorithms are not inherently scale-invariant, which is related to their slow convergence for some problems. Slides [.pdf] Reading list J. Yedidia Message-passing algorithms for inference and optimization: "Belief Propagation" and "Divide and Concur" [.pdf] subm. to Journal of Statistical Physics, 2011. V. Elser, I. Rankenburg, and P. Thibault Searching with iterated maps [pnas] PNAS 2007. S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein Distributed optimization and statistical learning via the alternating direction method of multipliers [html] Foundations and Trends in Machine Learning, 2011.