\documentclass[11pt]{article}
\usepackage{amsfonts, amsmath}
\newtheorem{define}{Definition}

%\newcommand{\Z}{{\bf Z}}
%\usepackage{psfig}

\oddsidemargin=0.15in
\evensidemargin=0.15in
\topmargin=-.5in
\textheight=9in
\textwidth=6.25in

\newcommand{\PTime}{\ensuremath{\operatorname{P}} }
\newcommand{\sharpP}{\ensuremath{\operatorname{\#P}} }
\newcommand{\NP}{\ensuremath{\operatorname{NP}} }
\newcommand{\BPP}{\ensuremath{\operatorname{BPP}} }
\newcommand{\perm}{\ensuremath{\operatorname{perm}} }
\newcommand{\PH}{\ensuremath{\operatorname{PH}} }
\newcommand{\BP}{\ensuremath{\operatorname{BP}} }
\begin{document}
\input{preamble.tex}

\lecture{13}{March 16, 2005}{Madhu Sudan}{Victor Chen}

\section{Overview}
Today we study the complexity of counting problems and define the classes \sharpP, $\PTime^{\sharpP}$.
In particular, we will examine the permanent of a matrix and its relation to $\sharpP$-completeness and 
random instances; this is quite a remarkable problem with a central position in complexity theory.
And finally, we will start Toda's landmark theorem, which states that $PH \subset \PTime^{\sharpP}$.

A good reference for today's lecture is Chapter $18$ in Papadimitriou's textbook.

\section{\sharpP}
There are some natural questions that come up with nondeterminism.
One is searching; given a problem, find a solution.
Another is optimization - find the best solution. 
There is a third notion, namely counting how many solutions there are to a given problem, which we have yet to explore.
To do so, let us first define $\sharpP$. 
$\sharpP$ is defined to be the class of counting functions $f:\Sigma^* \rightarrow \mathbf{Z}$, such that 
there exists a polynomial time bounded machine $M$ such that $\forall x$, 
$f(x)=|\{y|M(x,y) \mbox{ accepts} \}|.$

Some examples include \#SAT, counting the number of satisfying assignments in a formula, and \#CLIQUE, counting the number of cliques in a graph. As an exercise, note that the problem of counting the number of cliques in a graph reduces to the problem of counting the number of cliques at least a specified size.

\subsection{PERMANENT}
One of the most interesting problems in $\sharpP$ is the problem of counting the number of matchings in a graph, which 
is equivalent to finding the permanent of a matrix. (We will show this equivalence very shortly).
Recall that a (perfect) matching in a graph $G$ on $n$ vertices is a subgraph $H$ of degree exactly one at each vertex.
Clearly MATCHING, the problem of deciding whether a given graph has a matching, is in $\NP$. It is not obvious how to show this is in coNP. Tutte in the $40s$ proved many fundamental results on matching, and Edmonds in the $60s$ finally showed that MATCHING is in $\PTime$. Incidentally, it was this work that led Edmonds to define polynomial time to capture our notion of efficiency. The counting version of MATCHING, \#MATCHING, is simply counting the number of matchings in a graph. Note that unlike \#SAT or \#CLIQUE, the decision version of \#MATCHING is easy yet the counting version is hard.

Clearly \#MATCHING $\in \sharpP$. Now we show the equivalence between the number of matchings and the permanent of a matrix. Consider a bipartite graph and its associated adjacency matrix $A=\{a_{ij}\}$.
It is easy to see that a matching in an $n$ by $n$ bipartite graph is given by a permutation $\pi:[n] \rightarrow [n]$.
The edges in a matching simply match up each vertex on the left with a unique vertex on the right.
Then the number of matchings in a graph is simply
$$\sum_{\pi \in S_n} \prod_{j=1}^n a_{j\pi(j)} = \perm(A).$$

In other words, \#MATCHING=PERMANENT.
Syntactically, this looks like the definition of the determinant of a matrix $A$. 
Recall that 
$$\det(A)= \sum_{\pi \in S_n} (-1)^{\mbox{sign}(\pi)} \prod_{j=1}^n a_{j\pi(j)},$$

where $\mbox{sign}(\pi)$ is $1$ if the number of swaps needed to sort $\pi$ into identity is odd and $0$ otherwise.
Determinant is also a counting function, and we know that it can be computed efficiently. 
So perhaps the similarity in the algebraic definitions of the perm and the determinant suggest that we may compute the perm by using the determinant somehow. 
Indeed, one of the earliest approaches in trying to compute $\perm(A)$ involved making an appropriate transformation on $A$ to some other matrix $A'$ with $0, \pm$ entries, such that $\perm(A)=\det(A')$.
However, it can be shown that this approach does not work since the permanent can be as large as $n!$ whereas the determinant of a $0,\pm$ matrix can be at most $2n$. Nonetheless, this approach works when the graph associated with the matrix is planar.

\subsection{$\sharpP$-completeness}
Valiant showed that PERMANENT is $\sharpP$-complete. We will not show this in class. Since a self-contained proof is in Papadimitriou's book, we will briefly make some remarks.

To prove completeness, we need a reduction that takes a function $f$ and instance $x$ and outputs $A_x$ such that $f(x)$ can be computed efficiently from $\perm(A_x)$. The key observation is that most reductions we have seen in complexity so far are parsimonious, that is, the reduction preserves the number of solutions. 
Most importantly, Cook's reduction is parsimonious.
In other words, if the reduction takes $x$ and computes $\phi$, then the number of satisfying assignment to $\phi$ is $f(x) k(|\phi|)$, where $k(\cdot)$ is an easily computable function.
Note that the function $k$ depends not on $\phi$ but just on its length.
To show that \#SAT reduces to PERMANENT, Valiant constructed appropriate gadgets such that if the reduction takes $\phi$ to $A$, then $\perm(A)=\#\mbox{SAT}(\phi) + g(\phi)2^{n^2}$.
Note that $\#\mbox{SAT}(\phi) \leq 2^n$, so we can compute $\perm(A)$ modulo $2^{n^2}$.
This is sometimes called an affine parsimonious reduction.

\subsection{Some Remarks}
In complexity, it is much easier to work with languages than functions. 
We would like to define languages for $\sharpP$.
However, this is not as simple as SAT.
Instead, we shall resort to letting $\PTime^{\sharpP}$ capture our notion of languages.

Historical Notes: The complexity of counting led Valiant to define an algebraic version of $\NP$.
And the problem of approximate counting has developed into a beautiful area in its own right and sampling techniques.

\section{Random Instances}
Looking at the algebraic formalization of the permanent, there should be no reason that any particular instance should be easy to compute. In fact, we have the following result due to Lipton.

\begin{theorem}
Suppose there exists an expected polynomial time algorithm that computes $\perm(A)$ mod $p$ for a random prime $p \in [2^n]$ and a random $n$ by $n$ matrix $A \in Z_p^{n \times n}$.
Then there exists a randomized polynomial time algorithm that computes the permanent.
\end{theorem}

Note that the randomness of the algorithm in the antecedent of the theorem is over $A$ and $p$, whereas the randomness of the second algorithm is over its internal coins. The typical classification of problems into complexity classes depends on worst case analysis, which does not always correspond to how algorithms are used in practice. 
However, here Lipton's result makes a powerful statement; random instances of the permanent is as hard as worst case instances. We now proceed to give a proof.\\


\begin{proof}
Suppose we treat the entries of an $n$ by $n$ matrix $M$ as indeterminates.
Then $\perm(M)=\sum_{\pi} \prod_{i=1}^n X_{i,\pi(i)}$, which is a function of $n^2$ variables with degree $n$.
Say we wish to compute $\perm(M)$ with permanent $\leq 2^{\sqrt{n}}$, e.g., when $M$ has dimension $\leq 2^{\sqrt{n}}$.
Pick a random prime $p \in [2^n]$. 
With high probability, $p \geq 2^n/n^2$.
Pick a random $R \in Z_p^{n \times n}$.
Define $g(t)=\perm(M+tR)$, which becomes a function in variable, namely $t$, with degree at most $n$.
We want to compute $\perm(M)=g(0)$.
This can be done if we know all the coefficients of $g$.
Since $g$ has degree at most $n$, we will compute $g(i)$ for $i=1,\dots, n$, which we shall see is
easier to compute. And thus, these $n+1$ points will define $g$.

Since we are working over $Z_p$, for $i=1,\ldots, n$, $iR$ is a random matrix iff $R$ is random. 
So $M+iR$ is a random matrix as well. So $g(i)=\perm(M+iR)$ is the permanent of a random matrix. 

By assumption, a random instance of the permanent is easy to compute.
In particular, suppose we are given an algorithm $B$ such that $B(A,p)$ computes $\perm(A) \mbox{ mod } p$ in expected $n^3$ time, for random $A$ and $p$.
Then by our assumption, $B(A)$ must compute $\perm(A) \mbox{ mod } p$ in at most $n^{10}$ time for all but $n^{-5}$ fraction of $(A,p)$.
So by the Union bound, with probability $1-n^{-4}$, we have the correct values of $g(1),\ldots,g(n)$.
\end{proof}

\section{Toda's Theorem}
In a complete surprise in 1989, Toda showed that $\PH \subset \PTime^{\sharpP}$.
We give a quick introduction and save the proof until after spring break.
Toda defined operators like $\BP$, $\oplus$, and $\PTime$ that act on complexity classes.
These operators essentially capture questions like is there a majority of instances in the language with the following properties. In particular, $L \in \oplus \cdot C$ if there exists $M$, $L(M) \in C$, such that 
$$x \in L \mbox{ iff } |\{y | M(x,y) \mbox{ accepts}\}| \mbox{ is odd.}$$

This allows us to ask for the least significant bit, which is a much richer notion than the most significant bit (since we can approximate the permanent).
\footnote{As an exercise, show that approximating $\sharpP$ to a multiplicative factor can be done in $\PH$.
Very recently, Jerrum, Sinclair, and Vigola showed in 2001 that approximating the permanent can be done with a $\BPP$ algorithm.}

Also, we can define $L \in \BP \cdot C$ if there exists $M$, $L(M) \in C$, such that 
$x \in L$ implies $\Pr[M(x,y) \mbox{ accepts}] \geq 2/3$, and 
$x \not in L$ implies $\Pr[M(x,y) \mbox{ accepts}] \leq 1/3$.

We will define more operators next time and what they mean next time.
In particular, we will show $\BP \oplus \PTime \subset \PTime^{\sharpP}$ and 
$\Sigma_i^p \subset \BP \oplus \PTime$. 
The latter can be shown by an inductive argument roughly as follows.
\begin{eqnarray*}
\Sigma_i^p & \subset & \Sigma \cdot \BP \oplus \PTime, \mbox{ by induction} \\
 					 & \subset & \BP \oplus \BP \oplus \PTime \mbox{ base case} \\
 					 & \subset & \BP \BP \oplus \oplus \PTime \mbox{ by commuting the operators} \\
 					 & \subset & \BP \oplus \PTime \mbox{ by collapsing the operators.} 	
\end{eqnarray*}

The magic of these operators will be fully explored next time.


\end{document}
