\documentclass{article}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsmath}
\usepackage{latexsym}
\usepackage{graphicx}
\setlength{\oddsidemargin}{.25in}
\setlength{\evensidemargin}{.25in}
\setlength{\textwidth}{6in}
\setlength{\topmargin}{-0.5in}
\setlength{\textheight}{8.5in}

\newcommand{\header}{
   \renewcommand{\thepage}{\arabic{page}}
   \noindent
   \begin{center}
   \framebox{
      \vbox{
    \hbox to 5.78in {\bf 6.841/18.405J: Advanced Complexity \hfill Monday, April 7th, 2003}
       \vspace{2mm}
       \hbox to 5.78in { {\Large \hfill Lecture 15\hfill} }
       \vspace{2mm}
       \hbox to 5.78in { {\it Instructor: Madhu Sudan \hfill Scribe: Mike Neiman} }
      }
   }
   \end{center}
   \vspace*{3mm}
}

\newtheorem{theorem}{Theorem}
\newtheorem{claim}{Claim}
\newtheorem{lemma}{Lemma}

\begin{document}

\header
In this lecture, we will cover the following topics:

\begin{itemize}
\item PCPs and Inapproximability
\item Approximability
\item Average-Case Complexity
\end{itemize}

\section{Probabilistically Checkable Proofs and Inapproximability}
\subsection{Probabilistically Checkable Proofs (PCPs)}
We begin by recalling the definition of PCPs that was given in the last lecture.
A $(r,q)$-restricted verifier $V$ is a probabilistic polynomial time oracle machine that, given input $x$, uses at most $r(|x|)$ coin tosses and queries the oracle in at most $q(|x|)$ bits.
A language $L \in$ PCP$[r,q]$ if $\exists$ $(r,q)$-restricted verifier $V$ such that:
$$x \in L \Rightarrow \exists \Pi \mbox{ } Pr[V^{\Pi}(x) \mbox{ accepts}] = 1$$
$$x \notin L \Rightarrow \forall \Pi \mbox{ } Pr[V^{\Pi}(x) \mbox{ accepts}] \leq \frac{1}{2}$$
\\Let $n$ be the length of the input. We have the following results:

\begin{itemize}
\item{PCP[$0$,poly($n$)]$ = $NP}
\item{PCP[poly($n$),$0$]$ = $co-RP}
\item{PCP[poly($n$),poly($n$)]$ = $NEXPTIME}
\item{PCP[$O(log(n))$,$O(1)$]$ = $NP}
\item{PCP[$O(log(n))$,$3$]$ = $NP}
\end{itemize}
The first two results follow immediately from the definitions of the complexity classes, but the remaining results are non-trivial to prove.
We won't prove any of these results here.
References to proofs in the literature were given in the previous lecture.

\subsection{Inapproximability of 3SAT}
Let's take a closer look at the result that PCP[$O(log(n))$,$3$]$ = $NP.
Let $V$ be a $(r,3)$-restricted verifier for language $L\in$NP and let $\Pi$ be an optimal proof (a proof with maximum accepting probability) for $V$, where $r = O(log(n))$.
For each random string, $V$ performs a "test" based on $3$ bits of the proof $\Pi$.
For example:
$$00 \cdots 00 \longrightarrow \Pi_{1} = 1 \Rightarrow \Pi_{2} = \Pi_{5}$$
$$00 \cdots 01 \longrightarrow \Pi_{2} = 1 \Rightarrow \Pi_{3} = 1 \mbox{ or } \Pi_{4} = 0$$
Each test can be converted to a 3CNF formula with $8$ clauses on at most $7$ variables.
For each random string $s$, this gives a $8$-clause 3CNF $\phi_{s}$.
We now set
$$\Phi_{x} = \bigwedge_{s \in \{ 0,1 \}^{r}} \phi_{s}$$
Note that $\Phi_{x}$ is a 3CNF formula with $8 \cdot 2^r$ clauses that possesses the following property:
$$x \in L \Rightarrow \Phi_{x} \mbox{ is satisfiable}$$
$$x \notin L \Rightarrow \mbox{ any assignment to } \Phi_{x} \mbox{ violates at least } \frac{1}{16} \mbox{th of all clauses}$$

Now, suppose there exists an algorithm $A$ running in polynomial time that, given a 3CNF formula $\phi$, can satisfy $(\frac{15}{16} + \epsilon) \cdot m$ clauses, where $m$ is the maximum number of satisfiable clauses in $\phi$.
Let $L$ be any NP-complete language.
Using the above procedure, we can construct $\Phi_{x}$, input this to $A$, and count the number of satisfied clauses.
Consider the result and what it implies:

\begin{itemize}
\item{If more than $\frac{15}{16}$ clauses are satisfied, then $x \in L$.}
\item{If not more than $\frac{15}{16}$ clauses are satisfied, then $x \notin L$.}
\end{itemize}
We conclude that the existence of algorithm $A$ leads to P=NP.
Assuming (as usual) that P$\neq$NP, this result shows that it is impossible to approximate the number of satisfiable clauses by a factor of $(\frac{15}{16} + \epsilon)$.

\section{Approximability}
\subsection{Combinatorial Optimization}
Many important problems that arise in combinatorial optimization can be described by a triple
$$\Pi = (\mbox{sol?, obj, opt})$$
where sol?:$\{ 0,1 \}^{*} \times \{ 0,1 \}^{*} \rightarrow \{ 0,1 \}$ takes an instance of the problem and a proposed solution and determines whether the solution is valid, obj:$\{ 0,1 \}^{*} \times \{ 0,1 \}^{*} \rightarrow \mathbb{R}^{\geq 0}$ evaluates the objective function given a problem instance and solution, and opt$\in \{ \mbox{min,max} \}$ specifies whether the objective function is to be minimized or maximized.
We also require that sol? and obj be computable in polynomial time.
As an example of such a problem, consider the traveling salesman problem (TSP). The instances of the problem are weighted graphs. The solutions are subgraphs.
We have:
$$\mbox{sol?}(G,G') = 1 \mbox{ if } G' \mbox{ is a cycle spanning all vertices and is contained in } G$$
$$\mbox{obj}(G,G') = \mbox{ sum of weights of edges of } G'$$
$$\mbox{opt} = \mbox{min}$$
Other examples would be determining the size of the largest clique in a graph or determining the minimum number of colors required to color a graph.

\subsection{Heuristics for NP-complete Problems}
The difficulty of NP-complete problems has led to the use of many heuristic algorithms.
Of course, in order for these algorithms to be efficient, they can only be approximate.
Typically, the results of a heuristic algorithm will be like: "Found solution that was $99\%$ close to optimum in $95\%$ of cases." Here, the $99\%$ refers to approximability and the $95\%$ refers to average-case behavior.
We will now take a closer look at theses issues.

\subsection{Approximability}
When considering approximability, problems are of the form $(\Pi, \alpha)$, where $\Pi$ is a standard combinatorial optimization problem and $\alpha : \mathbb{Z}^{+} \rightarrow \mathbb{R}^{\geq 1}$ is an approximation parameter.
Our goal is to find an algorithm $A$ that, given $x$ (instance of $\Pi$), computes a solution $A(x)$ satisfying:
$$\frac{opt}{\alpha(|x|)} \leq \mbox{obj} (x,A(x)) \leq opt \cdot \alpha(|x|)$$
where $opt$ is the optimum value.
Notice that the problem $(\Pi, 1)$ is equivalent to the original problem $\Pi$.
From the theory of Np-completeness, we know that the three problems (Bin Packing, 1), (MaxSAT, 1), and (Coloring, 1) can each be reduced to one another. However, this does not imply that the three problems (Bin Packing, 2), (MaxSAT, 2), and (Coloring, 2) are equally difficult algorithmically.
Here are a few results:

\begin{itemize}
\item{(MaxSAT, $\frac{16}{15} - \epsilon$) is NP-hard}
\item{(MaxSAT, 2)$\in$P}
\item{(Coloring, $\frac{4}{3} - \epsilon$) is NP-hard}
\item{(Clique, 2) is NP-hard}
\end{itemize}

The first result was proved in section 1.2, and the second follows immediately from the fact that either the assignment of all zeros or the assignment of all ones must satisfy at least half of all clauses. The third result is a direct consequence of the NP-hardness of determining 3-colorability. The last result will be left as an exercise in a future problem set.

\section{Average-Case Complexity}
Here we consider problems of the form $(\Pi, D)$, where $\Pi$ is a standard problem and $D$ is a distribution on instances. For example, $D$ may be the uniform distribution on all $n$-bit strings.
Suppose that we have an algorithm $A$ that satisfies the following:

\begin{itemize}
\item{on inputs not in $S$, $A$ runs in time $n^2$}
\item{on inputs in $S$, $A$ runs in time $4^n$}
\end{itemize}
where $S \subseteq \{ 0,1 \}^{n}$ and $|S| < 2^{\sqrt{n}}$.
We would consider the problem well-solved in practice by $A$ for uniformly distributed instances.

We now give the definition of the complexity class Avg-P, which is the average-case analog of P.
We say that $(\Pi, D) \in$Avg-P if there exists algorithm $A(\cdot, \cdot)$ solving $\Pi$ such that:
$$\forall \delta \mbox{ } Pr_{x \leftarrow D} [A(x,\delta) \mbox{ solves } \Pi \mbox{ in time poly} (|x|, \frac{1}{\delta})] \geq 1 - \delta$$
We will consider average-case complexity and Avg-P in more detail in the next lecture.

\end{document}