
\documentclass[10pt]{article}
\newtheorem{define}{Definition}
%\newcommand{\Z}{{\mathbb{Z}}}
\usepackage{psfig}
\usepackage{epsfig}
\usepackage{amssymb}

\oddsidemargin=0.15in \evensidemargin=0.15in \topmargin=-.5in
\textheight=9in \textwidth=6.25in

\newcommand{\BI}{\begin{itemize}}
\newcommand{\EI}{\end{itemize}}
\newcommand{\BE}{\begin{enumerate}}
\newcommand{\EE}{\end{enumerate}}
\def\NP{{\bf NP} }
\def\P{{\bf P} }
\newcommand{\Time}{\rm{Time}}

\begin{document}
\input{preamble.tex}

\lecture{2}{February 7, 2005}{Madhu Sudan}{Kyomin Jung}

\section{Overview}
In this lecture we will discuss some approaches and results
concerning the problem $\P\ne \NP$, that use diagonalization
arguments\footnote{See Lance Fortnow's survey paper on
diagonalization as a reference}. Baker-Gill-Solovay relativization
shows a limitation of diagonalization method for showing $\P\ne\NP$.
And Lander's theorem says about the hierarchy of intermediate
language classes between $\P$ and $\NP$ under the assumption that
$\P\ne\NP$. Then we will introduce the circuit complexity of a
language, which we will discuss more in detail in the next lecture.

\section{Diagonalization}
Diagonalization argument, which was first used by Cantor when he
showed that there is no one to one correspondence between
$\mathbb{N}$ and $\mathbb{R}$, is an important tool when we show
that for classes of languages $C_1$ and $C_2$ that are enumerable,
$C_1$ is strictly contained within $C_2$. Let
$$C_1=<L_1,L_2,L_3,\ldots>$$ where each languages in $C_1$ appears at least
once in the above sequence. Let $x_1 , x_2, x_3 \ldots$ be a
sequence of all the finite binary strings. We can get this sequence
because the set of all the finite binary strings is enumerable. Then
each language $L_i$ can be thought as an infinite string
$s_{i1}s_{i2}s_{i3}\ldots$ where $s_{ij}=1$ iff $x_j\in L_i$ and
otherwise $s_{ij}=0$. Now think of an infinite string
$L'=\bar{s_{11}}\bar{s_{22}}\bar{s_{33}}\ldots$, where
$\bar{s_{ii}}$ is the negation of $s_{ii}$. Then $L'$ is not in
$C_1$ because $L'$ differs from $L_i$ in their $i^{th}$ bits. So if
we show that $L'$ is in $C_2$(this part differs according to the
specific problem), we obtain that $C_1$ is strictly contained within
$C_2$. This type of argument is called diagonilization and usually
this kind of argument can be applied to obtain some results like
$\Time(n^2)$ is strictly contained within $\Time(n^3)$.


\section{Baker-Gill-Solovay Relativization}
For a complexity class of languages, $C$, which is characterized by
some limit on some resources(for example, $\P$), we can think of a
``relativized'' class of languages with respect to a given language
$A$, that is, machines in this class are allowed to make oracle
calls to $A$ for a unit cost.
\begin{definition} $\P^A$, the relativization of $\P$ with respect to $A$, is the set of all languages decidable by
polynomial time machines with oracle calls to $A$.\end{definition}
\begin{definition} $\NP^A$, the relativization of $\NP$ with respect to $A$, is the set of all languages decidable by
polynomial time machines with oracle calls to $A$.\end{definition}

If diagonalization produces a language $L'$ in $C_2$ but not in
$C_1$, then it can be seen that for every language $A$, $C_1^A$ is
strictly contained in $C_2^A$ using $L'$. With this fact in mind,
next theorem due to Baker-Gill-Solovay shows a limitation of
diagonalization arguments for proving $\P\ne\NP$.
\begin{theorem} (Baker-Gill-Solovay) There exist oracles $A$ and $B$
such that $$\P^A=\NP^A$$ $$\P^B\ne\NP^B.$$
\end{theorem}
Here we will only prove the first part of the above theorem. To this
end, take the language $A$ to be a PSPACE-complete language, for
example TQBF. Then
$$\NP^A\subseteq \rm{NPSPACE}=\rm{PSPACE} \subseteq \P^A \subseteq
\NP^A.$$ So, from this results we can see that diagonilization
argument cannot help to prove $\P\ne \NP$.


\section{Ladner's Theorem}
Some examples of $\NP$ problems that are not known to be in $\P$ or
$\NP$-complete are Graph isomorphism problem and Integer Factoring
problem. Although it was not known whether Primality testing is in
$\P$ or not until some years ago, recently it was proved that
Primality testing is in $\P$. Althought currently we do not know
whether $\P\ne \NP$ or not, above problems are usually thought to be
as candidates of languages that would be in between $\P$ and
$\NP$-complete. Now we will state and prove a result due to Lander
that says that if $\P\ne \NP$, there must be some(in fact,
infinitely many) intermediate language classes between $\P$ and
$\NP$.

\begin{theorem}(Lander)
If $P\ne\NP$, then there exists $L\in \NP$ such that $L$ is not
$\NP$-hard and $L$ is not in $\P$.
\end{theorem}

More general version of Lander's theorem, which we will not deal in
this lecture, says that

\begin{theorem}\label{L}
If $C_1\le_P C_2$ holds but $C_2\le_P C_1$ does not holds, then
there exists $C_3$ and $C_4$ such that $C_1\le_P C_3\le_P C_2$ and
$C_1\le_P C_4\le_P C_2$ hold but none of $C_2\le_P C_3$, $C_2\le_P
C_4$, $C_3\le_P C_1$, $C_4\le_P C_1$, $C_3\le_P C_4$, $C_4\le_P C_3$
holds.
\end{theorem}
Theorem \ref{L} shows that if $\P\ne\NP$, then there must be
infinitely many intermediate language classes between $\P$ and $\NP$
up to polynomial time reduction. Now to show the simpler version of
the Lander's theorem, we will explicitly define a language $L$ that
is in $\NP$, but not in $\P$ nor a $\NP$-complete.

Let $f:\mathbb{N}\rightarrow\mathbb{N}$ be a polynomial time
computable function which we will define later. Then we will define
$$L=\{n\ |\ n\in \mbox{SAT and } f(|n|) \mbox{ is
even}\}.$$ Intuitively, when $f(n)$ is even $L(n)$ is copied from
SAT$(n)$(here SAT can be replaced by any $\NP$-complete language),
and when $f(n)$ is odd $L(n)$ is copied from the language
$000000\ldots$ that is in $\P$.


We will define $f$ to be an increasing function so that $L$ is in
$\NP$ but it is different from any of languages in $\P$, and
different from any of $\NP$-complete languages. Let $M_1, M_2,
M_3\ldots$ be a sequence of al the polynomial time deterministic
Turing Machines (allowing that some machines may appear more than
once) so that $M_i(x)$ runs in time $|x|^i$. Similarly let $g_1,
g_2, g_3\ldots$ be a sequence of all the polynomial time computable
reductions so that $g_i(x)$ runs in time $|x|^i$. Inductively,
define $f(n)$ by\BE
\item When $f(n-1)=2j$ for some integer $j$.
$f(n)=f(n-1)+1$ if there exist $x\le \log \log n$ such that
$M_j(x)\ne L(x)$. Otherwise $f(n)=f(n-1)$. I.e., $f$ stays at the
same value until $L$ is sure to be different from $M_j$ at some
sufficiently small $x$.
\item When $f(n-1)=2j+1$ for some integer $j$. $f(n)=f(n-1)+1$ if
there exist $x\le \log \log n$ such that $\rm{SAT}(x)\ne L(g_j(x))$.
Otherwise $f(n)=f(n-1)$. I.e., $f$ stays at the same value until it
becomes to be sure that SAT is not polynomial time reducible to $L$
via reduction $g_i$. \EE

Now we will show that $f$ tends to infinity. Suppose that $f$ is
stuck at some even number as $n$ goes to infinity. Then by the
definition of $L$, $L$=SAT except finitely many inputs, and $L=M_j$
for some $j$. A contradiction to the assumption that SAT is not in
$\P$. Similarly, if $f$ is stuck at some odd number, then $L$ is in
$\P$, and SAT is reducible to $L$ via some reduction $g_j$, saying
that SAT is in $\P$, again a contradiction to the assumption. So $f$
tends to infinity as $n$ goes to infinity.

Then by the definition of $f$, we obtain that $L$ is in $\NP$(note
that here we need the fact that SAT is in $\NP$). And because $L$ is
different from any of $M_j$, $L$ is not in $\P$. And because SAT is
not polynomial time reducible to $L$, $L$ is not $\NP$-complete.

\section{Circuit Complexity}
Circuit is a model of computation having a finite number of input
bits and finite number of output bits and having finite number of
NOT, AND, OR gates in the middle of it. Formally, circuit can be
thought as a directed, acyclic graph. Then a circuit consists of
three types of nodes: \BE
\item input nodes: input nodes have 0 In-degrees and (finitely many) Out-degrees. They are labeled with
$x_1, x_2, x_3,\ldots x_n$.
\item computation nodes: There are three types of computation nodes.
NOT nodes have 1 In-degrees and 1 Out-degrees, and AND and OR nodes
have 2 In-degrees and 1 Out-degrees.
\item output nodes: output nodes have 1 In-degrees and 0
Out-degrees. \EE

Now we will define the circuit complexity of a language $L$.
\begin{definition}
For a family of circuits $\{C_n\}$ with $C_n\ :\
\{0,1\}^n\rightarrow \{0,1\}$,  we say $\{C_n\}$ decides language
$L$ if for all $n\in \mathbb{N}$, $L\cup\{0,1\}^n=C_n^{-1}(1)$.
\end{definition}
\begin{definition}
Define the size of $C_n$(denoted by $|C_n|$) to be the number of
gates(nodes) in the circuit $C_n$.
\end{definition}
 Note that sometimes one may define the size of $C_n$ to be
the number of wires(edges), but one measure is polynomial to the
other. With these definitions we can define the circuit complexity
of a given language $L$ as usual. \begin{definition} We say a
language $L$ has circuit complexity $f(\cdot)$ if there exists a
circuit family $\{C_n\}$ deciding $L$ and for all $n\in\mathbb{N},\
\ |C_n|=f(n)$.\end{definition} Then why do we do the circuit
complexity? That's because sometimes it provides some tools for
comparing some complexity classes. We will discuss this more in
detail in the next lecture.

\end{document}
