\documentclass{article}
\usepackage{amssymb}
\input{preamble.tex}
\begin{document}
\lecture{22}{April 27, 2005}{Madhu Sudan}{Paul Valiant}
\newcommand{\UNSAT}{\mathrm{UNSAT}}
\renewcommand{\P}{\mathrm{P}}
\newcommand{\NP}{\mathrm{NP}}
\newcommand{\BPP}{\mathrm{BPP}}
\newcommand{\TIME}{\mathrm{TIME}}
\section{Outline}
This lecture consists of two parts. First we discuss some of the
feedback given on the previous four lectures. Then we develop the
notion of randomness, from its roots in physics, information theory,
and cryptography, to its applications to complexity theory. We aim
towards some version of ``BPP=P'', namely that randomness ``is'' or
``should be'' free.
\section{Lecture Comments}
The complete comments are posted online at \[{\small
\texttt{http://theory.csail.mit.edu/\~{}madhu/ST05/comments18-21.txt}}\]
We note here a few things of general interest that were brought up.
\subsection{Self-correction}
Recall that in the PCP framework, when we wanted to evaluate a
function $f(a)$ given a table of function evaluations, we did not
simply query the value indexed by $f$, but rather constructed a
random function $f_1$, queried the table at the entries indexed by
$f+f_1$ and $f_1$, and took their difference.
Similarly, when we wished to evaluate $(f\cdot g)(a)$ we did not
query the table at $fg$ but rather constructed a random function
$h$, and took the difference between the values indexed by $fg+h$
and $h$.
The reason for this is the following. We can show statements about
PCPs along the lines of ``if a string is close to an honestly
generated PCP then the theorem must be true''. Further, we ask the
verifier to probabilistically estimate closeness of a given string
to an ``honest'' PCP by conducting random tests and seeing if any
fail; if they all succeed, then the verifier can be fairly sure the
string lies close to an ``honest'' PCP and the theorem is true.
Thus it had better not be the case that by changing a tiny fraction
of the entries of a PCP one could significantly change the results
of the verifier's tests.
Consider, for example, what would happen above if we queried $fg$
directly. Note that since $f$ and $g$ take values in $\{0,1\}$,
their product will have expected value $\frac{1}{4}$ when $f$ and
$g$ are drawn randomly. The fraction of $\{0,1\}$-functions that
have expected value close to $\frac{1}{4}$ is tiny compared with the
total number of $\{0,1\}$-functions. Thus it had better not be the
case that values from this tiny set are queried with high
probability. Specifically, we cannot query $fg$ directly.
This suggests the trick of making the distribution of queries
uniform by adding an independent random function $h$. In this way we
can ensure that the PCP is sound.
This is an instance of what is more generally referred to as a
``self-correcting'' probe, i.e. a probe that corrects its own
mistakes. This is an area with many applications outside the scope
of this class.
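For intuition, the self-correcting probe is easy to demonstrate on a table of a linear function over $\mathrm{GF}(2)^n$, the setting of the Hadamard-code tables above. The following is a toy sketch (the encoding of points as integers, and all names, are ours):

```python
import random

def self_correct(table, a, n, trials=31):
    """Evaluate f(a) from a possibly corrupted table of a linear
    function f: GF(2)^n -> GF(2). For linear f we have
    f(a) = f(a XOR r) XOR f(r), and for a uniformly random r each of
    the two queried points a XOR r and r is individually uniform, so
    a single trial is wrong with probability at most twice the
    corruption rate."""
    votes = []
    for _ in range(trials):
        r = random.randrange(2 ** n)
        votes.append(table[a ^ r] ^ table[r])
    return max(set(votes), key=votes.count)  # majority vote
```

With a $\delta$ fraction of corrupted entries, each trial errs with probability at most $2\delta$, so the majority over independent trials is correct with high probability, even though querying the entry for $a$ directly could be adversarially wrong.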
\subsection{Approximation Ratios}
Recall that we showed the PCP theorem to be equivalent to the
nonexistence of an $\alpha$-approximation to the
max-$k$-SAT-$\Sigma$ problem unless $\P=\NP$. The question was
raised whether we know anything more specific about the
approximation ratio of the max-$k$-SAT-$\Sigma$ problem.
We note that a clause of such a problem depends on at most $k$ bits,
so a satisfiable clause will be satisfied by a random assignment with
probability at least $2^{-k}$. This gives a $2^k$-approximation
algorithm for max-$k$-SAT-$\{0,1\}$.
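This random-assignment bound can be made deterministic by the method of conditional expectations. A minimal sketch (the constraint encoding is our own, not from the lecture):

```python
from itertools import product

def expected_sat(constraints, assignment):
    """Expected number of satisfied constraints when every unset
    variable is drawn uniformly at random. Each constraint is a pair
    (vars, allowed): a tuple of variable indices and a set of
    accepted bit-tuples."""
    total = 0.0
    for vars_, allowed in constraints:
        free = [v for v in vars_ if v not in assignment]
        hits = 0
        for bits in product((0, 1), repeat=len(free)):
            full = {**assignment, **dict(zip(free, bits))}
            hits += tuple(full[v] for v in vars_) in allowed
        total += hits / 2 ** len(free)
    return total

def derandomize(constraints, n):
    """Method of conditional expectations: fix x_0..x_{n-1} one at a
    time, choosing the value that keeps the conditional expectation
    from dropping; the final assignment satisfies at least a 2^{-k}
    fraction of the simultaneously satisfiable width-k constraints."""
    assignment = {}
    for i in range(n):
        assignment[i] = max(
            (0, 1),
            key=lambda b: expected_sat(constraints, {**assignment, i: b}))
    return assignment
```

Since the expectation never drops below its initial value, the output satisfies at least the random-assignment expectation, hence at least a $2^{-k}$ fraction of the optimum.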
This bound looks awful. However, a recent result by Samorodnitsky
and Trevisan shows that one cannot do better than $2^{k-\sqrt{k}}$,
unless $\P=\NP$.
To put this in context, we note that the existence of 3-query PCPs
implies that one cannot approximate max-3-SAT-$\{0,1\}$ to within a
factor better than 2 unless $\P=\NP$, while just guessing a random
assignment already gives a factor-8 approximation. Thus our three
queried bits have (in a sense) an efficiency of $\frac{1}{3}$, since
these three bits give us one factor of 2 in soundness. The
surprising fact is that with each additional bit of the proof, we can
almost double the completeness/soundness ratio, since
$2^{k-\sqrt{k}}$ is so close to $2^k$.
\subsection{Dinur's Result}
There was some confusion about various aspects of Dinur's result.
Everyone agreed the proof went too fast. More details will be
covered in an optional session this Friday (April 29).
Also, there was some discussion of \emph{which} of her
techniques/results were new/interesting. Madhu cited as a key
feature of her proof a new connection between expanders and
randomness amplification.
\subsection{Computationally Sound Proofs}
In class so far we have seen a wide variety of notions of what a
``proof'' is. The standard notion allows us to prove anything in
NP. Interactive proofs allow us to prove anything in the (much
bigger?) class PSPACE, but at the cost of relaxing the notion of
soundness to be only statistical; i.e., unless the prover gets very
lucky, he will not be able to prove a false statement.
We may relax the soundness criteria even further to be
\emph{computational}. Namely, a computationally sound proof
(CS-proof) is one where the prover cannot prove a false statement
unless he either gets really lucky, or expends an inordinate amount
of time. We note that the relaxation from ``statistical'' to
``computational'' will appear later in this lecture with regards to
a definition of randomness, and is an important technique.
In the proof paradigms we have considered so far, we have generally
disregarded the computational difficulty of the prover's side of the
picture; in Arthur-Merlin protocols, for instance, Merlin, the
prover, is portrayed as ``magical''. Here the prover's limitations
play a central role. The general idea is that a computationally
bounded verifier might trust a prover more if he knows the prover is
similarly computationally bounded and cannot perform ``magic'' to
fool him.
Using this notion, Micali demonstrated how to construct
polynomial-sized CS-proofs for any statement in EXPTIME.
Unfortunately, his
results, unlike those of the PCP theorem, rely on some extremely
strong cryptographic assumptions, specifically something known as
the \emph{random oracle model/methodology}. The details of this are
beyond the scope of this lecture.
\section{Randomness}
We move on to our main topic: randomness. We have discussed
randomized algorithms, constructions, proofs, etc., but have not
discussed what randomness \emph{means}, nor how to \emph{find} it.
One place to start is with physics. We believe randomness exists
because physicists say that there is randomness. Physicists cite a
device called a ``Zener diode'', which they tell us behaves
randomly.
(We note however, that just because physicists say something exists
does not mean we should interpret it as ``free'' in our model of
computation. Physicists say that quantum computing exists, and that
is a far from easy assumption for complexity theorists to
assimilate.)
At this point, we must consider what physicists mean by ``random''.
In the case of the Zener diode, they mean \emph{not completely
predictable}. If we measure the voltage across a zener diode at
regular intervals, we get a sequence that is not completely
predictable. However, if we measure at short enough intervals, we
notice that the voltage changes only so fast, and there is
significant serial correlation. In other words, physicists do not
guarantee \emph{uniform} randomness, of the kind we need for BPP and
other complexity theory tasks.
We might call randomness without such a guarantee ``weak''
randomness. There is a developing computer science field of
\emph{randomness extractors}, where the goal is to extract ``pure''
randomness from weak randomness.
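The grandfather of all extractors is von Neumann's trick for a coin of unknown bias, which we sketch here as a baby example (modern extractors handle far weaker sources):

```python
def von_neumann(bits):
    """Von Neumann's extractor: for independent flips of a coin with
    unknown bias p, the pairs 01 and 10 each occur with probability
    p(1-p), so mapping 01 -> 0 and 10 -> 1 (and discarding 00 and
    11) yields perfectly unbiased output bits."""
    return [a for a, b in zip(bits[::2], bits[1::2]) if a != b]
```

Note that this needs \emph{independent} flips; the serial correlation in the diode measurements above already defeats it, which is one reason the extractor literature works with much weaker assumptions on the source.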
We back up here and ask again what we mean by randomness. Consider
the following sequence: \[010110111000011011110010001.\]
Is it random? What about the following sequence:
\[000000000000000000000000000?\]
We might consider a hypothetical ``Turing test'' for randomness,
namely we put a person in a room who tries to generate random
numbers, and we are asked to distinguish between his sequences and
those produced by a coin. Which of the above sequences would pass
the test? Some of the class said that the all-zeros sequence would
never be seen from a random coin; someone else said that no person
would be stupid enough to try to pass off an all-zeros sequence as
random.
In light of this disagreement, we consult the experts.
\begin{itemize}
\item Shannon's answer is \emph{no}: no single string can be random;
randomness is a property of distributions.
\item Kolmogorov's answer is \emph{no}: no finite string can be random; he
defines a string as random if no finite length (deterministic) program could
produce it.
\item Adleman's answer is \emph{maybe}: recall his proof that
$\BPP\subseteq \P/\mathrm{poly}$; the technique was to find a
\emph{single} string of length $n^2$ that could stand in for all
the randomness in the uniform distribution on $\{0,1\}^n$, at
least in the context of the fixed BPP algorithm $A$ under
consideration.
\end{itemize}
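Adleman's proof is a union bound: amplify the BPP algorithm until its error on each fixed input is well below $2^{-n}$; then a random string fails on \emph{some} input of length $n$ with probability less than $2^n\cdot 2^{-n}=1$, so some single string works for every input. A toy sketch with a made-up \texttt{toy\_bpp} (the error model and all names are ours):

```python
from itertools import product

def toy_bpp(x, chunks):
    """A made-up randomized decider for "is x even": each 3-bit
    chunk of randomness gives the right answer unless it happens to
    equal x & 7 (error probability 1/8 per chunk), and the algorithm
    takes a majority vote over its chunks (amplification)."""
    wrong = sum(c == (x & 7) for c in chunks)
    correct = (x % 2 == 0)
    return correct if 2 * wrong <= len(chunks) else not correct

def find_good_string(n_bits=4, num_chunks=5):
    """Adleman-style argument: after amplification the algorithm
    errs on each fixed input with small probability, so by the union
    bound most random strings are simultaneously good for all
    2**n_bits inputs; here we simply scan candidates until one
    works on every input."""
    for chunks in product(range(8), repeat=num_chunks):
        if all(toy_bpp(x, chunks) == (x % 2 == 0)
               for x in range(2 ** n_bits)):
            return chunks
```

The found string can then be hard-wired as the advice of a polynomial-size circuit, which is exactly the $\P/\mathrm{poly}$ upper bound.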
The interesting part of Adleman's definition is that the notion of
randomness is conditional on the use to which it will be put.
Namely, whether a string is random or not depends on the algorithm
$A$ we wish to ``fool''.
We clarify the model here before moving on.
\subsection{Pseudo-Random Generators}
A pseudo-random generator (PRG) is a (deterministic) algorithm $G$
that takes a ``short'' random seed $\sigma$ as input, and produces a
longer string as output, where the output meets a criterion for
randomness that we will define later.
In formal notation, we have \[G:\{0,1\}^l\rightarrow \{0,1\}^n,\]
where $l<n$, and where the output distribution $G(U_l)$, for a
uniformly random seed, is ``close'' to the uniform distribution
$U_n$. One classical measure of closeness between two distributions
$D_1$ and $D_2$ is the \emph{statistical difference}
\[|D_1-D_2|=\sum_{x:D_1(x)>D_2(x)} \left(D_1(x)-D_2(x)\right).\]
We note that the statistical difference is symmetric:
\[|D_1-D_2|=\sum_{x:D_1(x)<D_2(x)} \left(D_2(x)-D_1(x)\right).\]
For our complexity-theoretic application, however, the relevant
criterion will be the relaxed, \emph{computational} one mentioned
earlier, and the payoff is the following claim.
\begin{claim}
If there exist $\epsilon>0$ and $c$ such that some language $L$ on
inputs of length $l$ is decidable in time $2^{cl}$, but not
decidable by circuits of size $2^{\epsilon l}$, then $\BPP=\P$.
\end{claim}
Thus circuit complexity lower bounds could prove that $\BPP=\P$. We
note that it is crucial here that the circuits are allowed to be
non-uniform: against uniform machines, the hypothesized lower bound
would already follow from the time hierarchy theorem, making
$\BPP=\P$ unconditional.
We will prove this claim in the next lecture.
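Looking ahead, the way such a generator yields derandomization can be sketched: replace the algorithm's random string by $G(\sigma)$ and enumerate all $2^l$ seeds, taking a majority vote. A toy sketch (\texttt{prg} and \texttt{alg} below are stand-in placeholders, not a real hard-function-based generator):

```python
def derandomize_with_prg(alg, prg, seed_bits, x):
    """Seed enumeration: if prg fools alg (viewed as a circuit),
    the majority answer over all 2**seed_bits seeds matches the
    majority under truly uniform randomness. The running time is
    2**seed_bits times the cost of alg, which is polynomial
    whenever seed_bits = O(log n)."""
    votes = sum(alg(x, prg(s, seed_bits)) for s in range(2 ** seed_bits))
    return 2 * votes > 2 ** seed_bits  # strict majority of seeds accept
```

The claim's hypothesis is what supplies a generator good enough that $l=O(\log n)$ seeds suffice, which is exactly where the circuit lower bound enters.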
\end{document}