\documentclass[10pt]{article}
\newtheorem{define}{Definition}
%\newcommand{\Z}{{\mathbb{Z}}}
%\usepackage{psfig}
\usepackage{amssymb}
\oddsidemargin=0.15in
\evensidemargin=0.15in
\topmargin=-.5in
\textheight=9in
\textwidth=6.25in
\newcommand{\pmone}{\{-1,1\}}
\begin{document}
\input{preamble.tex}
%{lecture number}{lecture date}{Ronitt Rubinfeld}{Your name}
\lecture{4}{February 19, 2008}{Ronitt Rubinfeld}{Jacob Scott}
%%%% body goes in here %%%%
\section{Review}
For Fourier analysis, we showed that every function $f: \pmone^n \rightarrow \pmone$ can be represented as $f(x) = \sum_S \hat{f}(S) \chi_{S}(x)$, where the sum ranges over subsets $S \subseteq [n]$. Here $\hat{f}(S) = \frac{1}{2^n} \sum_x f(x) \chi_S(x)$, and $\chi_S(x) = \prod_{i \in S} x_i$. From these definitions we state the following facts:
\begin{fact} $\chi_S(x) \cdot \chi_T(x) = \chi_{S \Delta T}(x)$, where $\Delta$ is symmetric difference.
\end{fact}
\begin{fact} If $f = \chi_S$, then $\hat{f}(S) = 1$, and $\forall T \neq S, \hat{f}(T) = 0$.
\end{fact}
\begin{fact} $\hat{f}(S) = 1 - 2{\rm dist}(f, \chi_S)$. Equivalently, ${\rm dist}(f, \chi_S) = \frac{1 - \hat{f}(S)}{2}$.
\end{fact}
We also restate the Boolean case of Parseval's Theorem:
\begin{theorem} $1 = E[f(x)^2] = \sum_S \hat{f}(S)^2$
\end{theorem}
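These definitions are easy to check by direct enumeration. The following is a minimal sketch in Python (the helper names are my own, not from the lecture) that computes the Fourier coefficients of a small Boolean function and verifies Fact 2 and Parseval's Theorem for it.

```python
from itertools import product, combinations

def chi(S, x):
    """Character chi_S(x) = product of x_i for i in S."""
    p = 1
    for i in S:
        p *= x[i]
    return p

def fourier_coefficient(f, S, n):
    """hat{f}(S) = 2^{-n} * sum_x f(x) * chi_S(x)."""
    return sum(f(x) * chi(S, x) for x in product((-1, 1), repeat=n)) / 2**n

def all_subsets(n):
    for k in range(n + 1):
        yield from combinations(range(n), k)

n = 3
f = lambda x: x[0] * x[1]   # the parity function chi_{{0,1}} (0-indexed)
coeffs = {S: fourier_coefficient(f, S, n) for S in all_subsets(n)}

# Fact 2: hat{f}({0,1}) = 1 and every other coefficient vanishes.
# Parseval: sum_S hat{f}(S)^2 = 1 for any Boolean-valued f.
parseval = sum(c ** 2 for c in coeffs.values())
```

Running this on other Boolean-valued $f$ leaves `parseval` equal to 1, as the theorem guarantees.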
% \section{Administrivia}
% Homework can be done in groups of arbitrary size, but everyone must write up their solutions independently, and understand what they have written. The first homework is due next week, and all five problems should be turned in on separate pieces of paper. Students will be responsible for grading one problem on some problem set during the term, and writing up the solutions (which may be someone's \LaTeX\ if they agree).
%
% The grading scheme is $\checkmark -, \checkmark, \checkmark +$. The vast majority of the class should be receiving $\checkmark$, and in particular $\checkmark +$ should be given to at most three or four students per problem. Remember that the grading system is fairly relaxed, so don't worry about this much.
\section{Dictator Functions}
Last lecture we saw how to test linearity in constant time, independent of the domain size of the function. Our test passed all linear functions and rejected all functions far from linear. It may have seemed that Fourier analysis was tailor-made for handling linear functions, but this is not the case. Today we will see how to apply it to testing dictator functions.
\begin{definition} The \textbf{dictator functions} are $\chi_{\{1\}}$, $\chi_{\{2\}}$, and so on. That is, a function $f$ is a dictator if and only if $f(x) = x_i$ for some $i$.
\end{definition}
All dictator functions are linear and have degree one. We will abuse notation and write $\chi_{\{i\}}$ as $\chi_i$. Dictator functions have applications to probabilistically checkable proofs (PCPs), where better tests give better hardness-of-approximation results; researchers in this area care about improving the constants involved.
\subsection{H{\aa}stad's test}
We will start with an analysis of H{\aa}stad's test (the motivation for which will appear shortly). Given $\delta$ and query access to a black-box function $f$, pick $x, y \in \pmone^n$ uniformly at random. Also pick $w \in \pmone^n$ from the $\delta$-biased distribution, i.e., independently for each coordinate, $Pr[w_i = -1] = \delta = 1 - Pr[w_i = 1]$. Let $z = x \cdot y \cdot w$ (coordinatewise product), and accept $f$ if $f(x)f(y)f(z)= 1$, rejecting otherwise.
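A single trial of the test makes three queries. The following is a hedged sketch in Python (the function names are my own); note that for $f = \chi_S$ we have $f(x)f(y)f(z) = \chi_S(x)\chi_S(y)\chi_S(x \cdot y \cdot w) = \chi_S(w)$, so the outcome of a trial depends only on $w$.

```python
import random

def hastad_trial(f, n, delta, rng=random):
    """One trial of Hastad's test: accept iff f(x) * f(y) * f(z) == 1,
    where z = x * y * w coordinatewise and w is delta-biased."""
    x = [rng.choice((-1, 1)) for _ in range(n)]
    y = [rng.choice((-1, 1)) for _ in range(n)]
    w = [-1 if rng.random() < delta else 1 for _ in range(n)]
    z = [xi * yi * wi for xi, yi, wi in zip(x, y, w)]
    return f(x) * f(y) * f(z) == 1
```

For the dictator $f(x) = x_0$, the product $f(x)f(y)f(z)$ equals $w_0$, so a trial accepts exactly when $w_0 = 1$, i.e., with probability $1-\delta$, matching $\frac{1}{2}+\frac{1}{2}(1-2\delta)$ for $|S| = 1$.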
\begin{theorem} $Pr[$H{\aa}stad's Test accepts $f] = \frac{1}{2}+\frac{1}{2}\sum_S (1-2\delta)^{|S|} \hat{f}(S)^3$
\end{theorem}
\begin{proof}
Let $1_H = \frac{1 + f(x)f(y)f(z)}{2}$ be the indicator random variable for the event that H{\aa}stad's test accepts $f$, so that $Pr[$H{\aa}stad's Test accepts $f] = E[1_H] = \frac{1}{2} + \frac{1}{2}E[f(x)f(y)f(z)]$. Now,
\begin{eqnarray*}
E[f(x)f(y)f(z)] &=& E\left[ \left(\sum_S \hat{f}(S)\chi_S(x) \right)
\left(\sum_T \hat{f}(T)\chi_T(y) \right) \left(\sum_U \hat{f}(U)\chi_U(x\cdot y \cdot w) \right) \right] \\
&=& \sum_S \sum_T \sum_U \hat{f}(S) \hat{f}(T) \hat{f}(U) E \left[\chi_S(x) \chi_T(y) \chi_U(x\cdot y \cdot w) \right]
\end{eqnarray*}
If we focus on $ E \left[\chi_S(x) \chi_T(y) \chi_U(x\cdot y \cdot w) \right]$, we see that by Fact 1,
$$E \left[\chi_S(x) \chi_T(y) \chi_U(x\cdot y \cdot w) \right] = E[\chi_{S \Delta U}(x) \chi_{T \Delta U}(y)\chi_U(w)] = E[\chi_{S \Delta U}(x)] E[ \chi_{T \Delta U}(y)] E[ \chi_U(w)],$$
where the last equality follows because the three factors are now independent. Unless $S = T = U$, the product of these expectations is zero: if $S \Delta U \neq \emptyset$ (or $T \Delta U \neq \emptyset$), the corresponding character is $+1$ half the time and $-1$ half the time, so its expectation is zero (see Lecture 3). When $S = T = U$, $E[\chi_{S \Delta U}(x)]$ and $E[\chi_{T \Delta U}(y)]$ both equal 1, and $E[\chi_U(w)] = \prod_{i\in U} E[w_i] = (1 -2\delta)^{|U|} = (1 -2\delta)^{|S|}$.
Finally, we get that
$$ E[f(x)f(y)f(z)] = \sum_S \sum_T \sum_U \hat{f}(S) \hat{f}(T) \hat{f}(U) E \left[\chi_S(x) \chi_T(y) \chi_U(x\cdot y \cdot w) \right] = \sum_S\hat{f}(S)^3 (1 -2\delta)^{|S|}.$$
\end{proof}
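The closed form can be checked against a brute-force computation of the acceptance probability. The sketch below (Python, with my own helper names, for small $n$) enumerates all $(x, y, w)$, weights each $w$ by its $\delta$-biased probability, and compares the exact acceptance probability with $\frac{1}{2}+\frac{1}{2}\sum_S (1-2\delta)^{|S|} \hat{f}(S)^3$.

```python
from itertools import product, combinations

def chi(S, x):
    p = 1
    for i in S:
        p *= x[i]
    return p

def fhat(f, S, n):
    return sum(f(x) * chi(S, x) for x in product((-1, 1), repeat=n)) / 2**n

def accept_prob_exact(f, n, delta):
    """Pr[f(x) f(y) f(z) = 1] by exhaustive enumeration, z = x * y * w."""
    total = 0.0
    for x in product((-1, 1), repeat=n):
        for y in product((-1, 1), repeat=n):
            for w in product((-1, 1), repeat=n):
                pw = 1.0
                for wi in w:                      # delta-biased weight of w
                    pw *= delta if wi == -1 else 1 - delta
                z = tuple(a * b * c for a, b, c in zip(x, y, w))
                if f(x) * f(y) * f(z) == 1:
                    total += pw / 4**n            # x, y uniform over 2^n each
    return total

def accept_prob_formula(f, n, delta):
    subsets = [S for k in range(n + 1) for S in combinations(range(n), k)]
    return 0.5 + 0.5 * sum(
        (1 - 2 * delta) ** len(S) * fhat(f, S, n) ** 3 for S in subsets)

n, delta = 3, 0.25
maj = lambda x: 1 if sum(x) > 0 else -1           # majority on 3 bits
```

The two quantities agree for every Boolean $f$; for a dictator the value is $1 - \delta$, as computed above.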
\subsection{Building a dictator tester}
Our goal is to show that there exists a testing algorithm $A$ with some (as yet unestablished) query complexity such that if $f$ is a dictator, $Pr[A {\rm \ passes\ }f] \geq 1- \beta$, and if $f$ is $\epsilon$-far from a dictator, $Pr[A {\rm\ rejects\ }f] \geq 1 - \beta$. We assume without loss of generality that $\epsilon < .01$.
We start from the following test: run H{\aa}stad's test $O(\frac{1}{\epsilon^2}\log\frac{1}{\beta})$ times with $\delta = 3\epsilon/4$, and accept $f$ if it passes at least a $1-.85\epsilon$ fraction of the time.
If $f$ is a dictator, then $f= \chi_S$ for some $S$ with $|S| = 1$, so the probability that $f$ passes a single run of H{\aa}stad's test is $\frac{1}{2}+\frac{1}{2}(1 - 2\delta) = 1 - \delta = 1 - 3\epsilon/4$. If $f$ is $\epsilon$-far both from every dictator and from the constant function $f(x) = 1$, we will show that it passes with probability $\leq 1 - \epsilon$. This leaves an $\epsilon/4$ gap between the two cases, which standard Chernoff bounds exploit (with the acceptance threshold $1-.85\epsilon$ lying inside the gap) to give the $1-\beta$ confidence we desire.
\begin{lemma} If $f$ is $\epsilon$-far from all dictator functions and from the constant function $f(x) = 1$, then the probability that it passes a single run of H{\aa}stad's test is $\leq 1 - \epsilon$.
\end{lemma}
\begin{proof}
Suppose $f$ is $\epsilon$-far from all dictator functions and from the constant function $f(x) = 1$, but $Pr[$H{\aa}stad's test passes $f] \geq 1 - \epsilon$. By the acceptance probability computed above, $\frac{1}{2}+\frac{1}{2}\sum_S (1-2\delta)^{|S|} \hat{f}(S)^3 \geq 1 - \epsilon$, so $\sum_S (1-2\delta)^{|S|} \hat{f}(S)^3 \geq 1 - 2\epsilon$. Pulling the largest factor $(1-2\delta)^{|S|}\hat{f}(S)$ out of the sum,
$$1 - 2\epsilon \leq \sum_S (1-2\delta)^{|S|} \hat{f}(S)^3 \leq \left( \sum_S \hat{f}(S)^2 \right) \cdot \left( \max_S (1-2\delta)^{|S|} \hat{f}(S)\right).$$
By Parseval's Theorem, $\sum_S \hat{f}(S)^2 = 1$, so $1 - 2\epsilon \leq \max_S (1-2\delta)^{|S|} \hat{f}(S)$. Since $(1-2\delta)^{|S|} \leq 1$, there exists $S$ such that $\hat{f}(S) \geq 1-2\epsilon$, and by Fact 3, ${\rm dist}(f, \chi_S) \leq \frac{1 - (1 - 2\epsilon)}{2} = \epsilon$. What remains is to show that $|S| \leq 1$, so that $\chi_S$ is either a dictator or the all-ones function.
Note that by our choice of $\delta = 3\epsilon/4$, we have $(1-2\delta)^{|S|} = (1 - 3\epsilon/2)^{|S|}$, which is $< 1 - 2\epsilon$ whenever $|S| > 1$ (using $\epsilon < .01$). Since $\hat{f}(S) \leq 1$, the maximizing $S$ must therefore have $|S| \leq 1$, so $\chi_S$ is either a dictator or the all-ones function. Hence $f$ is $\epsilon$-close to one of the functions it was assumed to be far from, a contradiction.
\end{proof}
How can we augment our test to ensure that functions $\epsilon$-close to $f(x) = 1$ do not pass? Note that every dictator function takes the value $1$ on exactly half of its inputs, and $-1$ on the other half, while functions close to the constant $1$ output $1$ almost always. One way is to first run $f$ on $O(\log \frac{1}{\beta})$ random inputs and reject if more than a $3/4$ fraction of the results are 1.
By the Chernoff bound, this enables us to distinguish functions that are $1/6$-close to $f(x)=1$
from those that are $1/6$-close to a dictator function.
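Putting the pieces together, the decision logic of the combined tester can be sketched as follows (Python; the parameter choices follow the text, the helper names are mine). To keep the illustration deterministic, the bias check is computed exactly over all inputs here, where the real tester would estimate it from $O(\log \frac{1}{\beta})$ random samples.

```python
from itertools import product

def frac_ones(f, n):
    """Exact fraction of inputs on which f outputs 1.  A dictator gives
    exactly 1/2; the constant-1 function gives 1.  The real tester
    estimates this from O(log(1/beta)) random samples instead."""
    inputs = list(product((-1, 1), repeat=n))
    return sum(1 for x in inputs if f(x) == 1) / len(inputs)

def single_trial_pass_prob_dictator(eps):
    """Pr[one Hastad trial accepts a dictator] with delta = 3*eps/4."""
    delta = 3 * eps / 4
    return 1 - delta

def dictator_test_threshold(eps):
    """Acceptance threshold on the empirical pass fraction."""
    return 1 - 0.85 * eps

eps = 0.01
# The gap the Chernoff argument exploits: a dictator passes a single trial
# with probability 1 - 3*eps/4, any far function with probability <= 1 - eps,
# and the threshold 1 - 0.85*eps sits strictly between the two.
gap_low, gap_high = 1 - eps, single_trial_pass_prob_dictator(eps)
```

The strict ordering $1 - \epsilon < 1 - .85\epsilon < 1 - 3\epsilon/4$ is exactly what lets $O(\frac{1}{\epsilon^2}\log\frac{1}{\beta})$ repetitions separate the two cases with confidence $1 - \beta$.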
%
% This will almost always catch functions $\epsilon$-close to $f(x) = 1$, and the probability that a dictator function is incorrectly rejected is exponentially small in the number of queries.
\section{Voting}
Next, we consider at a high level how some of the techniques we have seen so far apply to fair voting schemes. First we present the \textbf{Condorcet method} for ranking three candidates. Given an election with $n$ voters and three candidates, we would like a fair global ordering.
We are given three strings $S_1, S_2, S_3 \in \pmone^n$. The $i$th bit of $S_1$ gives voter $i$'s preference of candidate $A$ compared to candidate $B$ (e.g. $S_1[i] = 1$ means voter $i$ prefers candidate $A$ to candidate $B$). Similarly, the $i$th bit of $S_2$ gives $B$ compared to $C$ and the $i$th bit of $S_3$ gives $C$ compared to $A$.
Our goal is to produce an aggregate function $f: \pmone^n \rightarrow \pmone$ so that $(f(S_1), f(S_2), f(S_3))$ gives a global preference. We require that this preference be \emph{rational}, by which we mean acyclic; a preference triple is acyclic exactly when its three entries are not all equal (NAE). An aggregate function $f$ is considered rational if $(f(S_1), f(S_2), f(S_3))$ is NAE for all $S_1, S_2, S_3$ in which each voter is individually rational, i.e., each triple $(S_1[i], S_2[i], S_3[i])$ is NAE.
\begin{theorem}[Arrow's Impossibility Theorem] The only rational monotone aggregate functions are dictators.\footnote{The only non-monotone rational aggregate functions are \textsl{anti-dictators}.}
\end{theorem}
We can test whether a function $f$ is a rational aggregate function by choosing $x,y,z \in \pmone^n$ with each coordinate triple $(x_i, y_i, z_i)$ drawn uniformly and independently from the six NAE triples, accepting if $(f(x), f(y), f(z))$ is NAE, and rejecting otherwise. The probability that a function $f$ passes this test is
$$\frac{3}{4} - \frac{3}{4}\sum_{S} \left(\frac{-1}{3}\right)^{|S|} \hat{f}(S)^2.$$
Dictators pass this test with probability 1, while constant functions always fail. Parity functions on two bits succeed with probability $\frac{2}{3}$. We can thus test if $f$ is a dictator by testing if it is linear (see last lecture) and if it is a rational aggregate.\\
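These three claims can be verified by brute force. The sketch below (Python, helper names mine) enumerates the $6^n$ ways of drawing each coordinate triple $(x_i, y_i, z_i)$ from the six NAE triples and computes the exact pass probability of the test for small examples.

```python
from itertools import product

# The six triples in {-1,1}^3 whose entries are not all equal.
NAE_TRIPLES = [t for t in product((-1, 1), repeat=3) if len(set(t)) > 1]

def nae(a, b, c):
    return not (a == b == c)

def pass_prob_exact(f, n):
    """Exact Pr[(f(x), f(y), f(z)) is NAE] when each coordinate triple
    (x_i, y_i, z_i) is drawn uniformly from the six NAE triples."""
    count = 0
    for cols in product(NAE_TRIPLES, repeat=n):
        x = tuple(c[0] for c in cols)
        y = tuple(c[1] for c in cols)
        z = tuple(c[2] for c in cols)
        if nae(f(x), f(y), f(z)):
            count += 1
    return count / 6**n

dictator = lambda x: x[0]
parity2 = lambda x: x[0] * x[1]   # parity on two bits
constant = lambda x: 1
```

As claimed, a dictator passes with probability 1, the parity of two bits with probability $\frac{2}{3}$, and a constant function with probability 0.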
One thing to take away from this lecture is that in terms of what we have seen, the creativity lies more in constructing the test than doing the analysis, which tends to be boilerplate. Next lecture we go beyond testing and look at computational learning.
\end{document}