CS294-152 Lower Bounds 10/29/18 === Announcements: *** Your project proposal is due TODAY! *** Fill out the "When2Meet" (which I emailed to the PhD students) to specify what times you are available for an extra lecture. (I'll announce to everyone when the lecture date/time has been set.) *** Remaining schedule: Nov 5 - lecture Nov 7-9? or 28-29? - lecture Nov 12 - no lecture (Veteran's Day) Nov 19 - two guest lectures (Amir Y and Kasper Green L.) Nov 26 - your project presentations! *** Questions about your project? Don't hesitate to email me! --- ACC0: A Nice Polynomial Representation The best general lower bound we know for ACC0 is much weaker than what is known for AC0[p]: Thm [MW18] There are functions in NTIME(n^{polylog n}) which are not in poly-size ACC0. This lower bound also exploits a polynomial representation for ACC0. However, this representation is not quite as "nice" as the probabilistic/approximate polynomial representation we saw for AC0[p]. Today we'll discuss this polynomial representation. Later, we will see how to use it for lower bounds. SYM = Symmetric Boolean function, a function that depends only on the number of 1s in the input. Formally, f : {0,1}^n -> {0,1} is symmetric if there is a g : {0,1,...,n} -> {0,1} such that for all x, f(x) = 1 <=> g(sum_i x_i) = 1. Thm [Beigel-Tarui 91 (building on Yao)] For every symmetric function f, there is a symmetric function f' s.t. f of s AC0[m] circuits of size s and depth d can be represented as a f' of s^{polylog(s)} ANDs, where the polylog factor depends on d and m. Moreover the transformation can be computed in (deterministic) s^{polylog(s)} time, given a starting "f of ACC" circuit as input. Sketch of Simpler Beigel-Tarui (still technical!): AC[6] size-s depth-d can be represented as a SYM of s^{polylog(s)} ANDs. (Note, this circuit class is already very hard to prove lower bounds for, even for d=3) With a probabilistic construction. (Can be derandomized) Some optimization due to Chen and Papakonstantinou (2016). 0. Note: can write OR as MOD2 of AND of MOD2 (DeMorgan's laws) 1. First, think of the AC0[6] circuit C as being ``layered'' (similar to what we did for AC0): Layers of (MOD3 of MOD2 of AND) repeated <= d times, by sticking in extra gates that don't really do anything. Size is still <= s^{c} for some c <= O(d). 2. Construct 10n circuits C_i as follows: 2.1 Replace each AND (over {0,1}) with a probabilistic F_3-polynomial of degree 2c*log(S). Has error <= 1/s^{2c}, so overall the probabilistic circuit has error <= s^c/s^{2c} <= 1/s^c. Can write each such AND (with fan-in S) as a probabilistic polynomial as a MOD3 of S^{2c log(S)} ANDs, each AND has fan-in <= 2c*log(S). 2.2 Next, write each MOD3 of AND of fan-in <= 2c*log(S) as a MOD3 of S^{2c} MOD2s of fan-in <= 2c*log(S). In particular, the AND on v variables can be written as a sum (mod 3) of <= 2^v MOD2's: AND(x_1,...,x_v) = sum_{S subset of [v]} c_S MOD2(sum_{i in S} x_i) mod 3, for some c_S in F_3. (Why can we do this? *Every* function on v variables can be written in the Fourier basis, i.e. as a real-valued sum of <= 2^v MOD2s on v variables. Moreover, each "Fourier coefficient" for these MOD2s has the form j/2^v, where j in [-2^v,...,2^v]. In F_3, we can divide by 2, so the answer is preserved over F3.) 2.3 Now each C_i is <= 2d layers of (MOD3 of MOD2). 3. Let D = MAJORITY(C_1,...,C_{10n}). Chernoff bound ==> Over each input x of length n, Pr[D(x) != C(x)] << 1/2^n. Union bound ==> Pr[exists x D(x) != C(x)] << 1. So D is correct on all 2^n inputs with good probability. 4. Now we want to transform the MAJORITY of (MOD3 of MOD2)^* into a SYMMETRIC function of ANDs. In fact, we will give a generic way to take: - any SYM of MOD3 of MOD2 and convert it into a SYM of MOD2 of "small" fan-in, and any SYM of MOD2 of MOD3 and convert it into a SYM of MOD3 of "small" fan-in. Concretely, suppose we have a SYM of t MOD3 of subcircuits C_{i,1},...,C_{i,u} which output 0/1, where each C_{i,j} is a MOD2 of some inputs. Note there are t*u subcircuits in total. Let the symmetric function be f, which has t inputs. Then total circuit is equivalent to: f(MOD3(sum_{j=1}^u C_{1,j}),...,MOD3(sum_{j=1}^u C_{t,j})) We want to convert this into something of the form: g(AND,...,AND) where each AND is over some subset of the inputs. The magical mathematical objects we will use to do this are called Modulus-Amplifying Polynomials. Let L in {1,2,...}. Def. Polynomial P_L(x) is L-mod-amplifying if for all m, x \in Z, where m >= 2, x = 0 mod m ==> P_L(x) = 0 mod m^L x = 1 mod m ==> P_L(x) = 1 mod m^L. P_L "amplifies the modulus from m to m^L." Key Property: for a prime p and for all a in Z, a^{p-1} = 0 if a = 0, and = 1 otherwise, so we have: MODp(a) = (1-a^{p-1}) mod p = P_L(1-a^{p-1}) mod p^L. (!) (Recall MODp(x) = 1 iff x is divisible by p.) Thm [B-T] For every L, there's an L-mod-amplifying P_L of degree <= 2L. Can be efficiently constructed. (Omitted) Set L:= 1+log_3(t). Consider: (*) sum_{i=1}^t P_L(1-(sum_{j=1}^u C_{i,j})^2) By the mod-amplifying property, (*) = sum_{i=1}^t P_L(1-(sum_{j=1}^u C_{i,j})^2) mod 3^L (since t < 3^L) = sum_{i=1}^t MOD3(sum_{j=1}^u C_{i,j}) mod 3^L (by (!)) Key Observation: Since t < 3^L, the sum (*) = t' mod 3^L, where t' is the number of MOD3 gates going into f which are true. That means we can determine the output of the original circuit f(MOD3(sum_{j=1}^u C_{1,j}),...,MOD3(sum_{j=1}^u C_{t,j})) from (*). That is, there is a function g : Z -> {0,1} such that f(MOD3(sum_{j=1}^u C_{1,j}),...,MOD3(sum_{j=1}^u C_{t,j})) = g(sum_{i=1}^t P_L(1-(sum_{j=1}^u C_{i,j})^2). Writing out the monomials of the polynomial P as a sum of ANDs of fan-in <= 2L, we get a SYM of ANDs of fan-in 2L <= O(log t) of the subcircuits C_{i,1},...,C_{i,u}. 5. Write the ANDs of fan-in O(log t) as a sum of <= poly(t) MOD2s of fan-in O(log t). Then we have a SYM of MOD2s of C_{i,1},..,C_{i,u} where each C_{i,j} is a MOD2. But MOD2 of MOD2 = MOD2, so this is just a SYM of MOD2s of fan-in O(log t). We have completely eliminated the MOD3 layer just below the SYM function! 6. We can do a similar thing for a SYM of MOD2 of MOD3s: -> Use the mod-amplifying polynomial with a power 2^L instead of 3^L. -> Get a SYM of ANDs of fan-in <= 2L of MOD3s. Next, we can write the AND of fan-in <= 2L as a sum modulo 2 of O(3^{2L}) MOD3s (writing AND in the "MOD3" basis, I'm omitting the proof here, but it is not hard to show!). However, we can't just merge the layers of MOD3; we don't have MOD3 of MOD3 = MOD3 easily. What we can say is: MOD3 of s MOD3 of t can be written as MOD3 of s*t^2 ANDs of fan-in 2. Idea: "square" the sum in each bottom MOD3, to force it to output a 0/1 value. (for all a in {0,1,2}, a^2 = 0 or 1 mod 3) Then sum up those squares modulo 3. Using AND of 2 |--> sum (mod 3) of O(1) MOD2s of 2, we can get: MOD3 of s*t^2 ANDs of fan-in 2 |--> MOD3 of O(s*t^2) MOD2s of 2. So now we are back to SYM of MOD3 of MOD2 again, with only polynomial blowup. Example: suppose we start with the circuit AND of s MOD2 of s MOD3 of s MOD2 of s -- Use probabilistic polynomial for AND: MAJORITY of O(n) (MOD3 of s^{O(log s)} MOD2s of log(s)) MOD2 of s MOD3 of s MOD2 of s = MAJORITY of O(n) MOD3 of s^{O(log s)} MOD2 of s*log(s) MOD3 of s MOD2 of s Use mod-amplifying polynomial on MAJORITY of MOD3: = SYM of s^{O(log s)^2} AND of O(log s)^2 MOD2 of s*log(s) MOD3 of s MOD2 of s -- Convert AND into MOD2s: SYM of s^{O(log s)^2} MOD2 of O(log s)^2 MOD2 of s*log(s) MOD3 of s MOD2 of s -- Merge MOD2s: SYM of s^{O(log s)^2} MOD2 of s*log(s)^3 MOD3 of s MOD2 of s -- Use mod-amplifying polynomial on SYM of MOD2: SYM of s^{O(log s)^3} AND of O(log s)^3 MOD3 of s MOD2 of s -- Write ANDs as sum of MOD3s: SYM of s^{O(log s)^3} MOD3s of O(log s)^3 MOD3 of s MOD2 of s -- "Merge MOD3s": Write MOD3 of s MOD3 of t as MOD3 O(s*t^2) of MOD2s of 2: SYM of s^{O(log s)^3} MOD3 of O(s^2(log s)^3) MOD2 of s -- Use mod-amplifying polynomial on SYM of MOD3: SYM of s^{O(log s)^4} MOD2s of O(log s)^4 -- Finally, we could convert each MOD2 to a sum of 2^{O(log s)^4} ANDs. Note that for a circuit of depth d, we get a SYM of s^{O(log s)^{O(d)} ANDs. ===== Algorithms for Analyzing ACC0 What can we do with this SYM of AND representation? There is only one thing we know, which has been proven to actually yield a lower bound: - The SYM of AND can be used to algorithmically *analyze* a given ACC0 circuit, surprisingly fast. And in turn, that circuit-analysis algorithm can be used to prove an ACC0 lower bound. Recall Circuit-SAT: Given a circuit C of n inputs and size s, is there an x such that C(x) = 1? Exhaustive search solves the problem in 2^n*poly(s) time. Can it be solved faster? Turns out the answer is *YES* for ACC circuits! Let AC^0_d[m] = AC circuits of depth d with AND/OR/NOT/MODm gates. Thm [W'11] For all d and m, there is a delta > 0 such that: the SAT problem for AC^0[d] circuits of size 2^{n^{delta}} can be solved in 2^{n-n^{delta}} time. To prove this theorem, we'll use the following classic divide-and-conquer result. Recall a multilinear polynomial has the form: p(x1,...,xn) = sum_{S subset of [n]} c_S (prod_{i in S} x_i), where c_S are coefficients. Lemma: Given a multilinear polynomial p in n variables with m monomials (specified as a list of the coefficients c_S), there is an poly(n)*(2^n + m) time algorithm for evaluating p on all points in {0,1}^n. Imagine m is huge; it could be close to 2^n. Then a naive algorithm for evaluating this 2^n-size representation on all 2^n points would take at least 2^n*2^n = 4^n time! We are getting a quadratic speedup here. Proof: We describe the algorithm. If n=1, simply output [p(0) p(1)]. Otherwise, since p is multilinear, we can write p(x_1,...,x_n) = p_0(x_1,...,x_{n-1}) + x_n*p_1(x_1,...,x_{n-1}). (Constructing p_0 and p_1 just requires sorting the coefficients.) Recursively obtain the 2^{n-1}-length table T_0 for p_0, and the 2^{n-1}-length table T_1 for p_1. Then the 2^n-length table for p_0 is just [T_0 (T_0+T_1)], which can be constructed in O(2^n) time from T_0 and T_1. The running time is R(n) <= 2*R(n-1) + poly(n)*min{2^n,m}, which implies R(n) <= poly(n)*(2^n + m). QED (of Lemma) Proof (of Thm) Let eps > 0 be sufficiently small (to be set later). Given an AC^0_d[m] circuit C of size 2^{n^{eps}}: Let k := n^{eps}. Define the (n-k)-input circuit C'(y) := OR_{a in {0,1}^k} C(a,x). Note: - C is SAT <=> C' is SAT. - C' has depth <= d+1. Now convert C' into a SYM of ANDs D (via Beigel-Tarui). This SYM of ANDs has size quasi-polynomial in 2^{n^{eps}}, which is 2^{n^{c*eps}} for some constant c >= 1. Now construct a multilinear polynomial p(x_1,...,x_{n-k}) where each monomial corresponds to an AND gate of the SYM of ANDs. (So p is taking the sum of the ANDs.) Since D is a SYM of ANDs, there is some function g : {0,1,...,n} -> {0,1} such that D(x) = g(p(x)) for all x. By the Lemma, we can evaluate p on all 2^{n-k}=2^{n-n^{eps}} inputs in poly(n)*(2^{n^{c*eps}}+2^{n-n^{eps}}) time. Set eps = 1/(2c), and the running time becomes poly(n)*(2^{n^{1/2}} + 2^{n-n^{eps}}) <= poly(n)*2^{n-n^{eps}}. Finally, compute g on all 2^{n-k} evaluations of p. This gives the value of D on all 2^{n-k} inputs, which tells us whether D is SAT or not. QED (of Thm) ==== Proving Lower Bounds from Circuit Analysis Algorithms. Recall NEXP = union_k NTIME[2^{n^k}] We showed earlier: Sigma_2 EXP not in P/poly, MAEXP not in P/poly. I'll start with the punchline: Thm [W'11]: For many reasonable circuit classes C (including ACC0) If C-SAT for n^k-size circuits is in 2^n/n^{omega(1)} time, then NEXP not in poly-size C. Thus, using our above ACC0 algorithm we get that NEXP is not in ACC0. Recently improved: [MW18] NTIME[n^{polylog n}] doesn't have AC0[6] circuits of n^{log n} size Note that NEXP not in P/poly already has some non-trivial derandomization consequences: [IKW'02] NEXP not in P/poly => can simulate MA in nondeterministic sub-exptime Instead of jumping into how "SAT algorithms imply lower bounds" is proved formally, I want to take a step back and talk about the big picture: I want to talk about how one might arrive at a statement like "Faster Circuit-SAT implies Circuit Lower Bounds" Major Theme: Use *circuit-analysis algorithms* to prove *lower bounds* against circuits What's a circuit-analysis algorithm? An algorithm solving a circuit-analysis problem. What's a circuit-analysis problem? There are at least two general kinds: (1) Given a description of a circuit, say something interesting about the function it computes Circuit-SAT: Given a circuit C, is C != the all-zeroes function? Min-Circuit: Given a circuit C, is there a smaller circuit equivalent to C? Gap-SAT: Given a circuit C, promised it's either UNSAT or has at least 2^{n-1} SAT assignments, decide which. Problem Complexity Class Best Known (Deterministic) Running Time ---------------------------------------------------------- Circuit-SAT NP 2^n*poly(s), where s = size, n = # inputs Min-Circuit Sigma_2 P 2^{O(s log s)}*2^n*poly(s) Gap-SAT Promise-RP 2^n*poly(s) Note: Any algorithm for Circuit-SAT would also solve Gap-SAT. (2) Given the truth table of a function, say something interesting about its circuit complexity DNF-Minimization: Given f : {0,1}^n -> {0,1} (2^n bits) and integer k >= 1, is there a DNF computing f of <= k clauses? NP-complete! [Masek 79] DNF of XORs also known to be NP-complete [HOS'18] MCSP: Given f : {0,1}^n -> {0,1} (2^n bits) and integer k >= 1, is there a circuit (over AND/OR/NOT) computing f of <= k gates? MCSP is in NP. (Why?) NP-hardness is a major open problem! [MW'15] If MCSP is NP-complete then EXP != ZPP. k-Compression: Given f : {0,1}^n -> {0,1} (2^n bits), you are promised f has circuit complexity <= k, output circuit for f of size << 2^n/n for f Note: An algorithm for k-Compression implies an algorithm for MCSP with parameter k. Best known algorithms for all of these is exhaustive search. It's a major open problem to solve any of them faster! It is known that one can *approximate* the minimum DNF efficiently: [Allender et al '05] If the min-DNF has k terms, can find a DNF with O(k*n) terms in 2^{O(n)} time. (Note, 2^{O(n)} is polynomial in the input length!) So that's circuit-analysis algorithms in a nutshell. More next time!