6.S078 Lecture 9: k-SUM Algorithms
===========================

Announcements:
- Open Problem Session again tonight! More progress on Parity-SAT... and the others too?
- Lecture notes are still coming! If you have any questions about content from lectures, please post on Piazza.

***

k-SUM problem.
Given: n integers (positive and negative).
Decide: Are there k (distinct) numbers which sum to zero?

We'll generally assume numbers fit into a word, so additions, comparisons, subtractions, etc. take O(1) time.

Fact: Let k >= 2. k-SUM on n numbers reduces to O(n) instances of (k-1)-SUM on k-1 parts with n numbers.

Proof: Randomly partition the numbers into two parts: independently, each number goes in part 1 with prob 1/k, and in part 2 with prob 1-1/k.
Suppose there's a k-SUM solution a_1,...,a_k. Then
  Pr[a_1 in part 1, and a_2,...,a_k in part 2] >= (1/k)*(1-1/k)^{k-1} >= 1/(ek).
(Repeat the random partitioning O(k) times to make the success probability a constant.)
For each number x in part 1:
  Add x to all numbers in part 2, and call (k-1)-SUM on part 2.
  If any call returns "yes" then return "yes".
QED

Theorem: 2-SUM is in O(n) randomized time.

Proof: Given a list L of n numbers, make a new list L' = {-a_i | a_i in L}. We want to determine whether L ∩ L' is empty or not.
For O(n log n) time: sort the numbers in O(n log n) time, then for each a_i in the list, binary search for -a_i in O(log n) time.

This can be improved using hash functions and word tricks. In particular, if:
- each number can be stored in a word,
- we can populate a hash table of O(n^2) size with O(n) elements in O(n) time,
- we can randomly access any entry of a table in O(1) time,
then we can get O(n) randomized time.

More details: Suppose the numbers have m-bit representations, so we can think of each x as a Boolean vector of length m. Let -x be the integer -1*x, also written as a vector of length m.
Pick a random (c + 2*log n)-by-m Boolean matrix M, for a large constant c.
Define h_+(x) = M*x and h_{-}(x) = M*(-x), with arithmetic over GF(2).

Fact: For x != y, Pr_M[Mx = My] = 2^{-(c + 2*log n)} = 1/(2^c * n^2).

Suppose there's a 2-SUM solution a_1, a_2 among the n numbers. Then h_+(a_1) = h_{-}(a_2).
If there's no solution, then by a union bound over all n^2 pairs,
  Pr[exists a_1, a_2 s.t. h_+(a_1) = h_{-}(a_2)] <= n^2/2^{c + 2*log n} = 1/2^c.

Make lists L' and L'', where L' = {h_+(a_i) | a_i in L} and L'' = {h_{-}(a_i) | a_i in L}.
Build a hash table with 2^{c + 2*log n} entries, whose J-th entry contains i in [n] <=> h_{-}(a_i) = J.
Assume we can access any entry in O(1) time, and output its contents in O(ℓ) time, where ℓ is the number of words storing the contents.
Then for each h_+(a_i) in L', we can look up the list of j's in the h_+(a_i)-th entry of the table, and check whether any a_j among them forms a real 2-SUM solution with a_i.
QED

Cor: 3-SUM is in O(n^2) time.

Proof: Combine the Fact with the 2-SUM algorithm.

But in fact you don't need hash tables for O(n^2) time...
Start by sorting in O(n log n) time. For each number a_i in the list, we keep two pointers into the sorted list: p1 at the beginning, and p2 at the end. Let b_i be the number at p1, and c_i the number at p2.
Repeat until the pointers pass each other:
  If a_i = b_i, move p1 to the right (we want distinct numbers).
  If a_i = c_i, move p2 to the left.
  If a_i + b_i + c_i = 0, then return the triple.
  If a_i + b_i + c_i > 0, then move p2 to the left (to get a smaller sum, we have to decrease c_i).
  If a_i + b_i + c_i < 0, then move p1 to the right (to get a larger sum, we have to increase b_i).
Return "no solution".
For each a_i, this procedure takes O(n) time to search for the other two numbers, so we get O(n^2) time overall.
QED

Fact: Let k >= 2. k-SUM on n numbers reduces to 2-SUM on 2 parts with n^{ceil(k/2)} numbers.

Proof: WLOG assume the instance of k-SUM has k parts, and we want to pick exactly one number from each part.
Enumerate all n^{floor(k/2)} choices of floor(k/2) numbers from the first floor(k/2) parts, and form the list
  L = {sum_i a_i | a_i is in part i, for all i = 1,...,floor(k/2)}.
Similarly, for all n^{ceil(k/2)} choices from the last ceil(k/2) parts, form the list
  L' = {sum_i a_i | a_i is in part floor(k/2)+i, for all i = 1,...,ceil(k/2)}.
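For concreteness, here is a sketch of this list-building step in Python (the function and variable names are mine; each part is given as a list of ints):

```python
from itertools import product

def build_sum_lists(parts):
    """Meet-in-the-middle for k-SUM: form all partial sums from the
    first floor(k/2) parts, and from the remaining ceil(k/2) parts."""
    half = len(parts) // 2  # floor(k/2)
    # L: sums of one number chosen from each of the first floor(k/2) parts
    L = [sum(choice) for choice in product(*parts[:half])]
    # L': sums of one number chosen from each of the last ceil(k/2) parts
    Lprime = [sum(choice) for choice in product(*parts[half:])]
    return L, Lprime
```

Building the lists costs O(n^{ceil(k/2)}) time and space.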
Now we wish to find a number in L and a number in L' which sum to 0.
QED

Cor: 4-SUM is in O(n^2) time, and k-SUM is in n^{ceil(k/2)} time.

k-SUM Conjecture: For every k >= 2 and eps > 0, k-SUM cannot be solved in n^{ceil(k/2)-eps} time.

Note this implies that for **odd** k, k-SUM and (k+1)-SUM have essentially the same time complexity. However, we don't really know improvements for these problems beyond small log factors...

***

Algorithm for 3SUM. [BDP'05]

Below is a different algorithm, using [LVWW'***??]. There are roughly three moving parts:
1. Self-reduction for 3SUM -> reduce to small instances. (Similar in spirit to OV.)
2. Randomized reduction for 3SUM -> reduce the *domain* to be small, once the instance is small.
3. Fast look-up table for small instances with small domain.

1. Self-reduction: [LVWW'***??]

Deterministic O~(n log n + n^2/s^2)-time reduction from 3-SUM on n numbers to O(n^2/s^2) instances of 3-SUM on O(s) numbers.
Recall this was very easy for OV... not so straightforward for 3-SUM! The self-reduction works in the Real RAM as well!

Proof Idea: Sort the numbers, and partition the sorted order into O(n/s) "buckets" of O(s) numbers each. Argue that there are at most O(n^2/s^2) triples of buckets that could possibly contain a 3-SUM solution, and that these triples can be computed in O~(1) time each.

I'll use [n] = {-n,-n+1,...,0,1,...,n} (a little non-standard).

2. Randomized reduction:

Theorem: For every c >= 1, there is a d >= 3 such that for any integer m, there is a family of hash functions H = {h : [2^m] -> [s^d * loglog(m)]}, where each h(x) is computable in O~(m) time, and for *every* set S of s numbers in [2^m]:
  If S has a 3SUM, then Pr_{h in H}[h(S) has a 3SUM among 3 targets] = 1.
  If S doesn't have a 3SUM, then Pr_{h in H}[h(S) has a 3SUM among 3 targets] <= 1/s^c.
(Here the 3 targets are a function of h.)

Proof: Hash every number in [2^m] modulo a random prime p in [2^t], for a parameter t set below. Note there are >= Omega(2^t/t) primes in this interval, by the prime number theorem.
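As a quick illustration, this mod-p hashing step might look as follows (a toy sketch: trial-division primality testing, and all names are mine):

```python
import random

def random_prime(t):
    """Sample a random prime p < 2^t (trial division; fine for small t)."""
    while True:
        p = random.randrange(2, 2 ** t)
        if all(p % q != 0 for q in range(2, int(p ** 0.5) + 1)):
            return p

def hash_mod_p(S, t):
    """Map each number to its residue mod a random prime p in [2^t].
    A true 3-SUM survives: a+b+c = 0 implies (a+b+c) mod p = 0."""
    p = random_prime(t)
    return [x % p for x in S], p
```

Note that in Python, x % p already lands in {0,1,...,p-1} even for negative x, which matches the cast-back-to-nonnegative-integers step described below.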
For every triple (a,b,c) of numbers:
- If a+b+c = 0, then a+b+c = 0 mod p.
- If a+b+c != 0, then |a+b+c| <= 3*2^m has at most O(m) prime factors, so Pr[a+b+c = 0 mod p] <= O(mt/2^t).

Now pick any set S of s numbers and hash it to h(S). There are at most s^3 triples in h(S) to consider, so
  Pr[exists a,b,c in S with a+b+c != 0 but a+b+c = 0 mod p] <= O(m*t*s^3/2^t).

Finally, to reduce the hashed set h(S) back to integers, we cast the s numbers mod p back to integers in {0,1,...,p-1}, and make three calls to 3-SUM on s integers: we look for 3 numbers summing to 0 in one call, to p in another call, and to 2p in the third. (The total sum of any triple is less than 3p.)

Set t = c*log(s) + log(m) for large c >= 1; then the error is <= log(m)/poly(s).

Don't like the dependence on m? Hash again! Now our domain is [s^c * m] instead of [2^m]. When we hash mod a random prime in [2^t] again,
  Pr[exists a,b,c with a+b+c != 0 but a+b+c = 0 mod p] <= O(log(s^c * m)*t*s^3/2^t).
Set t = c*log(s) + loglog(m); then the error is <= loglog(m)/poly(s).
We can keep repeating this hashing to drive down the dependence on m, as desired... QED

Idea: We can do this hashing repeatedly, until loglog...log(m) <= s. Then the domain's dependence is only on s.

Note: If we had *real-valued* inputs (and worked on the Real RAM), the above hashing tricks would not work at all...

Note: This reduction also shows that WLOG we can assume 3-SUM on n numbers is over the domain [poly(n)].

3. Fast Look-up Table.

Fact: There is a data structure of size s^{O(s)} that can answer any 3-SUM instance on s numbers in domain [s^c].

Proof: There are at most (2*s^c + 1)^s = s^{O(s)} such instances. Write them all down one by one, and compute their answers. Store all the answers in a look-up table of s^{O(s)} bits. QED

Assume a lookup in a table of size T takes time L(T). Usually, L(T) <= O(log T), or L(T) = O(1).

Finally...

3-SUM Algorithm: Let s = a parameter.
0. Construct the look-up table for 3-SUM on s numbers, as in the Fact.
1. Apply the randomized reduction to 3-SUM on all n numbers, mapping them to the domain [s^d]. (For *any* subset of O(s) numbers, the hashed instance errs on 3SUM with probability <= 1/s^3.)
2. Run the self-reduction: for each of the O(n^2/s^2) calls to 3-SUM on O(s) numbers, restrict the O(s) numbers to the domain [s^d], and consult the look-up table.
3. For each call made by the self-reduction, the look-up table gives the correct answer with probability >= 1 - 1/s^3. So we expect a <= 1/s^3 fraction of the answers to our O(n^2/s^2) calls to be incorrect.
4. If more than 100*n^2/s^5 calls say "yes", then return "yes". (On a "no" instance, we expect <= n^2/s^5 calls to say "yes".)
5. Otherwise, search the false positives: for each of the O(n^2/s^5) "yes" calls, search the relevant set of O(s) numbers directly for a 3-SUM solution, in O(n^2/s^5 * s^2) <= O(n^2/s^3) total time. (Note: this is negligible in comparison!)

Total running time:
  O(n^2/s^2) calls * L(s^{O(s)}) time per lookup
  + O(n^2/s^3) false-positive search
  + s^{O(s)} time to set up the look-up table.

Set s = eps*(log n)/(log log n), so that s^{O(s)} <= n^{O(eps)}. Then the running time is:
  O(n^2 * L(n^{O(eps)}) * (log log n)^2/(log n)^2).
For L(T) <= O(log T), we have L(n^{O(eps)}) <= O(log n), so we save a log-factor. If L(T) <= O(1), we save a log^2-factor.
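For reference, here is a sketch of the quadratic baseline that the above machinery shaves log factors off: the sorted two-pointer 3-SUM routine from earlier in these notes (function and variable names are mine):

```python
def three_sum(nums):
    """O(n^2) two-pointer 3-SUM: return a triple of distinct numbers
    summing to 0, or None if no such triple exists."""
    a = sorted(nums)
    n = len(a)
    for i in range(n):
        p1, p2 = 0, n - 1
        while p1 < p2:
            if a[p1] == a[i]:        # want distinct numbers
                p1 += 1
                continue
            if a[p2] == a[i]:
                p2 -= 1
                continue
            total = a[i] + a[p1] + a[p2]
            if total == 0:
                return (a[i], a[p1], a[p2])
            if total > 0:
                p2 -= 1              # to get a smaller sum, decrease c_i
            else:
                p1 += 1              # to get a larger sum, increase b_i
    return None
```

(Skipping values equal to a_i matches the "distinct numbers" convention in the two-pointer proof above.)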