On Computational Hardness of Agnostic Learning

Vitaly Feldman (Harvard)

The agnostic learning framework of Kearns, Schapire and Sellie is an extension of Valiant's PAC learning model, in which examples that are given to the learning algorithm are labelled by an unknown and unrestricted (and hence adversarial) function. An agnostic learning algorithm for a class of functions $C$ is required t produce a hypothesis whose error (with respect to the distribution of examples) is "close" to the optimal possible by a function from $C$. A large number of questions regarding the learnability of even the simplest function classes in this natural model remain open since the introduction of the model in 1992.

In this talk I will show that agnostic learning of parities with respect to the uniform distribution reduces to learning of parities with random classification noise. Learning with random noise is a substantially more tractable model and, in particular, using a noise-tolerant parity learning algorithm by Blum, Kalai and Wasserman we obtain the first non-trivial algorithm for agnostic learning of parities with respect to the uniform distribution. The reduction also shows that learning of juntas and DNF expressions with respect to the uniform distribution can be reduced to learning parities with random classification noise.

The second problem we'll discuss is the problem of weak agnostic learning of monomials by a proper learning algorithm (that is, an algorithm that produces a monomial as its hypothesis) formulated by Avrim Blum. We resolve this open problem by showing that it is NP-hard. In other words, we prove that finding a monomial that is consistent with $1/2+\epsilon$ fraction of examples is NP-hard even when there exists a monomial consistent with $1-\epsilon$ fraction of examples (for any constant $\epsilon$). The result can also be interpreted as an optimal hardness of approximation result for maximizing agreements with a monomial. It substantially improves on previous hardness of approximation results for this problem (due to Ben-David et al., and Bshouty and Burroughs).

The talk will be largely self-contained and both results will be also interpreted outside of the learning context.

Parts of this talk are based on joint work with Parikshit Gopalan, Subhash Khot and Ashok Ponnuswami.

Host: Ronitt Rubinfeld

(The Algorithms and Complexity Seminar series talks are usually held Mondays from 4-5:15pm in 32-G575.)