"Scale-sensitive Dimensions, Uniform Convergence, and Learnability,"
J. ACM 44 (4) 615-631 (1997) (with N. Alon, S. Ben-David and N. Cesa-Bianchi)
Abstract:
Learnability in Valiant's PAC learning model has been shown to be strongly related to the
existence of uniform laws of large numbers. These laws define a distribution-free convergence
property of means to expectations uniformly over classes of random variables. Classes of real-valued
functions enjoying such a property are also known as uniform Glivenko-Cantelli classes. In this paper
we prove, through a generalization of Sauer's lemma that may be interesting in its own right, a new
characterization of uniform Glivenko-Cantelli classes. Our characterization yields Dudley, Gin\'e, and
Zinn's previous characterization as a corollary. Furthermore, it is the first based on a simple
combinatorial quantity generalizing the Vapnik-Chervonenkis dimension. We apply this result to
obtain the weakest combinatorial condition known to imply PAC learnability in the statistical
regression (or ``agnostic'') framework. Furthermore, we show a characterization of learnability in the
probabilistic concept model, solving an open problem posed by Kearns and Schapire. These results
show that the accuracy parameter plays a crucial role in determining the effective complexity of the
learner's hypothesis class.