"General Bounds on the Mutual Information between a parameter
and n conditionally independent observations"
David Haussler and Manfred Opper
Proceedings of the 8th Conference on Computational Learning Theory
(COLT), Santa Cruz, July 1995. Published by ACM Press.
Abstract:
Each parameter theta in an abstract parameter space Theta is
associated with a different probability distribution on a set Y.
A parameter theta is chosen at random from Theta according to some
a priori distribution on Theta, and n conditionally
independent random variables Y^n = Y_1, ..., Y_n are observed with
common distribution determined by theta. We obtain bounds on the mutual
information between the random variable Theta, giving the choice of
parameter, and the random variable Y^n, giving the sequence of
observations. We also bound the supremum of the mutual information
over choices of the prior distribution on Theta. These quantities
have applications in density estimation, computational learning theory,
universal coding, hypothesis testing, and portfolio selection theory.
The bounds are given in terms of the metric and information dimensions
of the parameter space Theta with respect to the Hellinger distance.
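
Illustration (not from the paper): to make the quantities in the abstract concrete, the sketch below computes the mutual information between Theta and Y^n exactly for a toy case, together with the Hellinger distance between two members of the family. Everything specific here is an assumption chosen for illustration: a finite parameter space, Bernoulli observations, the function names mutual_information and hellinger, and the Hellinger normalization that puts the distance in [0, 1] (other conventions omit the factor 1/2). It does not implement the paper's bounds.

```python
import math

def mutual_information(thetas, prior, n):
    """Exact I(Theta; Y^n) in nats for a finite Bernoulli family.

    thetas: list of success probabilities (the toy parameter space).
    prior:  prior weights over thetas, summing to 1.
    n:      number of conditionally independent observations.
    """
    mi = 0.0
    # The likelihood of a binary sequence depends only on its number
    # of ones k, so sum over k and weight by the binomial count.
    for k in range(n + 1):
        count = math.comb(n, k)
        lik = [t ** k * (1 - t) ** (n - k) for t in thetas]
        marginal = sum(p * l for p, l in zip(prior, lik))
        for p, l in zip(prior, lik):
            if p > 0 and l > 0:
                mi += count * p * l * math.log(l / marginal)
    return mi

def hellinger(p, q):
    """Hellinger distance between Bernoulli(p) and Bernoulli(q).

    Assumed convention: sqrt(0.5 * sum (sqrt(a) - sqrt(b))^2),
    which ranges over [0, 1].
    """
    s = (math.sqrt(p) - math.sqrt(q)) ** 2
    s += (math.sqrt(1 - p) - math.sqrt(1 - q)) ** 2
    return math.sqrt(0.5 * s)

if __name__ == "__main__":
    thetas, prior = [0.2, 0.8], [0.5, 0.5]
    for n in (1, 5, 20):
        print(n, mutual_information(thetas, prior, n))
    print("Hellinger:", hellinger(0.2, 0.8))
```

In this toy case the mutual information grows with n but can never exceed the entropy of the prior (log 2 nats for two equally likely parameters). The supremum over priors mentioned in the abstract would correspond to maximizing the first function over the prior argument.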