We consider PAC learning of probability distributions (a.k.a. density estimation), where we are given an i.i.d. sample generated from an unknown target distribution, and want to output a distribution that is close to the target in total variation distance. Let F be an arbitrary class of probability distributions, and let Fk denote the class of k-mixtures of elements of F. Assuming the existence of a method for learning F with sample complexity m(ε), we provide a method for learning Fk with sample complexity O((k.log k .m(ε))/(ε2)). Our mixture learning algorithm has the property that, if the F-learner is proper and agnostic, then the Fk-learner would be proper and agnostic as well. This general result enables us to improve the best known sample complexity upper bounds for a variety of important mixture classes. First, we show that the class of mixtures of k axis-aligned Gaussians in Rd is PAC-learnable in the agnostic setting with O((kd)/(ε4)) samples, which is tight in k and d up to logarithmic factors. Second, we show that the class of mixtures of k Gaussians in Rd is PAC-learnable in the agnostic setting with sample complexity Õ((kd2)/(ε4)), which improves the previous known bounds of Õ((k3.d2)/(ε4)) and Õ(k4.d4/ε2) in its dependence on k and d. Finally, we show that the class of mixtures of k log-concave distributions over Rd is PAC-learnable using Õ(k.d((d+5)/2)ε(-(d+9)/2)) samples.
Published Date: 2018-02-08
Registration: ISSN 2374-3468 (Online) ISSN 2159-5399 (Print)
Copyright: Published by AAAI Press, Palo Alto, California USA Copyright © 2018, Association for the Advancement of Artificial Intelligence All Rights Reserved.