Beyond the low-degree algorithm: mixtures of subcubes and their applications

© 2019 Association for Computing Machinery. We introduce the problem of learning mixtures of k subcubes over {0,1}^n, which contains many classic learning theory problems as a special case (and is itself a special case of others). We give a surprising n^{O(log k)}-time learning algorithm based on higher-order multilinear moments. It is not possible to learn the parameters because the same distribution can be represented by quite different models. Instead, we develop a framework for reasoning about how multilinear moments can pinpoint essential features of the mixture, like the number of components. We also give applications of our algorithm to learning decision trees with stochastic transitions (which also capture interesting scenarios where the transitions are deterministic but there are latent variables). Using our algorithm for learning mixtures of subcubes, we can approximate the Bayes optimal classifier within additive error ϵ on k-leaf decision trees with at most s stochastic transitions on any root-to-leaf path in n^{O(s + log k)} · poly(1/ϵ) time. In this stochastic setting, the classic n^{O(log k)} · poly(1/ϵ)-time algorithms of Rivest, Blum, and Ehrenfeucht-Haussler for learning decision trees with zero stochastic transitions break down because they are fundamentally Occam algorithms. The low-degree algorithm of Linial-Mansour-Nisan is able to get a constant-factor approximation to the optimal error (again within an additive ϵ) and runs in time n^{O(s + log(k/ϵ))}. The quasipolynomial dependence on 1/ϵ is inherent to the low-degree approach because the degree needs to grow as the target accuracy decreases, which is undesirable when ϵ is small. In contrast, as we will show, mixtures of k subcubes are uniquely determined by their moments of order 2 log k, and hence provide a useful abstraction for simultaneously achieving the polynomial dependence on 1/ϵ of the classic Occam algorithms for decision trees and the flexibility of the low-degree algorithm in being able to accommodate stochastic transitions. Using our multilinear moment techniques, we also give the first improved upper and lower bounds since the work of Feldman-O'Donnell-Servedio for the related but harder problem of learning mixtures of binary product distributions.
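
To make the abstract's model concrete (a sketch in standard notation; the symbols π_i and μ_{ij} are illustrative and not taken from the record), a mixture of k subcubes over {0,1}^n picks a component i with probability π_i and then sets each coordinate j independently, with each coordinate mean constrained to be fixed to 0, fixed to 1, or uniform:

\[
  \Pr_{\mathcal{D}}[x] \;=\; \sum_{i=1}^{k} \pi_i \prod_{j=1}^{n} \mu_{ij}^{x_j}\,(1 - \mu_{ij})^{1 - x_j},
  \qquad \mu_{ij} \in \{0, \tfrac{1}{2}, 1\}, \quad \sum_{i=1}^{k} \pi_i = 1.
\]

The higher-order multilinear moments the algorithm works with are then, for a subset S of coordinates,

\[
  \mathbb{E}_{x \sim \mathcal{D}}\Big[\prod_{j \in S} x_j\Big] \;=\; \sum_{i=1}^{k} \pi_i \prod_{j \in S} \mu_{ij},
\]

and the identifiability statement in the abstract says that two mixtures of k subcubes agreeing on all such moments with |S| ≤ 2 log k are identical as distributions.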

Bibliographic Details
Main Authors: Chen, Sitan; Moitra, Ankur
Format: Article (Conference Paper)
Language: English
Published: ACM, 2019-06-23 (deposited in DSpace@MIT, 2021)
DOI: 10.1145/3313276.3316375
License: Creative Commons Attribution-Noncommercial-Share Alike (http://creativecommons.org/licenses/by-nc-sa/4.0/)
Institution: Massachusetts Institute of Technology
Online Access: https://hdl.handle.net/1721.1/138050