Dictionary Learning and Tensor Decomposition via the Sum-of-Squares Method

We give a new approach to the dictionary learning (also known as “sparse coding”) problem of recovering an unknown n × m matrix A (for m ≥ n) from examples of the form [y = Ax + e], where x is a random vector in R[superscript m] with at most τ m nonzero coordinates, and e is a random noise vector in...

Full description

Bibliographic Details
Main Authors:	Barak, Boaz, Steurer, David, Kelner, Jonathan Adam
Other Authors:	Massachusetts Institute of Technology. Department of Mathematics
Format:	Article
Language:	en_US
Published:	Association for Computing Machinery 2016
Online Access:	http://hdl.handle.net/1721.1/105133 https://orcid.org/0000-0002-4257-4198

_version_	1826199727406841856
author	Barak, Boaz Steurer, David Kelner, Jonathan Adam
author2	Massachusetts Institute of Technology. Department of Mathematics
author_facet	Massachusetts Institute of Technology. Department of Mathematics Barak, Boaz Steurer, David Kelner, Jonathan Adam
author_sort	Barak, Boaz
collection	MIT
description	We give a new approach to the dictionary learning (also known as “sparse coding”) problem of recovering an unknown n × m matrix A (for m ≥ n) from examples of the form [y = Ax + e], where x is a random vector in R[superscript m] with at most τ m nonzero coordinates, and e is a random noise vector in R[superscript n] with bounded magnitude. For the case m = O(n), our algorithm recovers every column of A within arbitrarily good constant accuracy in time m[superscript O(log m/log(τ[superscript −1]))], in particular achieving polynomial time if τ = m[superscript −δ] for any δ > 0, and time m[superscript O(log m)] if τ is (a sufficiently small) constant. Prior algorithms with comparable assumptions on the distribution required the vector x to be much sparser—at most √n nonzero coordinates—and there were intrinsic barriers preventing these algorithms from applying for denser x. We achieve this by designing an algorithm for noisy tensor decomposition that can recover, under quite general conditions, an approximate rank-one decomposition of a tensor T, given access to a tensor T[supserscript ′] that is τ-close to T in the spectral norm (when considered as a matrix). To our knowledge, this is the first algorithm for tensor decomposition that works in the constant spectral-norm noise regime, where there is no guarantee that the local optima of T and T[superscript ′] have similar structures. Our algorithm is based on a novel approach to using and analyzing the Sum of Squares semidefinite programming hierarchy (Parrilo 2000, Lasserre 2001), and it can be viewed as an indication of the utility of this very general and powerful tool for unsupervised learning problems.
first_indexed	2024-09-23T11:24:48Z
format	Article
id	mit-1721.1/105133
institution	Massachusetts Institute of Technology
language	en_US
last_indexed	2024-09-23T11:24:48Z
publishDate	2016
publisher	Association for Computing Machinery
record_format	dspace
spelling	mit-1721.1/1051332022-09-27T19:24:44Z Dictionary Learning and Tensor Decomposition via the Sum-of-Squares Method Barak, Boaz Steurer, David Kelner, Jonathan Adam Massachusetts Institute of Technology. Department of Mathematics Kelner, Jonathan Adam We give a new approach to the dictionary learning (also known as “sparse coding”) problem of recovering an unknown n × m matrix A (for m ≥ n) from examples of the form [y = Ax + e], where x is a random vector in R[superscript m] with at most τ m nonzero coordinates, and e is a random noise vector in R[superscript n] with bounded magnitude. For the case m = O(n), our algorithm recovers every column of A within arbitrarily good constant accuracy in time m[superscript O(log m/log(τ[superscript −1]))], in particular achieving polynomial time if τ = m[superscript −δ] for any δ > 0, and time m[superscript O(log m)] if τ is (a sufficiently small) constant. Prior algorithms with comparable assumptions on the distribution required the vector x to be much sparser—at most √n nonzero coordinates—and there were intrinsic barriers preventing these algorithms from applying for denser x. We achieve this by designing an algorithm for noisy tensor decomposition that can recover, under quite general conditions, an approximate rank-one decomposition of a tensor T, given access to a tensor T[supserscript ′] that is τ-close to T in the spectral norm (when considered as a matrix). To our knowledge, this is the first algorithm for tensor decomposition that works in the constant spectral-norm noise regime, where there is no guarantee that the local optima of T and T[superscript ′] have similar structures. Our algorithm is based on a novel approach to using and analyzing the Sum of Squares semidefinite programming hierarchy (Parrilo 2000, Lasserre 2001), and it can be viewed as an indication of the utility of this very general and powerful tool for unsupervised learning problems. National Science Foundation (U.S.) (grant 1111109) 2016-10-28T16:40:59Z 2016-10-28T16:40:59Z 2015-06 Article http://purl.org/eprint/type/JournalArticle 9781450335362 http://hdl.handle.net/1721.1/105133 Barak, Boaz, Jonathan A. Kelner, and David Steurer. “Dictionary Learning and Tensor Decomposition via the Sum-of-Squares Method.” ACM Press, 2015. 143–151. https://orcid.org/0000-0002-4257-4198 en_US http://dx.doi.org/10.1145/2746539.2746605 Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing - STOC '15 Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf Association for Computing Machinery arXiv
spellingShingle	Barak, Boaz Steurer, David Kelner, Jonathan Adam Dictionary Learning and Tensor Decomposition via the Sum-of-Squares Method
title	Dictionary Learning and Tensor Decomposition via the Sum-of-Squares Method
title_full	Dictionary Learning and Tensor Decomposition via the Sum-of-Squares Method
title_fullStr	Dictionary Learning and Tensor Decomposition via the Sum-of-Squares Method
title_full_unstemmed	Dictionary Learning and Tensor Decomposition via the Sum-of-Squares Method
title_short	Dictionary Learning and Tensor Decomposition via the Sum-of-Squares Method
title_sort	dictionary learning and tensor decomposition via the sum of squares method
url	http://hdl.handle.net/1721.1/105133 https://orcid.org/0000-0002-4257-4198
work_keys_str_mv	AT barakboaz dictionarylearningandtensordecompositionviathesumofsquaresmethod AT steurerdavid dictionarylearningandtensordecompositionviathesumofsquaresmethod AT kelnerjonathanadam dictionarylearningandtensordecompositionviathesumofsquaresmethod

Dictionary Learning and Tensor Decomposition via the Sum-of-Squares Method

Similar Items