A lava attack on the recovery of sums of dense and sparse signals

Bibliographic Details
Main Authors:	Liao, Yuan; Chernozhukov, Victor V; Hansen, Christian B.
Other Authors:	Massachusetts Institute of Technology. Department of Economics
Format:	Article
Published:	Institute of Mathematical Statistics 2018
Online Access:	http://hdl.handle.net/1721.1/113848 https://orcid.org/0000-0002-3250-6714

Description:	Common high-dimensional methods for prediction rely on having either a sparse signal model, a model in which most parameters are zero and there are a small number of nonzero parameters that are large in magnitude, or a dense signal model, a model with no large parameters and very many small nonzero parameters. We consider a generalization of these two basic models, termed here a "sparse + dense" model, in which the signal is given by the sum of a sparse signal and a dense signal. Such a structure poses problems for traditional sparse estimators, such as the lasso, and for traditional dense estimation methods, such as ridge estimation. We propose a new penalization-based method, called lava, which is computationally efficient. With suitable choices of penalty parameters, the proposed method strictly dominates both lasso and ridge. We derive analytic expressions for the finite-sample risk function of the lava estimator in the Gaussian sequence model. We also provide a deviation bound for the prediction risk in the Gaussian regression model with fixed design. In both cases, we provide Stein's unbiased estimator for lava's prediction risk. A simulation example compares the performance of lava to lasso, ridge and elastic net in a regression example using data-dependent penalty parameters and illustrates lava's improved performance relative to these benchmarks.
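The "sparse + dense" idea in the description can be illustrated for a single observation z in a sequence model. This is a minimal sketch, not taken from this record: it assumes the penalty is an l1 term on the sparse part d and a squared l2 term on the dense part b, and the names `soft_threshold`, `lava_sequence`, `lam1` and `lam2` are hypothetical. It requires `lam2 > 0`.

```python
def soft_threshold(z, t):
    """Lasso-style shrinkage: move z toward zero by t, clipping at zero."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lava_sequence(z, lam1, lam2):
    """Minimize (z - b - d)**2 + lam2 * b**2 + lam1 * abs(d) over (b, d).

    Assumed penalty form (see lead-in); requires lam2 > 0.
    Returns (dense part b, sparse part d, fitted signal b + d).
    """
    # Profiling out b leaves k * (z - d)**2 + lam1 * abs(d) to minimize in d.
    k = lam2 / (1.0 + lam2)
    d = soft_threshold(z, lam1 / (2.0 * k))
    # Given d, the dense part is a ridge-type shrinkage of the remainder.
    b = (z - d) / (1.0 + lam2)
    return b, d, b + d


# A large observation is split into a sparse and a dense component;
# a small one is handled by ridge-type shrinkage alone.
print(lava_sequence(5.0, 2.0, 1.0))
print(lava_sequence(1.0, 2.0, 1.0))
```

Under these assumptions the fitted signal equals (1 - k) * z + k * d, a weighted combination of the raw observation and a soft-thresholded one, which is one way to see how such an estimator can interpolate between ridge-like and lasso-like behavior.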
Citation:	Chernozhukov, Victor et al. “A Lava Attack on the Recovery of Sums of Dense and Sparse Signals.” The Annals of Statistics 45, 1 (February 2017): 39–76
DOI:	http://dx.doi.org/10.1214/16-AOS1434
ISSN:	0090-5364
License:	Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/
Source:	arXiv

Similar Items