Sign-constrained least squares estimation for high-dimensional regression

Many regularization schemes for high-dimensional regression have been put forward. Most require the choice of a tuning parameter, using model selection criteria or cross-validation schemes. We show that a simple non-negative or sign-constrained least squares is a very simple and effective regulariza...


Bibliographic Details
Main Author: Meinshausen, N
Format: Journal article
Language: English
Published: 2012
collection OXFORD
description Many regularization schemes for high-dimensional regression have been put forward. Most require the choice of a tuning parameter, using model selection criteria or cross-validation schemes. We show that non-negative or sign-constrained least squares is a simple and effective regularization technique for a certain class of high-dimensional regression problems. The sign constraint has to be derived via prior knowledge or an initial estimator, but no further tuning or cross-validation is necessary. The success depends on conditions that are easy to check in practice. A sufficient condition for our results is that most variables with the same sign constraint are positively correlated. For a sparse optimal predictor, a non-asymptotic bound on the $\ell_1$-error of the regression coefficients is then proven. Without using any further regularization, the regression vector can be estimated consistently as long as $s \log(p)/n \to 0$ as $n \to \infty$, where $s$ is the sparsity of the optimal regression vector, $p$ the number of variables and $n$ the sample size. Network tomography is shown to be an application where the necessary conditions for success of non-negative least squares are naturally fulfilled, and empirical results confirm the effectiveness of the sign constraint for sparse recovery.
id oxford-uuid:a3e43016-f101-469b-be37-2cb0e4329024
institution University of Oxford
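
The estimator described in the abstract can be tried out with a short simulation. The following is a minimal sketch, not the author's code: the simulated design (a shared factor that makes all columns positively correlated), the dimensions, the noise level, and the helper names `simulate` and `sign_constrained_ls` are all assumptions chosen for illustration. It relies on `scipy.optimize.nnls` for the non-negative fit and handles general sign constraints by flipping the columns that are constrained to be negative.

```python
import numpy as np
from scipy.optimize import nnls


def simulate(n=100, p=300, s=5, noise=0.1, seed=0):
    """Hypothetical data: p > n, sparse non-negative coefficients."""
    rng = np.random.default_rng(seed)
    # A shared factor makes every pair of columns positively correlated
    # (an illustrative assumption, not the paper's exact condition).
    common = rng.normal(size=(n, 1))
    X = common + rng.normal(size=(n, p))
    beta = np.zeros(p)
    support = rng.choice(p, size=s, replace=False)
    beta[support] = rng.uniform(1.0, 2.0, size=s)
    y = X @ beta + noise * rng.normal(size=n)
    return X, y, beta


def sign_constrained_ls(X, y, signs=None):
    """Least squares subject to sign(coef_j) matching signs_j (or coef_j = 0).

    signs is an array of +1 / -1 entries; None means all non-negative.
    Flipping the columns with sign -1 turns the problem into plain NNLS.
    """
    if signs is None:
        signs = np.ones(X.shape[1])
    coef, rnorm = nnls(X * signs, y)
    return signs * coef, rnorm


X, y, beta = simulate()
beta_hat, rnorm = sign_constrained_ls(X, y)

# No tuning parameter was chosen anywhere above.
print("non-zero estimates:", int(np.count_nonzero(beta_hat)))
print("L1 error:", float(np.abs(beta_hat - beta).sum()))
```

Rerunning with different values of `n`, `p` and `s` gives a rough feel for how the L1 error behaves as $s \log(p)/n$ changes, although a single simulated design says nothing about the general conditions proven in the paper.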