Ridge Regression and the Elastic Net: How Do They Do as Finders of True Regressors and Their Coefficients?

For the linear model Y = Xb + error, where the number of regressors (p) exceeds the number of observations (n), the Elastic Net (EN) was proposed, in 2005, to estimate b. …


Bibliographic Details
Main Author: Rajaram Gana
Format: Article
Language: English
Published: MDPI AG, 2022-08-01
Series: Mathematics
Subjects: elastic net; generalized ridge regression; ordinary ridge regression; statistical significance
Online Access: https://www.mdpi.com/2227-7390/10/17/3057
_version_ 1797494377524756480
author Rajaram Gana
author_facet Rajaram Gana
author_sort Rajaram Gana
collection DOAJ
description For the linear model Y = Xb + error, where the number of regressors (p) exceeds the number of observations (n), the Elastic Net (EN) was proposed, in 2005, to estimate b. The EN uses both the Lasso, proposed in 1996, and ordinary Ridge Regression (RR), proposed in 1970, to estimate b. However, when p > n, using only RR to estimate b has not been considered in the literature thus far. Because RR is based on the least-squares framework, using only RR to estimate b is computationally much simpler than using the EN. We propose a generalized ridge regression (GRR) algorithm, a superior alternative to the EN, for estimating b, as follows: partition X from left to right so that every partition but the last has three observations per regressor; for each partition, estimate Y with the regressors in that partition using ordinary RR; retain the regressors with statistically significant t-ratios and the corresponding RR tuning parameter k, by partition; use the retained regressors and k values to re-estimate Y by GRR across all partitions, which yields the estimate of b. Because the algorithm is mathematically intractable, its efficacy is compared with the EN's by simulation, using four metrics.
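To make the mechanics concrete, here is a minimal, hypothetical sketch of the partition-wise scheme in Python (assuming numpy and scipy are available). Ordinary RR on a partition computes b = (X'X + kI)^(-1) X'Y; the final GRR step replaces kI with a diagonal matrix K holding each retained regressor's partition-specific k. The function names, the t-ratio approximation via the ridge covariance, and the single user-supplied k (the paper tunes k per partition, by a rule the abstract does not give) are illustrative assumptions, not the author's exact procedure.

import numpy as np
from scipy import stats

def ridge_fit(X, y, k):
    """Ordinary ridge estimate and approximate t-ratios for one partition."""
    n, p = X.shape
    A = np.linalg.inv(X.T @ X + k * np.eye(p))
    b = A @ X.T @ y
    resid = y - X @ b
    # Residual degrees of freedom via the trace of the ridge hat matrix
    # (one common approximation, not necessarily the paper's choice).
    df = max(n - np.trace(X @ A @ X.T), 1.0)
    sigma2 = resid @ resid / df
    cov = sigma2 * A @ (X.T @ X) @ A      # standard ridge covariance approximation
    t = b / np.sqrt(np.diag(cov))
    return b, t, df

def grr_select(X, y, k=1.0, obs_per_regressor=3, alpha=0.05):
    """Partition X left to right, keep significant regressors per partition,
    then re-estimate b by generalized ridge over the retained regressors."""
    n, p = X.shape
    m = max(n // obs_per_regressor, 1)    # regressors per partition (3 obs each)
    keep_idx, keep_k = [], []
    for start in range(0, p, m):          # every partition but the last has m columns
        cols = np.arange(start, min(start + m, p))
        b, t, df = ridge_fit(X[:, cols], y, k)
        crit = stats.t.ppf(1 - alpha / 2, df)
        sig = np.abs(t) > crit            # retain significant t-ratios
        keep_idx.extend(cols[sig])
        keep_k.extend([k] * int(sig.sum()))   # remember each regressor's partition k
    if not keep_idx:
        return keep_idx, np.array([])
    Xs = X[:, keep_idx]
    K = np.diag(keep_k)                   # generalized ridge penalty matrix
    b_final = np.linalg.solve(Xs.T @ Xs + K, Xs.T @ y)
    return keep_idx, b_final

For example, with n = 60 and obs_per_regressor = 3, each partition holds 20 regressors, matching the three-observations-per-regressor rule described above.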
Three metrics, with the probability of RR's superiority over the EN in parentheses, are: the proportion of true regressors discovered (99%); the squared distance, from the true coefficients, of the significant coefficients (86%); and the squared distance, from the true coefficients, of the estimated coefficients that are both significant and true (74%). The fourth metric is the probability that none of the discovered regressors is true, which is 4% for RR and 25% for the EN. This indicates a further advantage of RR over the EN in discovering causal regressors.
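As a rough illustration of how a simulation could score one replication on these four metrics, the hypothetical helper below takes the true coefficient vector, an estimate, and a boolean mask of regressors the method declared significant; averaging the fourth output over replications gives the "none of the discovered regressors is true" probability. The exact operational definitions in the paper may differ, so treat this as a sketch.

import numpy as np

def score_replication(b_true, b_hat, significant):
    """b_true: true coefficients; b_hat: estimates; significant: boolean
    mask of regressors the method declared significant ('discovered')."""
    true_support = b_true != 0
    discovered = significant
    # 1. Proportion of true regressors discovered.
    prop_found = (discovered & true_support).sum() / max(true_support.sum(), 1)
    # 2. Squared distance of the significant coefficients from the truth.
    d_sig = np.sum((b_hat[discovered] - b_true[discovered]) ** 2)
    # 3. Same distance, restricted to coefficients both significant and true.
    both = discovered & true_support
    d_sig_true = np.sum((b_hat[both] - b_true[both]) ** 2)
    # 4. Indicator that no discovered regressor is true; its average over
    #    replications estimates the probability reported above.
    none_true = bool(discovered.any() and not both.any())
    return prop_found, d_sig, d_sig_true, none_true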
first_indexed 2024-03-10T01:33:24Z
format Article
id doaj.art-3cc162982a3c4ce78f55cf061dd72d9d
institution Directory Open Access Journal
issn 2227-7390
language English
last_indexed 2024-03-10T01:33:24Z
publishDate 2022-08-01
publisher MDPI AG
record_format Article
series Mathematics
spelling doaj.art-3cc162982a3c4ce78f55cf061dd72d9d 2023-11-23T13:37:36Z eng MDPI AG Mathematics 2227-7390 2022-08-01 vol. 10, iss. 17, art. 3057 10.3390/math10173057 Ridge Regression and the Elastic Net: How Do They Do as Finders of True Regressors and Their Coefficients? Rajaram Gana (Department of Biochemistry and Molecular & Cellular Biology, School of Medicine, Georgetown University, Washington, DC 20057, USA) https://www.mdpi.com/2227-7390/10/17/3057 elastic net generalized ridge regression ordinary ridge regression statistical significance
spellingShingle Rajaram Gana
Ridge Regression and the Elastic Net: How Do They Do as Finders of True Regressors and Their Coefficients?
Mathematics
elastic net
generalized ridge regression
ordinary ridge regression
statistical significance
title Ridge Regression and the Elastic Net: How Do They Do as Finders of True Regressors and Their Coefficients?
title_full Ridge Regression and the Elastic Net: How Do They Do as Finders of True Regressors and Their Coefficients?
title_fullStr Ridge Regression and the Elastic Net: How Do They Do as Finders of True Regressors and Their Coefficients?
title_full_unstemmed Ridge Regression and the Elastic Net: How Do They Do as Finders of True Regressors and Their Coefficients?
title_short Ridge Regression and the Elastic Net: How Do They Do as Finders of True Regressors and Their Coefficients?
title_sort ridge regression and the elastic net how do they do as finders of true regressors and their coefficients
topic elastic net
generalized ridge regression
ordinary ridge regression
statistical significance
url https://www.mdpi.com/2227-7390/10/17/3057
work_keys_str_mv AT rajaramgana ridgeregressionandtheelasticnethowdotheydoasfindersoftrueregressorsandtheircoefficients