Ridge Regression and the Elastic Net: How Do They Do as Finders of True Regressors and Their Coefficients?

For the linear model Y = Xb + error, where the number of regressors (p) exceeds the number of observations (n), the Elastic Net (EN) was proposed, in 2005, to estimate b. …


Bibliographic Details
Main Author: Rajaram Gana
Format: Article
Language: English
Published: MDPI AG, 2022-08-01
Series: Mathematics
Subjects: elastic net; generalized ridge regression; ordinary ridge regression; statistical significance
Online Access: https://www.mdpi.com/2227-7390/10/17/3057
_version_ 1797494377524756480
author Rajaram Gana
author_facet Rajaram Gana
author_sort Rajaram Gana
collection DOAJ
description For the linear model Y = Xb + error, where the number of regressors (p) exceeds the number of observations (n), the Elastic Net (EN) was proposed, in 2005, to estimate b. The EN uses both the Lasso, proposed in 1996, and ordinary Ridge Regression (RR), proposed in 1970, to estimate b. However, when p > n, using only RR to estimate b has not been considered in the literature thus far. Because RR is based on the least-squares framework, using only RR to estimate b is computationally much simpler than using the EN. We propose a generalized ridge regression (GRR) algorithm, a superior alternative to the EN, for estimating b, as follows: partition X from left to right so that every partition but the last has three observations per regressor; for each partition, estimate Y with the regressors in that partition using ordinary RR; retain the regressors with statistically significant t-ratios and the corresponding RR tuning parameter k, by partition; use the retained regressors and k values to re-estimate Y by GRR across all partitions, which yields the estimate of b. Because the algorithm is mathematically intractable, its efficacy is compared with the EN's by simulation, using four metrics.
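To make the mechanics concrete, here is a minimal, hypothetical sketch of the partition-wise scheme in Python (assuming numpy and scipy are available). Ordinary RR on a partition computes b = (X'X + kI)^(-1) X'Y; the final GRR step replaces kI with a diagonal matrix K holding each retained regressor's partition-specific k. The function names, the t-ratio approximation via the ridge covariance, and the single user-supplied k (the paper tunes k per partition, by a rule the abstract does not give) are illustrative assumptions, not the author's exact procedure.

import numpy as np
from scipy import stats

def ridge_fit(X, y, k):
    """Ordinary ridge estimate and approximate t-ratios for one partition."""
    n, p = X.shape
    A = np.linalg.inv(X.T @ X + k * np.eye(p))
    b = A @ X.T @ y
    resid = y - X @ b
    # Residual degrees of freedom via the trace of the ridge hat matrix
    # (one common approximation, not necessarily the paper's choice).
    df = max(n - np.trace(X @ A @ X.T), 1.0)
    sigma2 = resid @ resid / df
    cov = sigma2 * A @ (X.T @ X) @ A      # standard ridge covariance approximation
    t = b / np.sqrt(np.diag(cov))
    return b, t, df

def grr_select(X, y, k=1.0, obs_per_regressor=3, alpha=0.05):
    """Partition X left to right, keep significant regressors per partition,
    then re-estimate b by generalized ridge over the retained regressors."""
    n, p = X.shape
    m = max(n // obs_per_regressor, 1)    # regressors per partition (3 obs each)
    keep_idx, keep_k = [], []
    for start in range(0, p, m):          # every partition but the last has m columns
        cols = np.arange(start, min(start + m, p))
        b, t, df = ridge_fit(X[:, cols], y, k)
        crit = stats.t.ppf(1 - alpha / 2, df)
        sig = np.abs(t) > crit            # retain significant t-ratios
        keep_idx.extend(cols[sig])
        keep_k.extend([k] * int(sig.sum()))   # remember each regressor's partition k
    if not keep_idx:
        return keep_idx, np.array([])
    Xs = X[:, keep_idx]
    K = np.diag(keep_k)                   # generalized ridge penalty matrix
    b_final = np.linalg.solve(Xs.T @ Xs + K, Xs.T @ y)
    return keep_idx, b_final

For example, with n = 60 and obs_per_regressor = 3, each partition holds 20 regressors, matching the three-observations-per-regressor rule described above.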
Three metrics, with the probability of RR's superiority over the EN in parentheses, are: the proportion of true regressors discovered (99%); the squared distance, from the true coefficients, of the significant coefficients (86%); and the squared distance, from the true coefficients, of the estimated coefficients that are both significant and true (74%). The fourth metric is the probability that none of the discovered regressors is true, which is 4% for RR and 25% for the EN. This indicates a further advantage of RR over the EN in discovering causal regressors.
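As a rough illustration of how a simulation could score one replication on these four metrics, the hypothetical helper below takes the true coefficient vector, an estimate, and a boolean mask of regressors the method declared significant; averaging the fourth output over replications gives the "none of the discovered regressors is true" probability. The exact operational definitions in the paper may differ, so treat this as a sketch.

import numpy as np

def score_replication(b_true, b_hat, significant):
    """b_true: true coefficients; b_hat: estimates; significant: boolean
    mask of regressors the method declared significant ('discovered')."""
    true_support = b_true != 0
    discovered = significant
    # 1. Proportion of true regressors discovered.
    prop_found = (discovered & true_support).sum() / max(true_support.sum(), 1)
    # 2. Squared distance of the significant coefficients from the truth.
    d_sig = np.sum((b_hat[discovered] - b_true[discovered]) ** 2)
    # 3. Same distance, restricted to coefficients both significant and true.
    both = discovered & true_support
    d_sig_true = np.sum((b_hat[both] - b_true[both]) ** 2)
    # 4. Indicator that no discovered regressor is true; its average over
    #    replications estimates the probability reported above.
    none_true = bool(discovered.any() and not both.any())
    return prop_found, d_sig, d_sig_true, none_true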
first_indexed 2024-03-10T01:33:24Z
format Article
id doaj.art-3cc162982a3c4ce78f55cf061dd72d9d
institution Directory Open Access Journal
issn 2227-7390
language English
last_indexed 2024-03-10T01:33:24Z
publishDate 2022-08-01
publisher MDPI AG
record_format Article
series Mathematics
spelling doaj.art-3cc162982a3c4ce78f55cf061dd72d9d 2023-11-23T13:37:36Z eng MDPI AG Mathematics 2227-7390 2022-08-01 vol. 10, iss. 17, art. 3057 10.3390/math10173057 Ridge Regression and the Elastic Net: How Do They Do as Finders of True Regressors and Their Coefficients? Rajaram Gana (Department of Biochemistry and Molecular & Cellular Biology, School of Medicine, Georgetown University, Washington, DC 20057, USA) https://www.mdpi.com/2227-7390/10/17/3057 elastic net generalized ridge regression ordinary ridge regression statistical significance
spellingShingle Rajaram Gana
Ridge Regression and the Elastic Net: How Do They Do as Finders of True Regressors and Their Coefficients?
Mathematics
elastic net
generalized ridge regression
ordinary ridge regression
statistical significance
title Ridge Regression and the Elastic Net: How Do They Do as Finders of True Regressors and Their Coefficients?
title_full Ridge Regression and the Elastic Net: How Do They Do as Finders of True Regressors and Their Coefficients?
title_fullStr Ridge Regression and the Elastic Net: How Do They Do as Finders of True Regressors and Their Coefficients?
title_full_unstemmed Ridge Regression and the Elastic Net: How Do They Do as Finders of True Regressors and Their Coefficients?
title_short Ridge Regression and the Elastic Net: How Do They Do as Finders of True Regressors and Their Coefficients?
title_sort ridge regression and the elastic net how do they do as finders of true regressors and their coefficients
topic elastic net
generalized ridge regression
ordinary ridge regression
statistical significance
url https://www.mdpi.com/2227-7390/10/17/3057
work_keys_str_mv AT rajaramgana ridgeregressionandtheelasticnethowdotheydoasfindersoftrueregressorsandtheircoefficients