Adaptive Regression Analysis of Heterogeneous Data Streams via Models with Dynamic Effects

Streaming data sequences arise from various areas in the era of big data, and it is challenging to explore efficient online models that adapt to them. To address the potential heterogeneity, we introduce a new online estimation procedure to analyze the constantly incoming streaming datasets. The und...


Bibliographic Details
Main Authors: Jianfeng Wei, Jian Yang, Xuewen Cheng, Jie Ding, Shengquan Li
Format: Article
Language: English
Published: MDPI AG 2023-12-01
Series: Mathematics
Subjects: adaptive lasso; data streams; dynamic coefficients; online estimation; regression analysis
Online Access: https://www.mdpi.com/2227-7390/11/24/4899
_version_ 1797380202768105472
author Jianfeng Wei
Jian Yang
Xuewen Cheng
Jie Ding
Shengquan Li
author_facet Jianfeng Wei
Jian Yang
Xuewen Cheng
Jie Ding
Shengquan Li
author_sort Jianfeng Wei
collection DOAJ
description Streaming data sequences arise from various areas in the era of big data, and it is challenging to explore efficient online models that adapt to them. To address the potential heterogeneity, we introduce a new online estimation procedure to analyze the constantly incoming streaming datasets. The underlying model structures are assumed to be the generalized linear models with dynamic regression coefficients. Our key idea lies in introducing a vector of unknown parameters to measure the differences between batch-specific regression coefficients from adjacent data blocks. This is followed by the usage of the adaptive lasso penalization methodology to accurately select nonzero components, which indicates the existence of dynamic coefficients. We provide detailed derivations to demonstrate how our proposed method not only fits within the online updating framework in which the old estimator is recursively replaced with a new one based solely on the current individual-level samples and historical summary statistics but also adaptively avoids undesirable estimation biases coming from the potential changes in model parameters of interest. Computational issues are also discussed in detail to facilitate implementation. Its practical performance is demonstrated through both extensive simulations and a real case study. In summary, we contribute to a novel online method that efficiently adapts to streaming data environment, addresses potential heterogeneity, and mitigates estimation biases from changes in coefficients.
first_indexed 2024-03-08T20:33:56Z
format Article
id doaj.art-81eac2361ff8407fa6722a3b8fde2c6c
institution Directory Open Access Journal
issn 2227-7390
language English
last_indexed 2024-03-08T20:33:56Z
publishDate 2023-12-01
publisher MDPI AG
record_format Article
series Mathematics
spelling doaj.art-81eac2361ff8407fa6722a3b8fde2c6c
2023-12-22T14:23:14Z
eng
MDPI AG
Mathematics
2227-7390
2023-12-01
11
24
4899
10.3390/math11244899
Adaptive Regression Analysis of Heterogeneous Data Streams via Models with Dynamic Effects
Jianfeng Wei (0)
Jian Yang (1)
Xuewen Cheng (2)
Jie Ding (3)
Shengquan Li (4)
(0) Peng Cheng Laboratory, Shenzhen 518066, China
(1) Peng Cheng Laboratory, Shenzhen 518066, China
(2) Peng Cheng Laboratory, Shenzhen 518066, China
(3) School of Mathematical Sciences, Dalian University of Technology, Dalian 116024, China
(4) Peng Cheng Laboratory, Shenzhen 518066, China
Streaming data sequences arise from various areas in the era of big data, and it is challenging to explore efficient online models that adapt to them. To address the potential heterogeneity, we introduce a new online estimation procedure to analyze the constantly incoming streaming datasets. The underlying model structures are assumed to be the generalized linear models with dynamic regression coefficients. Our key idea lies in introducing a vector of unknown parameters to measure the differences between batch-specific regression coefficients from adjacent data blocks. This is followed by the usage of the adaptive lasso penalization methodology to accurately select nonzero components, which indicates the existence of dynamic coefficients. We provide detailed derivations to demonstrate how our proposed method not only fits within the online updating framework in which the old estimator is recursively replaced with a new one based solely on the current individual-level samples and historical summary statistics but also adaptively avoids undesirable estimation biases coming from the potential changes in model parameters of interest. Computational issues are also discussed in detail to facilitate implementation. Its practical performance is demonstrated through both extensive simulations and a real case study. In summary, we contribute to a novel online method that efficiently adapts to streaming data environment, addresses potential heterogeneity, and mitigates estimation biases from changes in coefficients.
https://www.mdpi.com/2227-7390/11/24/4899
adaptive lasso
data streams
dynamic coefficients
online estimation
regression analysis
spellingShingle Jianfeng Wei
Jian Yang
Xuewen Cheng
Jie Ding
Shengquan Li
Adaptive Regression Analysis of Heterogeneous Data Streams via Models with Dynamic Effects
Mathematics
adaptive lasso
data streams
dynamic coefficients
online estimation
regression analysis
title Adaptive Regression Analysis of Heterogeneous Data Streams via Models with Dynamic Effects
title_full Adaptive Regression Analysis of Heterogeneous Data Streams via Models with Dynamic Effects
title_fullStr Adaptive Regression Analysis of Heterogeneous Data Streams via Models with Dynamic Effects
title_full_unstemmed Adaptive Regression Analysis of Heterogeneous Data Streams via Models with Dynamic Effects
title_short Adaptive Regression Analysis of Heterogeneous Data Streams via Models with Dynamic Effects
title_sort adaptive regression analysis of heterogeneous data streams via models with dynamic effects
topic adaptive lasso
data streams
dynamic coefficients
online estimation
regression analysis
url https://www.mdpi.com/2227-7390/11/24/4899
work_keys_str_mv AT jianfengwei adaptiveregressionanalysisofheterogeneousdatastreamsviamodelswithdynamiceffects
AT jianyang adaptiveregressionanalysisofheterogeneousdatastreamsviamodelswithdynamiceffects
AT xuewencheng adaptiveregressionanalysisofheterogeneousdatastreamsviamodelswithdynamiceffects
AT jieding adaptiveregressionanalysisofheterogeneousdatastreamsviamodelswithdynamiceffects
AT shengquanli adaptiveregressionanalysisofheterogeneousdatastreamsviamodelswithdynamiceffects
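
Illustrative sketch (not part of the catalogued record): the description field above outlines a procedure in which a vector of differences between the regression coefficients of adjacent data blocks is estimated with an adaptive lasso penalty, while earlier data are retained only as summary statistics. The Python below is a minimal sketch of that idea under strong simplifying assumptions: a Gaussian linear model (the identity-link special case of a generalized linear model), two batches, and a plain coordinate-descent solver. The function names `update_summary` and `adaptive_lasso_delta`, the tuning values `lam` and `gamma`, and the simulated data are all hypothetical and are not taken from the article, whose actual estimator may differ.

```python
import numpy as np

def update_summary(J, u, X, y):
    """Fold one batch into the cumulative summary statistics (X'X, X'y)."""
    return J + X.T @ X, u + X.T @ y

def soft_threshold(z, t):
    """Soft-thresholding operator used by lasso coordinate descent."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def adaptive_lasso_delta(X, y, beta_old, lam=0.05, gamma=1.0, n_iter=200):
    """Estimate delta = beta_new - beta_old on the current batch by minimizing
    (1/2n)||y - X(beta_old + delta)||^2 + lam * sum_j w_j |delta_j|."""
    n, p = X.shape
    r = y - X @ beta_old                                # part the old fit cannot explain
    delta_init = np.linalg.lstsq(X, r, rcond=None)[0]   # unpenalized pilot estimate
    w = 1.0 / (np.abs(delta_init) ** gamma + 1e-8)      # adaptive weights
    z = (X ** 2).sum(axis=0) / n                        # per-column scaling
    delta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # residual with coordinate j's own contribution added back
            partial = r - X @ delta + X[:, j] * delta[j]
            rho = X[:, j] @ partial / n
            delta[j] = soft_threshold(rho, lam * w[j]) / z[j]
    return delta

# Simulated stream (hypothetical data): only the second coefficient changes.
rng = np.random.default_rng(0)
p = 3
beta1 = np.array([1.0, -2.0, 0.5])
beta2 = np.array([1.0, -0.5, 0.5])
X1 = rng.normal(size=(500, p))
y1 = X1 @ beta1 + 0.5 * rng.normal(size=500)
X2 = rng.normal(size=(500, p))
y2 = X2 @ beta2 + 0.5 * rng.normal(size=500)

# Batch 1 is kept only as summary statistics; its raw rows are not reused.
J, u = update_summary(np.zeros((p, p)), np.zeros(p), X1, y1)
beta_old = np.linalg.solve(J, u)

# Batch 2: nonzero entries of delta flag coefficients that appear to have changed.
delta = adaptive_lasso_delta(X2, y2, beta_old)
changed = np.flatnonzero(delta)
print("estimated differences:", delta)
print("indices flagged as dynamic:", changed)
```

In this sketch the nonzero entries of `delta` mark coefficients treated as dynamic; how the stable coefficients are then pooled with the historical summary statistics is left out and would follow the derivations in the article itself.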