Application of Feature Selection Based on Multilayer GA in Stock Prediction

This paper proposes a feature selection model based on a multilayer genetic algorithm (GA) that selects features for predicting high stock dividends (HSD) and eliminates relatively redundant features from the optimal solution through layer-by-layer information transfer and two dimensionality-reduction metho...


Bibliographic Details
Main Authors: Xiaoning Li, Qiancheng Yu, Chen Tang, Zekun Lu, Yufan Yang
Format: Article
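Language: English
Published: MDPI AG 2022-07-01
Series: Symmetry
Subjects: genetic algorithm; time series split cross validation; fitness function; feature selection; stock prediction
Online Access: https://www.mdpi.com/2073-8994/14/7/1415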
_version_ 1797415393345667072
author Xiaoning Li
Qiancheng Yu
Chen Tang
Zekun Lu
Yufan Yang
collection DOAJ
description This paper proposes a feature selection model based on a multilayer genetic algorithm (GA) that selects features for predicting high stock dividends (HSD) and eliminates relatively redundant features from the optimal solution through layer-by-layer information transfer and two dimensionality-reduction steps. Using an ensemble model together with a time-series split cross-validation (TSCV) indicator as the fitness function solves the problem of choosing a fitness function for each layer. The symmetric character of the model is exploited in the two dimensionality-reduction processes, where the corresponding TSCV indicator is set according to the change in data dimensions and the class imbalance of the HSD data. We built seven ensemble prediction models on actual stock trading data for comparison experiments. The results show that the feature selection model based on multilayer GA can effectively eliminate the relatively redundant features after dimensionality reduction and significantly improve the balanced accuracy, precision and AUC of the seven ensemble learning models. Finally, adversarial validation is used to analyze the differences in balanced accuracy between the training and test sets caused by the inconsistent distribution of the data sets.
first_indexed 2024-03-09T05:46:58Z
format Article
id doaj.art-3f25095022b74395bf975f60856140e3
institution Directory Open Access Journal
issn 2073-8994
language English
last_indexed 2024-03-09T05:46:58Z
publishDate 2022-07-01
publisher MDPI AG
record_format Article
series Symmetry
spelling doaj.art-3f25095022b74395bf975f60856140e3 2023-12-03T12:19:55Z eng MDPI AG Symmetry 2073-8994 2022-07-01 vol. 14, no. 7, art. 1415 doi:10.3390/sym14071415
Application of Feature Selection Based on Multilayer GA in Stock Prediction
Xiaoning Li, Qiancheng Yu, Chen Tang, Zekun Lu, Yufan Yang (all: School of Computer Science and Engineering, North Minzu University, Yinchuan 750021, China)
https://www.mdpi.com/2073-8994/14/7/1415
title Application of Feature Selection Based on Multilayer GA in Stock Prediction
topic genetic algorithm
time series split cross validation
fitness function
feature selection
stock prediction
url https://www.mdpi.com/2073-8994/14/7/1415
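
The abstract in this record describes GA-based feature selection whose fitness is a time-series split cross-validation (TSCV) score, applied layer by layer. The following is a minimal, standard-library-only sketch of that general idea, not the authors' implementation. Everything named here is an assumption made for illustration: the nearest-centroid classifier stands in for the paper's ensemble models, balanced accuracy is used as the TSCV indicator, and the functions `ga_select` and `multilayer_select`, the GA settings, and the toy data are all hypothetical.

```python
import random


def time_series_splits(n_samples, n_splits):
    """Expanding-window folds: train on the past, test on the next chunk."""
    fold = n_samples // (n_splits + 1)
    for k in range(1, n_splits + 1):
        test_end = fold * (k + 1) if k < n_splits else n_samples
        yield range(0, fold * k), range(fold * k, test_end)


def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall, so a rare class counts as much as a common one."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


def centroid_predict(X_train, y_train, X_test, cols):
    """Stand-in classifier (NOT the paper's ensemble): nearest class centroid."""
    centroids = {}
    for cls in sorted(set(y_train)):
        rows = [x for x, y in zip(X_train, y_train) if y == cls]
        centroids[cls] = [sum(r[c] for r in rows) / len(rows) for c in cols]

    def dist(x, cls):
        return sum((x[c] - m) ** 2 for c, m in zip(cols, centroids[cls]))

    return [min(centroids, key=lambda cls: dist(x, cls)) for x in X_test]


def fitness(mask, X, y, n_splits=3):
    """Assumed fitness: mean balanced accuracy over the time-ordered folds."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature set cannot predict anything
    scores = []
    for train, test in time_series_splits(len(X), n_splits):
        pred = centroid_predict([X[i] for i in train], [y[i] for i in train],
                                [X[i] for i in test], cols)
        scores.append(balanced_accuracy([y[i] for i in test], pred))
    return sum(scores) / len(scores)


def ga_select(X, y, pop_size=20, generations=15, p_mut=0.1, seed=0):
    """One GA layer over bit masks (1 = keep the feature)."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=lambda m: fitness(m, X, y), reverse=True)
        parents = scored[: pop_size // 2]      # truncation selection
        children = [scored[0][:]]              # elitism: keep the best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)          # one-point crossover
            child = a[:cut] + b[cut:]
            children.append([bit ^ (rng.random() < p_mut) for bit in child])
        pop = children
    best = max(pop, key=lambda m: fitness(m, X, y))
    return best, fitness(best, X, y)


def multilayer_select(X, y, layers=2, seed=0):
    """Assumed layering: each layer searches only the features the last one kept."""
    kept, score = list(range(len(X[0]))), 0.0
    for layer in range(layers):
        if len(kept) < 2:
            break  # nothing left to prune
        X_sub = [[row[c] for c in kept] for row in X]
        mask, score = ga_select(X_sub, y, seed=seed + layer)
        kept = [c for c, bit in zip(kept, mask) if bit]
    return kept, score


# Toy, time-ordered data: column 0 carries the signal, columns 1-5 are noise.
_rng = random.Random(1)
y = [_rng.randint(0, 1) for _ in range(120)]
X = [[2 * label + _rng.gauss(0, 0.5)] + [_rng.gauss(0, 1) for _ in range(5)]
     for label in y]
kept, score = multilayer_select(X, y)
print("kept feature indices:", kept, "TSCV balanced accuracy:", round(score, 3))
```

The folds never train on samples that come after the test chunk, which is the property that distinguishes TSCV from shuffled k-fold on trading data. Swapping `centroid_predict` for a real ensemble learner and `balanced_accuracy` for another indicator would leave the GA loop unchanged.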