Incorporating Multiple Textual Factors into Unbalanced Financial Distress Prediction: A Feature Selection Methods and Ensemble Classifiers Combined Approach
Abstract Textual-based factors have been widely regarded as a promising feature that can be applied to financial issues. This study focuses on extracting both basic and semantic textual features to supplement the traditionally used financial indicators. The main aim is to improve Chinese listed companie...
Main Authors: | Shixuan Li, Wenxuan Shi |
---|---|
Format: | Article |
Language: | English |
Published: | Springer 2023-10-01 |
Series: | International Journal of Computational Intelligence Systems |
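Subjects: | Textual factors; Feature selection; Ensemble classifiers; Financial distress prediction; Word embedding |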
Online Access: | https://doi.org/10.1007/s44196-023-00342-2 |
_version_ | 1797556538388250624 |
---|---|
author | Shixuan Li; Wenxuan Shi |
author_facet | Shixuan Li; Wenxuan Shi |
author_sort | Shixuan Li |
collection | DOAJ |
description | Abstract Textual factors are widely regarded as promising features for financial applications. This study extracts both basic and semantic textual features to supplement traditionally used financial indicators. The main aim is to improve financial distress prediction (FDP) for Chinese listed companies. The study proposes a paradigm that combines financial and multi-type textual predictive factors, feature selection methods, classifiers, and time spans to achieve the optimal FDP. Frequency counts, TF-IDF, TextRank, and word embedding approaches are employed to extract frequency count-based, keyword-based, sentiment, and readability indicators. The experimental results indicate that financial domain sentiment lexicons, word embedding-based readability analysis approaches, and the basic textual features of the Management Discussion and Analysis can be important elements of FDP. Moreover, the findings show that combining financial and textual features can achieve optimal performance 4 or 5 years before the expected baseline year, and that the RF-GBDT combined model can outperform other classifiers. This study extends multi-method text analysis in the financial text mining field and provides new findings on early warning signs of financial risk. The approaches developed in this research can serve as a template for other financial issues. |
first_indexed | 2024-03-10T17:04:20Z |
format | Article |
id | doaj.art-d8f5af1ce5b44310b6c2e4ffc17e59dd |
institution | Directory of Open Access Journals |
issn | 1875-6883 |
language | English |
last_indexed | 2024-03-10T17:04:20Z |
publishDate | 2023-10-01 |
publisher | Springer |
record_format | Article |
series | International Journal of Computational Intelligence Systems |
spelling | doaj.art-d8f5af1ce5b44310b6c2e4ffc17e59dd; 2023-11-20T10:52:31Z; eng; Springer; International Journal of Computational Intelligence Systems; 1875-6883; 2023-10-01; 161124; 10.1007/s44196-023-00342-2; Incorporating Multiple Textual Factors into Unbalanced Financial Distress Prediction: A Feature Selection Methods and Ensemble Classifiers Combined Approach; Shixuan Li (School of Safety Science and Emergency Management, Wuhan University of Technology); Wenxuan Shi (School of Information Management, Wuhan University); https://doi.org/10.1007/s44196-023-00342-2; Textual factors; Feature selection; Ensemble classifiers; Financial distress prediction; Word embedding |
title | Incorporating Multiple Textual Factors into Unbalanced Financial Distress Prediction: A Feature Selection Methods and Ensemble Classifiers Combined Approach |
topic | Textual factors; Feature selection; Ensemble classifiers; Financial distress prediction; Word embedding |
url | https://doi.org/10.1007/s44196-023-00342-2 |