National student loans default risk prediction: A heterogeneous ensemble learning approach and the SHAP method


Bibliographic Details
Main Authors: Yuan Wang, Yanbo Zhang, Mengkun Liang, Ruixue Yuan, Jie Feng, Jun Wu
Format: Article
Language: English
Published: Elsevier 2023-01-01
Series: Computers and Education: Artificial Intelligence
Online Access: http://www.sciencedirect.com/science/article/pii/S2666920X23000450
Description
Summary: National student loans are crucial for ensuring that economically disadvantaged students are able to complete their education successfully. However, the high default rate and excessive demand associated with these loans pose significant risks to various stakeholders. Students' repayment behavior can have adverse impacts on the state, banks, universities, and the students themselves. Despite the importance of this issue, empirical research applying machine learning methods to national student loan default data has been lacking, and existing work has ignored the impact that students' growth processes may have on default behavior. In this study, we addressed this research gap by integrating multiple heterogeneous machine learning models through performance analysis to improve prediction accuracy. Furthermore, we used the SHAP interpretability method to examine the relationship between students' growth process and loan default behavior in greater depth. The results confirmed that the total amount of scholarships received by students during their school years has a significant impact on the occurrence of loan default: the more scholarships students receive, the less likely they are to default. Additionally, we found that students' performance during school, as reflected by their GPA, is a significant predictor of default behavior. Surprisingly, we also found that college entrance examination scores can influence the risk of loan default to a certain extent. This study provides valuable insights for schools regarding students who apply for national student loans and establishes a bridge for college administrators to predict students' default behavior based on campus “big data”. Based on the study results, college administrators can design “precise” education programs during students' study period to effectively reduce the risk of national student loan default.
ISSN: 2666-920X