Machine Learning for Credit Risk Prediction: A Systematic Literature Review

This systematic literature review examines the use of Machine Learning (ML) for credit risk prediction and raises the need for financial institutions to use Artificial Intelligence (AI) and ML to assess credit risk by analyzing large volumes of information. We posed research questions about algorithms,...

Bibliographic Details
Main Authors:	Jomark Pablo Noriega, Luis Antonio Rivera, José Alfredo Herrera
Format:	Article
Language:	English
Published:	MDPI AG 2023-11-01
Series:	Data
Subjects:	loan; credit risk; prediction; machine learning; systematic literature review
Online Access:	https://www.mdpi.com/2306-5729/8/11/169

collection	DOAJ
description	This systematic literature review examines the use of Machine Learning (ML) for credit risk prediction and raises the need for financial institutions to use Artificial Intelligence (AI) and ML to assess credit risk by analyzing large volumes of information. We posed research questions about algorithms, metrics, results, datasets, variables, and related limitations in predicting credit risk. To answer them, we searched renowned databases and identified 52 relevant studies within the microfinance credit industry. We identified challenges and approaches in credit risk prediction using ML models, including difficulties with the implemented models such as their black-box nature, the need for explainable artificial intelligence, the importance of selecting relevant features, addressing multicollinearity, and the problem of imbalance in the input data. In answering the research questions, we found that the boosted category is the most researched family of ML models; the most commonly used evaluation metrics are Area Under Curve (AUC), Accuracy (ACC), Recall, the F1 measure (F1), and Precision. Research mainly uses public datasets to compare models, and private ones to generate new knowledge when applied to the real world. The most significant limitation identified is the representativeness of reality, and the variables primarily used in the microcredit industry relate to demographic, operation, and payment behavior data. This study aims to guide developers of credit risk management tools and software towards the existing capabilities of ML methods, metrics, and techniques used to forecast credit risk, thereby minimizing possible losses due to default and guiding risk appetite.
id	doaj.art-1d384bf6ef1c4b858c6106a8fe2402db
institution	Directory Open Access Journal
issn	2306-5729
doi	10.3390/data8110169
volume	8
issue	11
article_number	169
affiliation	Departamento Académico de Ciencia de la Computacion, Universidad Nacional Mayor de San Marcos, Decana de América, Lima 15081, Peru (all three authors)

Similar Items