Model for the Prediction of Dropout in Higher Education in Peru applying Machine Learning Algorithms: Random Forest, Decision Tree, Neural Network and Support Vector Machine

University dropout is a problem that not only affects students, but also families, universities, society, and others. This problem has a global character, so it is common to identify it in different parts of the world. However, there are few solutions that efficiently take advantage of available tec...

Bibliographic Details
Main Authors: Omar A Jimenez, Ashley Jesús Llontop, Lenis Wong
Format: Article
Language: English
Published: FRUCT 2023-05-01
Series: Proceedings of the XXth Conference of Open Innovations Association FRUCT
Subjects: dropout; machine learning; random forest; decision tree; neural network; support vector machine
Online Access: https://www.fruct.org/publications/volume-33/fruct33/files/Jim.pdf
author Omar A Jimenez
Ashley Jesús Llontop
Lenis Wong
collection DOAJ
description University dropout is a problem that affects not only students but also families, universities, and society more broadly. It is a global problem, observed in many parts of the world. However, few solutions make efficient use of available technology and information. Therefore, this study implements a predictive analysis model to identify students at risk of dropping out of Peruvian universities and the variables that influence dropout. The model is developed following the Cross Industry Standard Process for Data Mining (CRISP-DM) methodology and uses four machine learning algorithms. The methodology consists of five phases: business understanding, data understanding, data preparation, modeling, and evaluation. The experiment was carried out by surveying 385 students from different public and private universities in Peru, considering cognitive, affective, family environment, pre-university, career, and university variables. The results showed that the most influential variables in predicting university dropout were "age", "term", and the student's "financing method". The Random Forest algorithm obtained the best performance, with an AUC of 0.9623 in predicting university dropout.
first_indexed 2024-03-13T06:23:50Z
format Article
id doaj.art-7ee5d216a5464fef9d5396fa5f8a1c89
institution Directory Open Access Journal
issn 2305-7254
2343-0737
language English
last_indexed 2024-03-13T06:23:50Z
publishDate 2023-05-01
publisher FRUCT
record_format Article
series Proceedings of the XXth Conference of Open Innovations Association FRUCT
spelling doaj.art-7ee5d216a5464fef9d5396fa5f8a1c89; 2023-06-09T11:41:51Z; eng; FRUCT; Proceedings of the XXth Conference of Open Innovations Association FRUCT; ISSN 2305-7254, 2343-0737; 2023-05-01; vol. 33, no. 1, pp. 116-124; DOI 10.23919/FRUCT58615.2023.10143068; Model for the Prediction of Dropout in Higher Education in Peru applying Machine Learning Algorithms: Random Forest, Decision Tree, Neural Network and Support Vector Machine; Omar A Jimenez, Ashley Jesús Llontop, Lenis Wong (Universidad Peruana de Ciencias Aplicadas); https://www.fruct.org/publications/volume-33/fruct33/files/Jim.pdf; dropout, machine learning, random forest, decision tree, neural network, support vector machine
title Model for the Prediction of Dropout in Higher Education in Peru applying Machine Learning Algorithms: Random Forest, Decision Tree, Neural Network and Support Vector Machine
topic dropout; machine learning; random forest; decision tree; neural network; support vector machine
url https://www.fruct.org/publications/volume-33/fruct33/files/Jim.pdf