E-Learning at-Risk Group Prediction Considering the Semester and Realistic Factors

This study focused on predicting at-risk groups of students at the Open University (OU), a UK university that offers distance-learning courses and adult education. The research was conducted by drawing on publicly available data provided by the Open University for the year 2013–2014. The semester’s...

Bibliographic Details
Main Authors: Chenglong Zhang, Hyunchul Ahn
Format: Article
Language: English
Published: MDPI AG 2023-11-01
Series: Education Sciences
Subjects: at-risk prediction, dropout prediction, OULAD, voting classifier, SHAP
Online Access: https://www.mdpi.com/2227-7102/13/11/1130
author Chenglong Zhang
Hyunchul Ahn
collection DOAJ
description This study focused on predicting at-risk groups of students at the Open University (OU), a UK university that offers distance-learning courses and adult education. The research was conducted by drawing on publicly available data provided by the Open University for the year 2013–2014. The semester’s time series was considered, and data from previous semesters were used to predict the current semester’s results. Each course was predicted separately so that the research reflected reality as closely as possible. Three different methods for selecting training data were listed. Since the at-risk prediction results needed to be provided to the instructor every week, four representative time points during the semester were chosen to assess the predictions. Furthermore, we used eight single and three integrated machine-learning algorithms to compare the prediction results. The results show that using the same semester code course data for training saved prediction calculation time and improved the prediction accuracy at all time points. In week 16, predictions using the algorithms with the voting classifier method showed higher prediction accuracy and were more stable than predictions using a single algorithm. The prediction accuracy of this model reached 81.2% for the midterm predictions and 84% for the end-of-semester predictions. Finally, the study used the Shapley additive explanation values to explore the main predictor variables of the prediction model.
first_indexed 2024-03-09T16:52:57Z
format Article
id doaj.art-a12cc339a27c4fe0adad043ff9390ebf
institution Directory of Open Access Journals
issn 2227-7102
language English
last_indexed 2024-03-09T16:52:57Z
publishDate 2023-11-01
publisher MDPI AG
record_format Article
series Education Sciences
spelling doaj.art-a12cc339a27c4fe0adad043ff9390ebf
2023-11-24T14:38:40Z
eng
MDPI AG
Education Sciences
2227-7102
2023-11-01
13
11
1130
10.3390/educsci13111130
E-Learning at-Risk Group Prediction Considering the Semester and Realistic Factors
Chenglong Zhang (0)
Hyunchul Ahn (1)
(0) College of Business Administration, Kookmin University, Seoul 02707, Republic of Korea
(1) Graduate School of Business IT, Kookmin University, Seoul 02707, Republic of Korea
https://www.mdpi.com/2227-7102/13/11/1130
at-risk prediction
dropout prediction
OULAD
voting classifier
SHAP
title E-Learning at-Risk Group Prediction Considering the Semester and Realistic Factors
topic at-risk prediction
dropout prediction
OULAD
voting classifier
SHAP
url https://www.mdpi.com/2227-7102/13/11/1130