A case study of using AI for General Certificate of Secondary Education (GCSE) grade prediction in a selective independent school in England

Bibliographic Details
Main Author: Gyorgy Denes
Format: Article
Language: English
Published: Elsevier 2023-01-01
Series: Computers and Education: Artificial Intelligence
Online Access: http://www.sciencedirect.com/science/article/pii/S2666920X23000085
Description
Summary: The COVID-19 pandemic has created significant challenges for UK schools, but a time of cancelled exams and uncertainty around future examinations can provide opportunities to explore novel assessment methods. Hence, the 2020 proposal of the Ofqual algorithm, which combines teachers' estimated grades and schools' historical performance, seemed timely. However, the algorithmically calculated grades resulted in a public backlash and withdrawal of the proposal. While the failed Ofqual algorithm could be considered an example of AI, we do not yet have a thorough understanding of its numerical accuracy and how it performs in comparison to other AI models. This paper investigates this novel application: the potential use of a range of AI models as assessment tools in a selective, independent, secondary school in England. The following questions were examined: (1) how accurate are modern AI models in predicting GCSE exam grades? (2) what are the differences in model accuracy across subjects, and can these be explained by qualitative differences in teachers' grading practices? Results indicate that while models yield acceptable mean absolute errors, individual mispredictions can be larger than desired. Subject differences highlighted that grading subjectivity is less significant in science, technology, engineering, and maths (STEM) subjects, which could explain why objective models fail to predict non-STEM grades more frequently. In summary, numerical results indicate that grade prediction could be an interesting novel application of AI, but more research is needed to reduce outliers.
ISSN: 2666-920X
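
Illustrative note: the summary contrasts an acceptable mean absolute error with individual mispredictions that are larger than desired. The Python sketch below shows how those two measures can diverge. It is not taken from the article: the grade lists are hypothetical values on the 9 to 1 GCSE scale, and the function names are chosen here for illustration only.

```python
def mean_absolute_error(actual, predicted):
    """Average size of the prediction error, in grade points."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def max_absolute_error(actual, predicted):
    """Largest single misprediction, in grade points."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return max(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical data (NOT from the article): ten students' actual and
# predicted grades on the 9-1 GCSE scale.
actual_grades = [9, 8, 7, 9, 6, 8, 7, 9, 8, 4]
predicted_grades = [9, 8, 7, 8, 6, 8, 7, 9, 8, 7]

mae = mean_absolute_error(actual_grades, predicted_grades)
worst = max_absolute_error(actual_grades, predicted_grades)
print(f"mean absolute error: {mae:.1f} grades")
print(f"largest misprediction: {worst} grades")
```

In this made-up sample, eight of the ten predictions are exact, so the average error is small, yet one student is mispredicted by three grades. A low mean can therefore coexist with an outlier that matters a great deal to the individual concerned.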