Assessing English language sentences readability using machine learning models

Readability has been an active field of research since the late nineteenth century and is vigorously pursued to date. The recent boom in data-driven machine learning has created a viable path forward for readability classification and ranking. The evaluation of text readability is a time-honoured issue with e...

Bibliographic Details
Main Authors:	Shazia Maqsood, Abdul Shahid, Muhammad Tanvir Afzal, Muhammad Roman, Zahid Khan, Zubair Nawaz, Muhammad Haris Aziz
Format:	Article
Language:	English
Published:	PeerJ Inc. 2022-01-01
Series:	PeerJ Computer Science
Subjects:	Sentence readability, Flesch-Kincaid, Language learning, Machine learning, Natural language processing
Online Access:	https://peerj.com/articles/cs-818.pdf
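
The subject list names Flesch-Kincaid as the measure used to label sentences. As a rough illustration only, the sketch below computes the standard Flesch-Kincaid grade level for a piece of text. It assumes the grade-level variant of the formula, and the syllable counter is a simple vowel-group heuristic; the function names `count_syllables` and `flesch_kincaid_grade` are hypothetical, and this is not the authors' implementation or their seven-level mapping.

```python
import re


def count_syllables(word: str) -> int:
    """Estimate syllables with a vowel-group heuristic (an assumption, not the paper's method)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent unless the word ends in "le".
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    for sentence in (
        "The cat sat on the mat.",
        "Comprehensive evaluation necessitates considerable deliberation.",
    ):
        print(f"{flesch_kincaid_grade(sentence):6.2f}  {sentence}")
```

Short sentences made of one-syllable words can score below zero, so any mapping from scores to discrete readability levels would need its own thresholds, which this record does not specify.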

_version_	1798035196663037952
author	Shazia Maqsood, Abdul Shahid, Muhammad Tanvir Afzal, Muhammad Roman, Zahid Khan, Zubair Nawaz, Muhammad Haris Aziz
author_sort	Shazia Maqsood
collection	DOAJ
description	Readability has been an active field of research since the late nineteenth century and is vigorously pursued to date. The recent boom in data-driven machine learning has created a viable path forward for readability classification and ranking. The evaluation of text readability is a time-honoured issue with even more relevance in today’s information-rich world. This paper addresses the task of readability assessment for the English language. Given an input sentence, the objective is to predict its level of readability, which corresponds to the level of literacy anticipated from the target readers. Readability plays a crucial role in the drafting and comprehension processes of English language learning. Selecting and presenting a suitable collection of sentences for English language learners may play a vital role in improving their learning curve. In this research, 30,000 English sentences were used for experimentation. They were annotated into seven readability levels using Flesch-Kincaid. Various experiments were then conducted using five machine learning algorithms: KNN, SVM, LR, NB, and ANN. The classification models render excellent and stable results. The ANN model obtained an F-score of 0.95 on the test set. The developed model may be used in educational settings for tasks such as language learning and assessing the reading and writing abilities of a learner.
first_indexed	2024-04-11T20:54:53Z
format	Article
id	doaj.art-9487469f6cf547ca8d8efc10af3d22c2
institution	Directory of Open Access Journals
issn	2376-5992
language	English
last_indexed	2024-04-11T20:54:53Z
publishDate	2022-01-01
publisher	PeerJ Inc.
record_format	Article
series	PeerJ Computer Science
spelling	doaj.art-9487469f6cf547ca8d8efc10af3d22c2 | 2022-12-22T04:03:43Z | eng | PeerJ Inc. | PeerJ Computer Science | 2376-5992 | 2022-01-01 | 7 | e818 | 10.7717/peerj-cs.818 | Assessing English language sentences readability using machine learning models | Shazia Maqsood (Institute of Computing, Kohat University of Science and Technology, Kohat, KPK, Pakistan) | Abdul Shahid (Institute of Computing, Kohat University of Science and Technology, Kohat, KPK, Pakistan) | Muhammad Tanvir Afzal (NAMAL Institute of Mianwali, Mianwali, Punjab, Pakistan) | Muhammad Roman (Institute of Computing, Kohat University of Science and Technology, Kohat, KPK, Pakistan) | Zahid Khan (Robotics and Internet of Things Lab, Prince Sultan University, Riyadh, Saudi Arabia) | Zubair Nawaz (Department of Data Science, Faculty of Computing and Information Technology, University of the Punjab, Lahore, Punjab, Pakistan) | Muhammad Haris Aziz (Mechanical Engineering Department, University of Sargodha, Sargodha, Sargodha, Punjab, Pakistan) | https://peerj.com/articles/cs-818.pdf | Sentence readability | Flesch-Kincaid | Language learning | Machine learning | Natural language processing
title	Assessing English language sentences readability using machine learning models
topic	Sentence readability, Flesch-Kincaid, Language learning, Machine learning, Natural language processing
url	https://peerj.com/articles/cs-818.pdf

Similar Items