Arabic Questions Classification Using Modified TF-IDF

Classifying the cognitive levels of assessment questions according to Bloom’s taxonomy can help instructors design effective assessments that are well aligned with the intended learning outcomes. However, the classification process is time consuming and requires experience. Many studies h...

Bibliographic Details
Main Author:	Ali Saleh Alammary
Format:	Article
Language:	English
Published:	IEEE 2021-01-01
Series:	IEEE Access
Subjects:	Arabic text classification; feature extraction; learning analytics; machine learning; TF-IDF
Online Access:	https://ieeexplore.ieee.org/document/9469876/

_version_	1818652160158597120
author	Ali Saleh Alammary
author_facet	Ali Saleh Alammary
author_sort	Ali Saleh Alammary
collection	DOAJ
description	Classifying the cognitive levels of assessment questions according to Bloom’s taxonomy can help instructors design effective assessments that are well aligned with the intended learning outcomes. However, the classification process is time consuming and requires experience. Many studies have attempted to automate the process by utilizing different machine learning and text mining approaches, but none has examined the classification of Arabic questions. The purpose of this study is to examine this research gap and to introduce a new feature extraction method that would better suit Arabic questions and their unique characteristics. It also aims to provide Arab instructors with a tool that can help them automatically classify their assessment questions. To accomplish this purpose, the study developed a dataset of more than 600 Arabic assessment questions. It then proposed a modified term frequency-inverse document frequency (TF-IDF) method for extracting features from Arabic questions. Unlike the traditional TF-IDF, the proposed method was designed to take the nature of assessment questions into consideration. It was evaluated by comparing it to two methods that have been used for classifying English questions, i.e., the traditional TF-IDF and a modified TF-IDF method called term frequency part-of-speech-inverse document frequency (TFPOS-IDF). A t-test was utilized to examine whether the difference in performance between the three methods was statistically significant. The proposed method outperformed the two other methods. The overall accuracy, precision, and recall scored by the proposed method were significantly higher than those scored by the traditional TF-IDF and TFPOS-IDF methods. The evaluation results indicate the promising potential of the proposed method, which can be extended to other languages.
first_indexed	2024-12-17T02:17:35Z
format	Article
id	doaj.art-bb8e9f733b9345acad167533b08b44d7
institution	Directory of Open Access Journals
issn	2169-3536
language	English
last_indexed	2024-12-17T02:17:35Z
publishDate	2021-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-bb8e9f733b9345acad167533b08b44d7; 2022-12-21T22:07:21Z; eng; IEEE; IEEE Access; 2169-3536; 2021-01-01; 9; 95109; 95122; 10.1109/ACCESS.2021.3094115; 9469876; Arabic Questions Classification Using Modified TF-IDF; Ali Saleh Alammary (https://orcid.org/0000-0002-4186-5786), College of Computing and Informatics, Saudi Electronic University, Riyadh, Saudi Arabia; https://ieeexplore.ieee.org/document/9469876/; Arabic text classification; feature extraction; learning analytics; machine learning; TF-IDF
spellingShingle	Ali Saleh Alammary Arabic Questions Classification Using Modified TF-IDF IEEE Access Arabic text classification feature extraction learning analytics machine learning TF-IDF
title	Arabic Questions Classification Using Modified TF-IDF
title_full	Arabic Questions Classification Using Modified TF-IDF
title_fullStr	Arabic Questions Classification Using Modified TF-IDF
title_full_unstemmed	Arabic Questions Classification Using Modified TF-IDF
title_short	Arabic Questions Classification Using Modified TF-IDF
title_sort	arabic questions classification using modified tf idf
topic	Arabic text classification; feature extraction; learning analytics; machine learning; TF-IDF
url	https://ieeexplore.ieee.org/document/9469876/
work_keys_str_mv	AT alisalehalammary arabicquestionsclassificationusingmodifiedtfidf
