Arabic Questions Classification Using Modified TF-IDF


Bibliographic Details
Main Author: Ali Saleh Alammary
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Subjects: Arabic text classification; feature extraction; learning analytics; machine learning; TF-IDF
Online Access: https://ieeexplore.ieee.org/document/9469876/
author Ali Saleh Alammary
collection DOAJ
description Classifying the cognitive levels of assessment questions according to Bloom’s taxonomy can help instructors design effective assessments that are well aligned with the intended learning outcomes. However, the classification process is time consuming and requires experience. Many studies have attempted to automate the process by utilizing different machine learning and text mining approaches, but none has examined the classification of Arabic questions. The purpose of this study is to examine this research gap and to introduce a new feature extraction method that would better suit Arabic questions and their unique characteristics. It also aims to provide Arab instructors with a tool that can help them automatically classify their assessment questions. To accomplish this purpose, the study developed a dataset of more than 600 Arabic assessment questions. It then proposed a modified term frequency-inverse document frequency (TF-IDF) method for extracting features from Arabic questions. Unlike the traditional TF-IDF, the proposed method was designed to take the nature of assessment questions into consideration. It was evaluated by comparing it to two methods that have been used for classifying English questions, i.e., the traditional TF-IDF and a modified TF-IDF method called term frequency part-of-speech-inverse document frequency (TFPOS-IDF). A t-test was utilized to examine whether the difference in performance between the three methods was statistically significant. The proposed method outperformed the two other methods. The overall accuracy, precision, and recall scored by the proposed method were significantly higher than those scored by the traditional TF-IDF and TFPOS-IDF methods. The evaluation results indicate the promising potential of the proposed method, which can be extended to other languages.
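For orientation, the traditional TF-IDF baseline named in the description can be sketched as follows. This is a minimal sketch of one common TF-IDF variant (term count divided by document length, multiplied by log(N/df)); it is not the modified method proposed in the article, whose weighting scheme this record does not describe. The function name `tf_idf` and the toy English questions are hypothetical and are not taken from the study's dataset.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(documents)

    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in documents:
        df.update(set(tokens))

    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy questions (whitespace-tokenized), for illustration only.
questions = [
    "define the term algorithm".split(),
    "compare the two sorting methods".split(),
    "define the term recursion".split(),
]
vectors = tf_idf(questions)
```

In this toy input, "the" occurs in all three questions and therefore receives a weight of zero, while "algorithm", which occurs in only one question, receives the largest weight in the first question.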
first_indexed 2024-12-17T02:17:35Z
format Article
id doaj.art-bb8e9f733b9345acad167533b08b44d7
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-17T02:17:35Z
publishDate 2021-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-bb8e9f733b9345acad167533b08b44d7
2022-12-21T22:07:21Z
eng
IEEE
IEEE Access
2169-3536
2021-01-01
Volume 9, pages 95109-95122
DOI: 10.1109/ACCESS.2021.3094115
IEEE document 9469876
Arabic Questions Classification Using Modified TF-IDF
Ali Saleh Alammary (ORCID: https://orcid.org/0000-0002-4186-5786)
College of Computing and Informatics, Saudi Electronic University, Riyadh, Saudi Arabia
https://ieeexplore.ieee.org/document/9469876/
Arabic text classification; feature extraction; learning analytics; machine learning; TF-IDF
title Arabic Questions Classification Using Modified TF-IDF
topic Arabic text classification
feature extraction
learning analytics
machine learning
TF-IDF