Comparative analysis of text classification algorithms for automated labelling of Quranic verses

Bibliographic Details
Main Authors: Adeleke, Abdullah; Samsudin, Noor Azah; Mustapha, Aida; Mohd Nawi, Nazri
Format: Article
Language: English
Published: Insight - Indonesian Society for Knowledge and Human Development 2017
Online Access: http://eprints.uthm.edu.my/3423/1/AJ%202017%20%28487%29.pdf
Description
Summary: The ultimate goal of labelling a Quranic verse is to determine its corresponding theme. However, the existing approach to labelling Quranic verses depends primarily on the availability of Quranic scholars with expertise in the Arabic language and Tafseer. In this paper, we propose to automate the labelling of Quranic verses using text classification algorithms. We applied three text classification algorithms, namely k-Nearest Neighbour, Support Vector Machine, and Naïve Bayes, to automate the labelling procedure. In our experiments, the English translations of the verses serve as the features. Each translated verse is then classified as “Shahadah” (the first pillar of Islam) or “Pray” (the second pillar of Islam). All three text classification algorithms achieved more than 70% accuracy in labelling the Quranic verses.
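
The comparison described in the summary can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation: this record does not describe the paper's preprocessing, feature weighting, or toolkit, so everything here is an assumption. That includes the bag-of-words features, the cosine-similarity k-Nearest Neighbour, the add-one-smoothed multinomial Naive Bayes, the Pegasos-style linear SVM, all function names, and the toy sentences, which are invented placeholders and not verse translations from the paper's dataset.

```python
import math
from collections import Counter

# Hypothetical toy sentences, NOT actual verse translations from the paper.
TRAIN = [
    ("bear witness that there is one god", "Shahadah"),
    ("believe and testify to the one god", "Shahadah"),
    ("there is no deity except the one god", "Shahadah"),
    ("establish prayer at dawn and at night", "Pray"),
    ("bow down and prostrate in prayer", "Pray"),
    ("be steadfast in prayer and bow down", "Pray"),
]
TEST = [("testify that god is one", "Shahadah"),
        ("prostrate and pray at night", "Pray")]

def bow(text):
    """Bag-of-words term counts for one text (assumed feature scheme)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def knn_predict(train, text, k=3):
    """Majority label among the k most cosine-similar training texts."""
    q = bow(text)
    ranked = sorted(train, key=lambda ex: cosine(bow(ex[0]), q), reverse=True)
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]

def nb_predict(train, text):
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""
    vocab = {w for t, _ in train for w in bow(t)}
    best, best_score = None, -math.inf
    for label in sorted({l for _, l in train}):
        docs = [t for t, l in train if l == label]
        counts = Counter(w for t in docs for w in t.lower().split())
        total = sum(counts.values())
        score = math.log(len(docs) / len(train))  # log prior
        for w in text.lower().split():
            if w in vocab:  # skip words never seen in training
                score += math.log((counts[w] + 1) / (total + len(vocab)))
        if score > best_score:
            best, best_score = label, score
    return best

def svm_train(train, positive, epochs=50, lam=0.01):
    """Linear SVM (no bias term) via Pegasos-style sub-gradient descent on
    the hinge loss; examples are visited in order to stay deterministic."""
    w, t = {}, 0
    for _ in range(epochs):
        for text, label in train:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            x, y = bow(text), (1 if label == positive else -1)
            margin = y * sum(w.get(f, 0.0) * v for f, v in x.items())
            for f in w:  # shrink weights (regularisation step)
                w[f] *= 1 - eta * lam
            if margin < 1:  # example violates the margin
                for f, v in x.items():
                    w[f] = w.get(f, 0.0) + eta * y * v
    return w

def svm_predict(w, text, positive, negative):
    """Label by the sign of the linear decision score."""
    score = sum(w.get(f, 0.0) * v for f, v in bow(text).items())
    return positive if score >= 0 else negative

def accuracy(predict, test):
    """Fraction of test texts whose predicted label matches the true one."""
    return sum(predict(t) == y for t, y in test) / len(test)

w = svm_train(TRAIN, "Shahadah")
results = {
    "k-Nearest Neighbour": accuracy(lambda t: knn_predict(TRAIN, t), TEST),
    "Support Vector Machine": accuracy(
        lambda t: svm_predict(w, t, "Shahadah", "Pray"), TEST),
    "Naive Bayes": accuracy(lambda t: nb_predict(TRAIN, t), TEST),
}
for name, acc in results.items():
    print(f"{name}: {acc:.2f}")
```

The accuracies printed for this toy data say nothing about the paper's reported results. The sketch only shows the shape of the comparison: one labelled set of translated texts, three classifiers, and one accuracy figure per classifier.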