Optical Character Recognition for Quranic Image Similarity Matching

Optical character recognition (OCR) is the detection, recognition, and conversion of the characters in an image into text. Arabic OCR is a distinct type of OCR used to process Arabic characters. OCR is increasingly used in many applications, where this process is preferr...


Bibliographic Details
Main Authors: Faiz Alotaibi, Muhamad Taufik Abdullah, Rusli Bin Hj Abdullah, Rahmita Wirza Binti O. K. Rahmat, Ibrahim Abaker Targio Hashem, Arun Kumar Sangaiah
Format: Article
Language: English
Published: IEEE 2018-01-01
Series: IEEE Access
Subjects: Image processing; character recognition; Quranic diacritics; knn; optimization
Online Access: https://ieeexplore.ieee.org/document/8101474/
author Faiz Alotaibi
Muhamad Taufik Abdullah
Rusli Bin Hj Abdullah
Rahmita Wirza Binti O. K. Rahmat
Ibrahim Abaker Targio Hashem
Arun Kumar Sangaiah
collection DOAJ
description Optical character recognition (OCR) is the detection, recognition, and conversion of the characters in an image into text. Arabic OCR is a distinct type of OCR used to process Arabic characters. OCR is increasingly used in applications where a task should be performed automatically, without human intervention. Quranic text contains two kinds of elements, namely, diacritics and characters. However, processing these elements may cause the OCR system to malfunction and reduce its accuracy. In this paper, a new method is proposed to check the similarity and originality of Quranic content. The method combines Quranic diacritic and character recognition techniques. Diacritics are detected using a region-based algorithm, and characters are recognized using the projection method; in each case, an optimization technique is applied to increase the recognition ratio. The result of the proposed method is compared with the standard Mushaf al Madinah benchmark to find similarities that match the text of the Holy Quran. The obtained accuracy was superior to that of the other tested K-nearest neighbor (knn) algorithm and to published results in the literature. The accuracies were 96.4286% and 92.3077% better in the improved knn algorithm for diacritics and characters, respectively, than in the knn algorithm.
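The abstract names a projection method for character recognition but gives no detail. The following is a minimal sketch of generic projection-profile segmentation, not the authors' implementation: it assumes a binarized image in which ink pixels are 1, and the function names (`runs_above_zero`, `segment`) and the toy `page` array are hypothetical. Because Arabic script is cursive, a plain vertical projection like this one separates connected pieces of text rather than individual letters, and diacritics above or below the baseline would need separate handling.

```python
import numpy as np


def runs_above_zero(profile):
    """Return (start, end) pairs, end exclusive, for each run of non-zero values."""
    ink = profile > 0
    # Pad with False so every run has both a rising and a falling edge.
    padded = np.concatenate(([False], ink, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return list(zip(edges[0::2].tolist(), edges[1::2].tolist()))


def segment(binary):
    """Split a binary image into boxes (top, bottom, left, right), ends exclusive.

    Step 1: the horizontal projection (ink count per row) finds text lines.
    Step 2: within each line, the vertical projection (ink count per column)
            finds pieces separated by at least one blank column.
    """
    boxes = []
    for top, bottom in runs_above_zero(binary.sum(axis=1)):
        line = binary[top:bottom]
        for left, right in runs_above_zero(line.sum(axis=0)):
            boxes.append((top, bottom, left, right))
    return boxes


# Toy "page" (hypothetical data): two pieces on the first line, one on the second.
page = np.array(
    [
        [0, 0, 0, 0, 0, 0, 0],
        [0, 1, 1, 0, 0, 1, 0],
        [0, 1, 0, 0, 0, 1, 0],
        [0, 0, 0, 0, 0, 0, 0],
        [0, 0, 1, 1, 1, 0, 0],
        [0, 0, 0, 0, 0, 0, 0],
    ],
    dtype=np.uint8,
)

for box in segment(page):
    print(box)
```

Each box could then be cropped and passed to a classifier such as knn; the features and optimization step used in the paper are not described in this record, so they are left out of the sketch.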
first_indexed 2024-12-14T00:12:58Z
format Article
id doaj.art-e241a8171fe74bc3be7f1fb1f22395b7
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-14T00:12:58Z
publishDate 2018-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-e241a8171fe74bc3be7f1fb1f22395b7
2022-12-21T23:25:40Z
eng
IEEE
IEEE Access
2169-3536
2018-01-01
6
554
562
10.1109/ACCESS.2017.2771621
8101474
Optical Character Recognition for Quranic Image Similarity Matching
Faiz Alotaibi [0]
Muhamad Taufik Abdullah [1]
Rusli Bin Hj Abdullah [2]
Rahmita Wirza Binti O. K. Rahmat [3]
Ibrahim Abaker Targio Hashem [4] https://orcid.org/0000-0001-7611-9540
Arun Kumar Sangaiah [5] https://orcid.org/0000-0002-0229-2460
[0] Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang, Malaysia
[1] Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang, Malaysia
[2] Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang, Malaysia
[3] Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang, Malaysia
[4] Department of Computing Technology, Asia Pacific University of Technology and Innovation Technology, Kuala Lumpur, Malaysia
[5] School of Computing Science and Engineering, VIT University, Vellore, India
https://ieeexplore.ieee.org/document/8101474/
Image processing
character recognition
Quranic diacritics
knn
optimization
title Optical Character Recognition for Quranic Image Similarity Matching
topic Image processing
character recognition
Quranic diacritics
knn
optimization
url https://ieeexplore.ieee.org/document/8101474/