Optical Character Recognition for Quranic Image Similarity Matching
Optical character recognition (OCR) is the detection, recognition, and conversion of the characters in an image into text. A distinct type of OCR, namely Arabic OCR, is used to process Arabic characters. OCR is increasingly used in many applications, where this process is preferr...
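Main Authors: | Faiz Alotaibi, Muhamad Taufik Abdullah, Rusli Bin Hj Abdullah, Rahmita Wirza Binti O. K. Rahmat, Ibrahim Abaker Targio Hashem, Arun Kumar Sangaiah |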
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2018-01-01 |
Series: | IEEE Access |
Subjects: | Image processing, character recognition, Quranic diacritics, knn, optimization |
Online Access: | https://ieeexplore.ieee.org/document/8101474/ |
_version_ | 1828929078344810496 |
---|---|
author | Faiz Alotaibi Muhamad Taufik Abdullah Rusli Bin Hj Abdullah Rahmita Wirza Binti O. K. Rahmat Ibrahim Abaker Targio Hashem Arun Kumar Sangaiah |
author_sort | Faiz Alotaibi |
collection | DOAJ |
description | Optical character recognition (OCR) is the detection, recognition, and conversion of the characters in an image into text. A distinct type of OCR, namely Arabic OCR, is used to process Arabic characters. OCR is increasingly used in applications where a process should run automatically, without human involvement. The Quranic text contains two elements, namely, diacritics and characters. However, processing these elements may cause the OCR system to malfunction and reduce its accuracy. In this paper, a new method is proposed to check the similarity and originality of Quranic content. This method combines Quranic diacritic and character recognition techniques. Diacritic detection is performed using a region-based algorithm, and an optimization technique is applied to increase the recognition ratio. Character recognition is performed using the projection method, again with an optimization technique applied to increase the recognition ratio. The result of the proposed method is compared with the standard Mushaf al Madinah benchmark to find similarities that match the text of the Holy Quran. The obtained accuracy was superior to that of the other tested K-nearest neighbor (knn) algorithm and to published results in the literature. The accuracies were 96.4286% and 92.3077% better in the improved knn algorithm for diacritics and characters, respectively, than in the knn algorithm. |
first_indexed | 2024-12-14T00:12:58Z |
format | Article |
id | doaj.art-e241a8171fe74bc3be7f1fb1f22395b7 |
institution | Directory of Open Access Journals |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-14T00:12:58Z |
publishDate | 2018-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-e241a8171fe74bc3be7f1fb1f22395b7; 2022-12-21T23:25:40Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2018-01-01; vol. 6; pp. 554-562; DOI 10.1109/ACCESS.2017.2771621; article 8101474; Optical Character Recognition for Quranic Image Similarity Matching; Faiz Alotaibi, Muhamad Taufik Abdullah, Rusli Bin Hj Abdullah, and Rahmita Wirza Binti O. K. Rahmat (Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang, Malaysia); Ibrahim Abaker Targio Hashem (https://orcid.org/0000-0001-7611-9540; Department of Computing Technology, Asia Pacific University of Technology and Innovation Technology, Kuala Lumpur, Malaysia); Arun Kumar Sangaiah (https://orcid.org/0000-0002-0229-2460; School of Computing Science and Engineering, VIT University, Vellore, India); https://ieeexplore.ieee.org/document/8101474/; keywords: Image processing, character recognition, Quranic diacritics, knn, optimization |
title | Optical Character Recognition for Quranic Image Similarity Matching |
topic | Image processing character recognition Quranic diacritics knn optimization |
url | https://ieeexplore.ieee.org/document/8101474/ |
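
The description above says that character recognition is "performed based on the projection method" but the record gives no further detail. As a rough illustration only, the following Python sketch shows the general idea of projection profiles on a binary image: counting ink pixels per row and per column, then treating empty gaps as boundaries. The function names, the list-of-lists image representation, and the zero threshold are all assumptions made for this example; this is not the authors' implementation, and their optimization step and knn classifier are not shown.

```python
from typing import List, Tuple

# Assumed representation: a binary image as rows of 0/1 values (1 = ink).
Image = List[List[int]]


def horizontal_projection(image: Image) -> List[int]:
    """Count ink pixels in each row of the image."""
    return [sum(row) for row in image]


def vertical_projection(image: Image) -> List[int]:
    """Count ink pixels in each column of the image."""
    if not image:
        return []
    return [sum(column) for column in zip(*image)]


def find_runs(profile: List[int], threshold: int = 0) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, where profile > threshold.

    Each run is a candidate text line (horizontal profile) or a candidate
    connected piece of text (vertical profile).
    """
    runs: List[Tuple[int, int]] = []
    start = None
    for index, value in enumerate(profile):
        if value > threshold and start is None:
            start = index                  # a run of ink begins here
        elif value <= threshold and start is not None:
            runs.append((start, index))    # the run ended at the previous index
            start = None
    if start is not None:                  # a run that reaches the image edge
        runs.append((start, len(profile)))
    return runs


# Hypothetical 4x7 toy image: one text line containing two separate blobs.
toy = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
print(find_runs(horizontal_projection(toy)))  # row ranges that contain ink
print(find_runs(vertical_projection(toy)))    # column ranges that contain ink
```

Because Arabic script is cursive, gaps in the vertical profile usually separate disconnected pieces of a word rather than individual letters, so a practical system would need additional segmentation logic beyond this sketch.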