Optical character recognition system for Baybayin scripts using support vector machine
In 2018, the Philippine Congress signed House Bill 1022 declaring the Baybayin script as the Philippines’ national writing system. In this regard, it is highly probable that the Baybayin and Latin scripts would appear in a single document. In this work, we propose a system that discriminates the cha...
Main Authors: | Rodney Pino, Renier Mendoza, Rachelle Sambayan |
---|---|
Format: | Article |
Language: | English |
Published: | PeerJ Inc. 2021-02-01 |
Series: | PeerJ Computer Science |
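Subjects: | Baybayin; Latin script identification; Baybayin script identification; Support vector machine; Optical character recognition |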
Online Access: | https://peerj.com/articles/cs-360.pdf |
_version_ | 1818955135013879808 |
---|---|
author | Rodney Pino, Renier Mendoza, Rachelle Sambayan |
author_facet | Rodney Pino, Renier Mendoza, Rachelle Sambayan |
author_sort | Rodney Pino |
collection | DOAJ |
description | In 2018, the Philippine Congress signed House Bill 1022 declaring the Baybayin script as the Philippines’ national writing system. In this regard, it is highly probable that the Baybayin and Latin scripts would appear in a single document. In this work, we propose a system that discriminates the characters of both scripts. The proposed system normalizes each individual character to identify whether it belongs to the Baybayin or the Latin script and then classifies which unit it represents. This gives four classification problems: (1) Baybayin and Latin script recognition, (2) Baybayin character classification, (3) Latin character classification, and (4) Baybayin diacritical marks classification. To the best of our knowledge, this is the first study that uses a Support Vector Machine (SVM) for Baybayin script recognition. This work also provides a new dataset for Baybayin, its diacritics, and Latin characters. Classification problems (1) and (4) use binary SVM, while (2) and (3) use multiclass SVM. On average, our numerical experiments yield satisfactory results: (1) has 98.5% accuracy, 98.5% precision, 98.49% recall, and 98.5% F1 Score; (2) has 96.51% accuracy, 95.62% precision, 95.61% recall, and 95.62% F1 Score; (3) has 95.8% accuracy, 95.85% precision, 95.8% recall, and 95.83% F1 Score; and (4) has 100% accuracy, 100% precision, 100% recall, and 100% F1 Score. |
first_indexed | 2024-12-20T10:33:14Z |
format | Article |
id | doaj.art-9598dac29dc544e7b584499934b0b8e0 |
institution | Directory of Open Access Journals |
issn | 2376-5992 |
language | English |
last_indexed | 2024-12-20T10:33:14Z |
publishDate | 2021-02-01 |
publisher | PeerJ Inc. |
record_format | Article |
series | PeerJ Computer Science |
spelling | doaj.art-9598dac29dc544e7b584499934b0b8e0; 2022-12-21T19:43:41Z; eng; PeerJ Inc.; PeerJ Computer Science; 2376-5992; 2021-02-01; 7; e360; 10.7717/peerj-cs.360; Optical character recognition system for Baybayin scripts using support vector machine; Rodney Pino, Renier Mendoza, Rachelle Sambayan (all: Institute of Mathematics, University of the Philippines Diliman, Quezon City, Metro Manila, Philippines); https://peerj.com/articles/cs-360.pdf; Baybayin; Latin script identification; Baybayin script identification; Support vector machine; Optical character recognition |
title | Optical character recognition system for Baybayin scripts using support vector machine |
topic | Baybayin; Latin script identification; Baybayin script identification; Support vector machine; Optical character recognition |
url | https://peerj.com/articles/cs-360.pdf |
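
The description above reports accuracy, precision, recall, and F1 score for each of the four classification problems. The following is a minimal sketch of how such figures can be computed from true and predicted labels. It assumes macro-averaging over classes, which this record does not confirm as the authors' choice; the function name `macro_metrics` and the toy labels (`"ba"`, `"ka"`, `"da"`) are hypothetical and are not taken from the paper or its dataset.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Assumption: every class counts equally (macro average); the record
    does not say which averaging the authors used.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was the truth but was missed

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        prec = tp[label] / predicted if predicted else 0.0
        rec = tp[label] / actual if actual else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical toy labels: one "ba" sample is misclassified as "ka".
y_true = ["ba", "ba", "ka", "ka", "da", "da"]
y_pred = ["ba", "ka", "ka", "ka", "da", "da"]
metrics = macro_metrics(y_true, y_pred)
print({name: round(value, 4) for name, value in metrics.items()})
```

When every prediction is correct, all four values equal 1.0, which corresponds to the 100% figures reported for problem (4).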