LONTAR_DETC: Dense and High Variance Balinese Character Detection Method in Lontar Manuscripts

Bibliographic Details
Main Authors: Nanik Suciati, Ni Putu Sutramiani, Daniel Siahaan
Format: Article
Language: English
Published: IEEE 2022-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/9694598/
Description
Summary: This paper proposes LONTAR_DETC, a method for detecting handwritten Balinese characters in Lontar manuscripts. LONTAR_DETC is a deep learning architecture based on YOLO. Detecting Balinese characters in Lontar manuscripts is challenging because the characters are dense and overlapping, have high variance, and contain noise, and because the character classes are imbalanced. The proposed method consists of three steps: data generation, Lontar manuscript annotation, and Balinese character detection. The first step is data generation, in which synthetic images with enhanced image quality are generated from the original Lontar manuscript images. The second step is data annotation, which builds a new Lontar manuscript dataset. As a result, we also propose the Handwritten Balinese Character of Lontar manuscript (HBCL_DETC) dataset, a novel dataset of Balinese characters in Lontar manuscripts. HBCL_DETC contains 600 images comprising more than 100,000 Balinese characters annotated by experts. Finally, the third step is training the YOLOv4 detection model on the HBCL_DETC dataset. We created this dataset specifically for the task of detecting Balinese characters in Lontar manuscripts. To evaluate the reliability of the dataset, we experimented with three scenarios. In the first scenario, the detection model was trained on original images of Lontar manuscripts; in the second, it was trained with the addition of augmented grayscale images; and in the third, it was trained on HBCL_DETC. Based on the experimental results, LONTAR_DETC detects Balinese characters at a high detection rate, with an mAP of 99.55%.
ISSN: 2169-3536