Summary: | This paper proposes LONTAR_DETC, a method for detecting handwritten Balinese characters in <italic>Lontar</italic> manuscripts. LONTAR_DETC is a deep learning architecture based on YOLO. Detecting Balinese characters in <italic>Lontar</italic> manuscripts is challenging because the characters are dense and overlapping, exhibit high variance, and contain noise, and because their classes are imbalanced. The proposed method consists of three steps: data generation, <italic>Lontar</italic> manuscript annotation, and Balinese character detection. In the first step, synthetic images with enhanced image quality are generated from original <italic>Lontar</italic> manuscript images. The second step is data annotation, which builds a new <italic>Lontar</italic> manuscript dataset. As a result, we also propose the Handwritten Balinese Character of <italic>Lontar</italic> manuscript (HBCL_DETC) dataset, a novel dataset of Balinese characters in <italic>Lontar</italic> manuscripts created specifically for the task of detecting these characters. HBCL_DETC contains 600 images comprising more than 100,000 Balinese characters annotated by experts. Finally, the third step trains the YOLOv4 detection model on the HBCL_DETC dataset. To evaluate the reliability of the dataset, we experimented with three scenarios: in the first, the detection model was trained on original images of <italic>Lontar</italic> manuscripts; in the second, it was trained with the addition of augmented grayscale images; and in the third, it was trained on HBCL_DETC. Based on the experimental results, LONTAR_DETC detects Balinese characters at a high rate, achieving a mAP of 99.55%.
|