Handwritten Digit Classification in Bangla and Hindi Using Deep Learning

Bibliographic Details
Main Authors: Jishnu Mukhoti, Sukanya Dutta, Ram Sarkar
Format: Article
Language: English
Published: Taylor & Francis Group 2020-12-01
Series: Applied Artificial Intelligence
Online Access: http://dx.doi.org/10.1080/08839514.2020.1804228
Description
Summary: Handwritten digit classification is a well-known and important problem in optical character recognition (OCR). The primary challenge is correctly classifying digits whose visual characteristics vary widely, mainly because of differences in individual writing styles. In this paper, we propose the use of Convolutional Neural Networks (CNNs) to classify handwritten Bangla and Hindi numerals. The major advantage of a CNN-based classifier is that no hand-crafted features need to be extracted from the images beforehand for efficient and accurate classification. An added benefit is that a CNN classifier provides translational invariance and a certain degree of rotational invariance during recognition. This is useful in real-time OCR systems, where input images are often not perfectly aligned with the vertical axis. In this work, we use modified versions of the well-known LeNet CNN architecture. Extensive experiments show a best-case classification accuracy of 98.2% for Bangla and 98.8% for Hindi numerals, outperforming competing models in the literature.
ISSN: 0883-9514, 1087-6545
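
The summary's claim that a CNN classifier provides translational invariance can be illustrated with a small sketch. This is not the authors' network: the kernel, the toy 8x8 images, and the helper names (`conv2d_valid`, `global_max_pool`, `place_stroke`, `response`) are all assumptions made for illustration. It uses global max pooling as a simplification; LeNet-style networks pool over local windows, so the invariance they provide is approximate rather than exact.

```python
def conv2d_valid(image, kernel):
    """Valid-mode 2D cross-correlation, the operation a CNN layer applies."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def global_max_pool(feature_map):
    """Keep only the strongest response, discarding where it occurred."""
    return max(max(row) for row in feature_map)


def place_stroke(size, top, left):
    """Blank size x size image with a 3-pixel vertical stroke at (top, left)."""
    image = [[0] * size for _ in range(size)]
    for i in range(3):
        image[top + i][left] = 1
    return image


# Hypothetical hand-picked filter that responds to a vertical stroke.
VERTICAL_KERNEL = [
    [-1, 1, -1],
    [-1, 1, -1],
    [-1, 1, -1],
]


def response(image):
    """Filter response after pooling: the same wherever the stroke sits."""
    return global_max_pool(conv2d_valid(image, VERTICAL_KERNEL))


original = place_stroke(8, top=1, left=2)
shifted = place_stroke(8, top=4, left=5)

# The feature maps differ (the peak moves with the stroke), but the
# pooled responses are equal.
print(response(original), response(shifted))
```

The convolution itself is only translation-equivariant: shifting the stroke shifts the peak in the feature map. It is the pooling step that discards the position and makes the final response independent of where the stroke was written.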