Machine learning based characters recognition

This final year project studies and implements machine learning techniques for character recognition. The author was tasked with developing a mobile business card scanner app based on these techniques. The author chose to study Tesseract, which is an open-source optical char...

Bibliographic Details
Main Author: Song, Tianyi
Other Authors: Huang Guangbin
Format: Final Year Project (FYP)
Language: English
Published: 2018
Subjects: DRNTU::Engineering::Electrical and electronic engineering
Online Access: http://hdl.handle.net/10356/75485
author Song, Tianyi
author2 Huang Guangbin
author_facet Huang Guangbin
Song, Tianyi
author_sort Song, Tianyi
collection NTU
description This final year project studies and implements machine learning techniques for character recognition. The author was tasked with developing a mobile business card scanner app based on these techniques. The author chose to study Tesseract, an open-source optical character recognition (OCR) engine sponsored by Google, and embedded the Tess-two library locally in the business card scanner. The scanner was developed for Android. It scans the characters on a business card, classifies the information, and saves it into the fields of a new contact entry. Its functions include photo cropping and saving, character recognition, information extraction, and contact creation. This report covers the app design, app structure, key code, and testing results. Since OCR is the key technology of the application, its principles and development are discussed, both for basic understanding and to guide future improvement of the scanner.
first_indexed 2024-10-01T06:55:06Z
format Final Year Project (FYP)
id ntu-10356/75485
institution Nanyang Technological University
language English
last_indexed 2024-10-01T06:55:06Z
publishDate 2018
record_format dspace
spelling ntu-10356/75485 2023-07-07T16:19:53Z Machine learning based characters recognition Song, Tianyi Huang Guangbin School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering Bachelor of Engineering 2018-05-31T08:34:57Z 2018-05-31T08:34:57Z 2018 Final Year Project (FYP) http://hdl.handle.net/10356/75485 en Nanyang Technological University 42 p. application/pdf
spellingShingle DRNTU::Engineering::Electrical and electronic engineering
Song, Tianyi
Machine learning based characters recognition
title Machine learning based characters recognition
topic DRNTU::Engineering::Electrical and electronic engineering
url http://hdl.handle.net/10356/75485