Detection, recognition and understanding document layout

Effective management of personal finances is essential for financial stability. Traditional expense tracking requires manually entering data into budgeting applications, which is cumbersome and error-prone. To encourage individuals to manage their personal finances, this project...

Bibliographic Details
Main Author: Loh, Yi Ze
Other Authors: Loke Yuan Ren
Format: Final Year Project (FYP)
Language: English
Published: Nanyang Technological University 2024
Subjects: Computer and Information Science
Online Access: https://hdl.handle.net/10356/175020
author Loh, Yi Ze
author2 Loke Yuan Ren
collection NTU
description Effective management of personal finances is essential for financial stability. Traditional expense tracking requires manually entering data into budgeting applications, which is cumbersome and error-prone. To encourage individuals to manage their personal finances, this project leverages advances in DocumentAI to automate the extraction of key information from receipts. Experiments were carried out with the LayoutLMv3 and Donut models to determine a suitable approach. Donut was chosen for its end-to-end approach and entity-linking capabilities. The first fine-tuned Donut model achieved an F1 score of 54% and a Tree Edit Distance accuracy of 49%. To improve performance, data augmentation techniques were used to enlarge the training dataset. The second fine-tuned Donut model achieved an F1 score of 95% and a Tree Edit Distance accuracy of 87%. To let users upload receipts and extract information for expense tracking, a Receipt Extraction bot was developed using the Telegram API and MongoDB Atlas. The scope of this project includes a comprehensive literature review of DocumentAI models, experiments on publicly available datasets, model fine-tuning, and software development.
first_indexed 2024-10-01T07:46:45Z
format Final Year Project (FYP)
id ntu-10356/175020
institution Nanyang Technological University
language English
last_indexed 2024-10-01T07:46:45Z
publishDate 2024
publisher Nanyang Technological University
record_format dspace
spelling ntu-10356/175020 2024-04-19T15:46:17Z Detection, recognition and understanding document layout Loh, Yi Ze Loke Yuan Ren School of Computer Science and Engineering yrloke@ntu.edu.sg Computer and Information Science Bachelor's degree 2024-04-18T08:19:33Z 2024-04-18T08:19:33Z 2024 Final Year Project (FYP) Loh, Y. Z. (2024). Detection, recognition and understanding document layout. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175020 https://hdl.handle.net/10356/175020 en SCSE23-0563 application/pdf Nanyang Technological University
title Detection, recognition and understanding document layout
topic Computer and Information Science
url https://hdl.handle.net/10356/175020