Handwritten Paragraph Recognition Using Spatial Information on Russian Notebooks Dataset
Handwritten paragraph recognition is a vital aspect of handwritten document analysis, enhancing accuracy and usability across various applications. However, recognizing paragraphs in handwritten documents is challenging due to layout variations and irregularities. Spatial information, encompassing s...
Main Authors: | Samah - Mohammed, Nikolay N Teslya |
---|---|
Format: | Article |
Language: | English |
Published: | FRUCT 2023-11-01 |
Series: | Proceedings of the XXth Conference of Open Innovations Association FRUCT |
Subjects: | handwritten paragraph recognition; spatial information; russian notebooks dataset |
Online Access: | https://www.fruct.org/publications/volume-34/fruct34/files/Moh.pdf |
_version_ | 1797354978611822592 |
---|---|
author | Samah - Mohammed; Nikolay N Teslya |
author_sort | Samah - Mohammed |
collection | DOAJ |
description | Handwritten paragraph recognition is a vital aspect of handwritten document analysis, enhancing accuracy and usability across various applications. However, recognizing paragraphs in handwritten documents is challenging due to layout variations and irregularities. Spatial information, encompassing spatial relationships between text elements, is essential for accurate paragraph segmentation and document comprehension. Recent work on handwritten Russian recognition has primarily focused on character-level and line-level recognition. This study is the first attempt at paragraph-level recognition for Russian handwriting, utilizing the Vertical Attention Network (VAN) with a hybrid attention method. Key contributions include the preparation of a unique Russian dataset at the paragraph level, containing around 2600 images with PAGE XML-encoded ground truth. The VAN model was fine-tuned for whole-paragraph recognition, and comprehensive experiments were conducted, comparing its performance against alternative non-layout-aware approaches. This work advances layout-aware recognition in handwritten Russian documents, addressing an unexplored area in the field. |
first_indexed | 2024-03-08T13:56:35Z |
format | Article |
id | doaj.art-e830f47178e64439a4db802e68528dfb |
institution | Directory Open Access Journal |
issn | 2305-7254 2343-0737 |
language | English |
last_indexed | 2024-03-08T13:56:35Z |
publishDate | 2023-11-01 |
publisher | FRUCT |
record_format | Article |
series | Proceedings of the XXth Conference of Open Innovations Association FRUCT |
spelling | doaj.art-e830f47178e64439a4db802e68528dfb; 2024-01-15T12:32:23Z; eng; FRUCT; Proceedings of the XXth Conference of Open Innovations Association FRUCT; 2305-7254; 2343-0737; 2023-11-01; 341113; https://youtu.be/wmBpIKJHYZc; 10.23919/FRUCT60429.2023.10328173; Handwritten Paragraph Recognition Using Spatial Information on Russian Notebooks Dataset; Samah - Mohammed (0); Nikolay N Teslya (1); ITMO University; SPC RAS; https://www.fruct.org/publications/volume-34/fruct34/files/Moh.pdf; handwritten paragraph recognition; spatial information; russian notebooks dataset |
title | Handwritten Paragraph Recognition Using Spatial Information on Russian Notebooks Dataset |
topic | handwritten paragraph recognition; spatial information; russian notebooks dataset |
url | https://www.fruct.org/publications/volume-34/fruct34/files/Moh.pdf |