Semantic Text Segmentation from Synthetic Images of Full-Text Documents
An algorithm, divided into multiple modules, for generating images of full-text documents is presented. These images can be used to train, test, and evaluate models for Optical Character Recognition (OCR). The algorithm is modular: individual parts can be changed and tweaked to generate desired i...
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | English |
Published: | Russian Academy of Sciences, St. Petersburg Federal Research Center, 2019-12-01 |
Series: | Информатика и автоматизация |
Online Access: | http://ia.spcras.ru/index.php/sp/article/view/4527 |