Neural OCR Post-Hoc Correction of Historical Corpora

Abstract: Optical character recognition (OCR) is crucial for a deeper access to historical collections. OCR needs to account for orthographic variations, typefaces, or language evolution (i.e., new letters, word spellings), as the main source of character, word, or word segmentation tr...

Bibliographic Details
Main Authors: Lijun Lyu, Maria Koutraki, Martin Krickl, Besnik Fetahu
Format: Article
Language: English
Published: The MIT Press 2021-01-01
Series: Transactions of the Association for Computational Linguistics
Online Access: https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00379/100788/Neural-OCR-Post-Hoc-Correction-of-Historical

Similar Items