OCR Correction for Corpus-assisted Discourse Studies: A Case Study of Old Newspapers
The use of OCR software to convert printed characters to digital text is a fundamental tool within diachronic approaches to Corpus-assisted Discourse Studies. However, OCR software is not fully accurate, and the resulting error rate may compromise the qualitative analysis of the studies. This pape...
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | University of Bologna, 2022-01-01 |
Series: | Umanistica Digitale |
Subjects: | |
Online Access: | https://umanisticadigitale.unibo.it/article/view/13689 |