Historical Documents and Automatic Text Recognition: Introduction

With this special issue of the Journal of Data Mining and Digital Humanities (JDMDH), we bring together in one single volume several experiments, projects and reflections related to automatic text recognition applied to historical documents. More and more research projects now include automatic text a...


Bibliographic Details
Main Authors: Ariane Pinche, Peter Stokes
Format: Article
Language: English
Published: Nicolas Turenne 2024-03-01
Series: Journal of Data Mining and Digital Humanities
Subjects:
Online Access: https://jdmdh.episciences.org/13247/pdf
author Ariane Pinche
Peter Stokes
collection DOAJ
description With this special issue of the Journal of Data Mining and Digital Humanities (JDMDH), we bring together in one single volume several experiments, projects and reflections related to automatic text recognition applied to historical documents. More and more research projects now include automatic text acquisition in their data processing chain, and this is true not only for projects focussed on Digital or Computational Humanities but increasingly also for those that are simply using existing digital tools as the means to an end. The increasing use of this technology has led to an automation of tasks that affects the role of the researcher in the textual production process. This new data-intensive practice makes it urgent to collect and harmonise the corpora necessary for the constitution of training sets, but also to make them available for exploitation. This special issue is therefore an opportunity to present articles combining philological and technical questions to make a scientific assessment of the use of automatic text recognition for ancient documents, its results, its contributions and the new practices induced by its use in the process of editing and exploring texts. We hope that practical aspects will be questioned on this occasion, while raising methodological challenges and considering the impact of this technology on research data. The special issue on Automatic Text Recognition (ATR) is therefore dedicated to providing a comprehensive overview of the use of ATR in the humanities field, particularly concerning historical documents in the early 2020s. This issue presents a fusion of engineering and philological aspects, catering to both beginners and experienced users interested in launching projects with ATR. The collection encompasses a diverse array of approaches, covering topics such as data creation or collection for training generic models, reaching specific objectives, technical and HTR machine architecture, segmentation methods, and image processing.
first_indexed 2025-02-18T20:42:32Z
format Article
id doaj.art-0aa874c87b904ad2a6b793699d0b0b80
institution Directory Open Access Journal
issn 2416-5999
language English
last_indexed 2025-02-18T20:42:32Z
publishDate 2024-03-01
publisher Nicolas Turenne
record_format Article
series Journal of Data Mining and Digital Humanities
spelling doaj.art-0aa874c87b904ad2a6b793699d0b0b80; 2024-10-17T15:16:58Z; eng; Nicolas Turenne; Journal of Data Mining and Digital Humanities; 2416-5999; 2024-03-01; Historical Documents and...; 10.46298/jdmdh.13247; 13247
Historical Documents and Automatic Text Recognition: Introduction
Ariane Pinche, https://orcid.org/0000-0002-7843-5050
Peter Stokes, https://orcid.org/0000-0002-9060-9340
Histoire, Archéologie et Littératures des mondes chrétiens et musulmans médiévaux
École Pratique des Hautes Études
title Historical Documents and Automatic Text Recognition: Introduction
topic atr
escriptorium
kraken
htr-united
segmonto
[shs.hist]humanities and social sciences/history
[info]computer science [cs]
[info.info-ai]computer science [cs]/artificial intelligence [cs.ai]
[info.info-lg]computer science [cs]/machine learning [cs.lg]
[info.info-mo]computer science [cs]/modeling and simulation
[shs.litt]humanities and social sciences/literature
url https://jdmdh.episciences.org/13247/pdf