French vital records data gathering and analysis through image processing and machine learning algorithms

Vital records are rich in meaningful historical data about both urban and rural inhabitants, which can be used, among other things, to study past populations and reveal their social, economic and demographic characteristics. However, these studies encounter a main diffi...

Bibliographic Details
Main Authors: Cyprien Plateau-Holleville, Enzo Bonnot, Franck Gechter, Laurent Heyberger
Format: Article
Language: English
Published: Nicolas Turenne 2021-07-01
Series: Journal of Data Mining and Digital Humanities
Subjects:
Online Access: https://jdmdh.episciences.org/7327/pdf
collection DOAJ
description Vital records are rich in meaningful historical data about both urban and rural inhabitants, which can be used, among other things, to study past populations and reveal their social, economic and demographic characteristics. However, such studies face a major difficulty in collecting the data they need, since most of these records are scanned documents that must be transcribed manually before the data can be gathered and exploited from a historical point of view. This step slows down historical research and is an obstacle to a better knowledge of population habits in relation to social conditions. In this paper, we therefore present a modular and self-sufficient analysis pipeline that uses state-of-the-art algorithms, works mostly regardless of the document layout, and aims to automate this data extraction process.
first_indexed 2024-03-11T21:04:39Z
format Article
id doaj.art-76278c95fabd41729e099ba0dcaa7a67
institution Directory Open Access Journal
issn 2416-5999
language English
last_indexed 2024-04-25T01:56:26Z
publishDate 2021-07-01
publisher Nicolas Turenne
record_format Article
series Journal of Data Mining and Digital Humanities
topic handwritten text recognition
machine learning
optical character recognition
historical data
[info.info-cv]computer science [cs]/computer vision and pattern recognition [cs.cv]
[shs.hist]humanities and social sciences/history
[info.info-ai]computer science [cs]/artificial intelligence [cs.ai]
url https://jdmdh.episciences.org/7327/pdf