French vital records data gathering and analysis through image processing and machine learning algorithms
Vital records are rich in meaningful historical data about both urban and rural inhabitants, which can be used, among other things, to study past populations and reveal their social, economic and demographic characteristics. However, these studies encounter a main diffi...
Main Authors: | Cyprien Plateau-Holleville, Enzo Bonnot, Franck Gechter, Laurent Heyberger |
---|---|
Format: | Article |
Language: | English |
Published: | Nicolas Turenne, 2021-07-01 |
Series: | Journal of Data Mining and Digital Humanities |
Subjects: | handwritten text recognition; machine learning; optical character recognition; historical data |
Online Access: | https://jdmdh.episciences.org/7327/pdf |
_version_ | 1797269945021628416 |
---|---|
author | Cyprien Plateau-Holleville; Enzo Bonnot; Franck Gechter; Laurent Heyberger |
author_facet | Cyprien Plateau-Holleville; Enzo Bonnot; Franck Gechter; Laurent Heyberger |
author_sort | Cyprien Plateau-Holleville |
collection | DOAJ |
description | Vital records are rich in meaningful historical data about both urban and rural inhabitants, which can be used, among other things, to study past populations and reveal their social, economic and demographic characteristics. However, these studies face a major difficulty in collecting the necessary data, since most of these records are scanned documents that must be transcribed manually before the data can be gathered and exploited from a historical point of view. This step slows down historical research and is an obstacle to a better understanding of population habits according to social conditions. In this paper, we therefore present a modular and self-sufficient analysis pipeline that uses state-of-the-art algorithms, mostly independent of the document layout, to automate this data extraction process. |
first_indexed | 2024-03-11T21:04:39Z |
format | Article |
id | doaj.art-76278c95fabd41729e099ba0dcaa7a67 |
institution | Directory of Open Access Journals |
issn | 2416-5999 |
language | English |
last_indexed | 2024-04-25T01:56:26Z |
publishDate | 2021-07-01 |
publisher | Nicolas Turenne |
record_format | Article |
series | Journal of Data Mining and Digital Humanities |
spelling | doaj.art-76278c95fabd41729e099ba0dcaa7a67; 2024-03-07T16:54:07Z; eng; Nicolas Turenne; Journal of Data Mining and Digital Humanities; 2416-5999; 2021-07-01; 2021; 10.46298/jdmdh.7327; 7327; French vital records data gathering and analysis through image processing and machine learning algorithms; Cyprien Plateau-Holleville (0), Enzo Bonnot (1), Franck Gechter (2), Laurent Heyberger (3); (0) Université de Technologie de Belfort-Montbeliard, (1) Université de Technologie de Belfort-Montbeliard, (2) Connaissance et Intelligence Artificielle Distribuées [Dijon], (3) Franche-Comté Électronique Mécanique, Thermique et Optique - Sciences et Technologies (UMR 6174); Vital records are rich in meaningful historical data about both urban and rural inhabitants, which can be used, among other things, to study past populations and reveal their social, economic and demographic characteristics. However, these studies face a major difficulty in collecting the necessary data, since most of these records are scanned documents that must be transcribed manually before the data can be gathered and exploited from a historical point of view. This step slows down historical research and is an obstacle to a better understanding of population habits according to social conditions. In this paper, we therefore present a modular and self-sufficient analysis pipeline that uses state-of-the-art algorithms, mostly independent of the document layout, to automate this data extraction process.; https://jdmdh.episciences.org/7327/pdf; handwritten text recognition; machine learning; optical character recognition; historical data; [info.info-cv]computer science [cs]/computer vision and pattern recognition [cs.cv]; [shs.hist]humanities and social sciences/history; [info.info-ai]computer science [cs]/artificial intelligence [cs.ai] |
title | French vital records data gathering and analysis through image processing and machine learning algorithms |
topic | handwritten text recognition; machine learning; optical character recognition; historical data; [info.info-cv]computer science [cs]/computer vision and pattern recognition [cs.cv]; [shs.hist]humanities and social sciences/history; [info.info-ai]computer science [cs]/artificial intelligence [cs.ai] |
url | https://jdmdh.episciences.org/7327/pdf |