Machine learning for ancient languages: a survey

Ancient languages preserve the cultures and histories of the past. However, their study is fraught with difficulties, and experts must tackle a range of challenging text-based tasks, from deciphering lost languages to restoring damaged inscriptions, to determining the authorship of works of literatu...


Bibliographic Details
Main Authors: Sommerschield, T, Assael, Y, Pavlopoulos, J, Stefanak, V, Senior, A, Dyer, C, Bodel, J, Prag, J, Androutsopoulos, I, Freitas, ND
Format: Journal article
Language: English
Published: MIT Press 2023
collection OXFORD
description Ancient languages preserve the cultures and histories of the past. However, their study is fraught with difficulties, and experts must tackle a range of challenging text-based tasks, from deciphering lost languages to restoring damaged inscriptions, to determining the authorship of works of literature. Technological aids have long supported the study of ancient texts, but in recent years advances in artificial intelligence and machine learning have enabled analyses on a scale and in a detail that are reshaping the field of humanities, similarly to how microscopes and telescopes have contributed to the realm of science. This article aims to provide a comprehensive survey of published research using machine learning for the study of ancient texts written in any language, script, and medium, spanning over three and a half millennia of civilizations around the ancient world. To analyze the relevant literature, we introduce a taxonomy of tasks inspired by the steps involved in the study of ancient documents: digitization, restoration, attribution, linguistic analysis, textual criticism, translation, and decipherment. This work offers three major contributions: first, mapping the interdisciplinary field carved out by the synergy between the humanities and machine learning; second, highlighting how active collaboration between specialists from both fields is key to producing impactful and compelling scholarship; third, highlighting promising directions for future work in this field. Thus, this work promotes and supports the continued collaborative impetus between the humanities and machine learning.
first_indexed 2024-03-07T08:21:44Z
id oxford-uuid:7c017a1d-d859-4a6d-abb6-f9151abc9636
institution University of Oxford
last_indexed 2024-03-07T08:21:44Z
record_format dspace