A Named Entity-Annotated Corpus of 19th Century Classical Commentaries
We release a multilingual named entity (NE) corpus of 19th century commentaries on Sophocles’ Ajax. The selected commentaries are written in English, German and French, and are also replete with Latin and Greek quotations. Bibliographic entities were annotated alongside traditional named entities following our...
Main Authors: | Matteo Romanello, Sven Najem-Meyer |
---|---|
Format: | Article |
Language: | English |
Published: | Ubiquity Press, 2024-01-01 |
Series: | Journal of Open Humanities Data |
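Subjects: | historical commentaries; classics; named entity recognition; entity linking; bibliographic reference extraction |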
Online Access: | https://account.openhumanitiesdata.metajnl.com/index.php/up-j-johd/article/view/150 |
_version_ | 1797315794531516416 |
---|---|
author | Matteo Romanello; Sven Najem-Meyer |
author_facet | Matteo Romanello; Sven Najem-Meyer |
author_sort | Matteo Romanello |
collection | DOAJ |
description | We release a multilingual named entity (NE) corpus of 19th century commentaries on Sophocles’ Ajax. The selected commentaries are written in English, German and French, and are also replete with Latin and Greek quotations. Bibliographic entities were annotated alongside traditional named entities following our guidelines (Romanello & Najem-Meyer, 2022). The corpus contains about 300 annotated pages, 111,216 tokens and 7,334 entity mentions, and was featured in the HIPE-2022 shared task. Although named entity recognition (NER) showed reassuring results, optical character recognition (OCR) mistakes and the extensive use of abbreviations kept entity linking (EL) a challenging task. With these characteristics, the corpus offers an excellent way to assess the adaptability of information extraction systems to noisy, domain-specific, multilingual and multiscript environments. |
first_indexed | 2024-03-08T03:09:00Z |
format | Article |
id | doaj.art-87a29015dad14af2ba0da81f7fec182c |
institution | Directory of Open Access Journals |
issn | 2059-481X |
language | English |
last_indexed | 2024-03-08T03:09:00Z |
publishDate | 2024-01-01 |
publisher | Ubiquity Press |
record_format | Article |
series | Journal of Open Humanities Data |
spelling | doaj.art-87a29015dad14af2ba0da81f7fec182c; 2024-02-13T07:38:06Z; eng; Ubiquity Press; Journal of Open Humanities Data; 2059-481X; 2024-01-01; 1011; 10.5334/johd.150; 150; A Named Entity-Annotated Corpus of 19th Century Classical Commentaries; Matteo Romanello [0], https://orcid.org/0000-0002-7406-6286; Sven Najem-Meyer [1], https://orcid.org/0000-0002-3661-4579; [0] Institute of Archeology and Classical Studies, University of Lausanne, Lausanne; [1] Digital Humanities Laboratory, Swiss Federal Institute of Technology Lausanne, Lausanne; https://account.openhumanitiesdata.metajnl.com/index.php/up-j-johd/article/view/150; historical commentaries; classics; named entity recognition; entity linking; bibliographic reference extraction |
title | A Named Entity-Annotated Corpus of 19th Century Classical Commentaries |
title_sort | named entity annotated corpus of 19th century classical commentaries |
topic | historical commentaries; classics; named entity recognition; entity linking; bibliographic reference extraction |
url | https://account.openhumanitiesdata.metajnl.com/index.php/up-j-johd/article/view/150 |