Text Recognition for Nepalese Manuscripts in Pracalit Script
This dataset is a model for handwritten text recognition (HTR) of Sanskrit and Newar Nepalese manuscripts in Pracalit script. This paper introduces the state of the field in Newar literature, Newar manuscripts, and HTR engines. It explains our methodology for developing the requisite ground truth co...
Main Authors: | Alexander James O’Neill, Nathan Hill |
---|---|
Format: | Article |
Language: | English |
Published: | Ubiquity Press, 2022-11-01 |
Series: | Journal of Open Humanities Data |
Subjects: | handwritten text recognition; PyLAia; Transkribus; Sanskrit; Newar; manuscripts |
Online Access: | https://openhumanitiesdata.metajnl.com/articles/90 |
_version_ | 1811178214431653888 |
---|---|
author | Alexander James O’Neill Nathan Hill |
collection | DOAJ |
description | This dataset is a model for handwritten text recognition (HTR) of Sanskrit and Newar Nepalese manuscripts in Pracalit script. This paper introduces the state of the field in Newar literature, Newar manuscripts, and HTR engines. It explains our methodology for developing the requisite ground truth consisting of manuscript images and corresponding transcriptions, training our model with a PyLAia engine, and this model’s limitations. This dataset shared on Zenodo can be used by anyone working with manuscripts in Pracalit script, which will benefit the fields of Indology and Newar studies, as well as historical and linguistic analysis. |
first_indexed | 2024-04-11T06:14:42Z |
format | Article |
id | doaj.art-e9a271569e0245f696b333972c476a12 |
institution | Directory Open Access Journal |
issn | 2059-481X |
language | English |
last_indexed | 2024-04-11T06:14:42Z |
publishDate | 2022-11-01 |
publisher | Ubiquity Press |
record_format | Article |
series | Journal of Open Humanities Data |
spelling | doaj.art-e9a271569e0245f696b333972c476a12 · 2022-12-22T04:41:06Z · eng · Ubiquity Press · Journal of Open Humanities Data · 2059-481X · 2022-11-01 · 810.5334/johd.9077 · Text Recognition for Nepalese Manuscripts in Pracalit Script · Alexander James O’Neill (0), Nathan Hill (1) · (0) Department of East Asian Languages and Cultures, SOAS University of London, London · (1) Department of East Asian Languages and Cultures, SOAS University of London, London, UK; Trinity Centre for Asian Studies, Trinity College Dublin, Dublin · This dataset is a model for handwritten text recognition (HTR) of Sanskrit and Newar Nepalese manuscripts in Pracalit script. This paper introduces the state of the field in Newar literature, Newar manuscripts, and HTR engines. It explains our methodology for developing the requisite ground truth consisting of manuscript images and corresponding transcriptions, training our model with a PyLAia engine, and this model’s limitations. This dataset shared on Zenodo can be used by anyone working with manuscripts in Pracalit script, which will benefit the fields of Indology and Newar studies, as well as historical and linguistic analysis. · https://openhumanitiesdata.metajnl.com/articles/90 · handwritten text recognition, pylaia, transkribus, sanskrit, newar, manuscripts |
title | Text Recognition for Nepalese Manuscripts in Pracalit Script |
topic | handwritten text recognition pylaia transkribus sanskrit newar manuscripts |
url | https://openhumanitiesdata.metajnl.com/articles/90 |