Experimental Bootstrapping of Morphological Analysers for Nguni Languages

This paper addresses the experimental bootstrapping of the development of broad-coverage finite-state morphological analysers for Xhosa, Swati and (Southern) Ndebele by using an existing prototype of a morphological analyser for Zulu. These languages are both morphologically complex and resource-sc...

Bibliographic Details
Main Authors: Sonja Bosch, Laurette Pretorius, Axel Fleisch
Format: Article
Language: English
Published: Nordic Africa Research Network 2008-06-01
Series: Nordic Journal of African Studies
Online Access: https://www.njas.fi/njas/article/view/237
author Sonja Bosch
Laurette Pretorius
Axel Fleisch
collection DOAJ
description This paper addresses the experimental bootstrapping of the development of broad-coverage finite-state morphological analysers for Xhosa, Swati and (Southern) Ndebele by using an existing prototype of a morphological analyser for Zulu. These languages are both morphologically complex and resource-scarce. The research question is whether bootstrapping is feasible across the language boundaries between these closely related varieties. The objective is an assessment of the recognition rates yielded by the Zulu morphological analyser for the three related languages. The strategy is to use bootstrapping techniques that consist of the following steps: applying the analyser to corpus data from all languages, identifying (types of) failures, and implementing the respective changes in the analyser. The results show that the high degree of shared typological properties and formal similarities among the Nguni varieties warrants a modular bootstrapping approach. Word forms in these languages that were recognized by the Zulu analyser were mostly adequately analysed. Therefore, the focus lies on providing the necessary adaptations based on an analysis of the failure output for each language. As a result, the development of analysers for Xhosa, Swati and Ndebele is considerably faster than the creation of the Zulu prototype. The paper concludes with comments on the feasibility of the experiment, and the results of the evaluation.
first_indexed 2024-03-12T03:35:50Z
format Article
id doaj.art-87ee4bbf7bd547fc8f8696e52503ac34
institution Directory Open Access Journal
issn 1459-9465
language English
last_indexed 2024-03-12T03:35:50Z
publishDate 2008-06-01
publisher Nordic Africa Research Network
record_format Article
series Nordic Journal of African Studies
spelling doaj.art-87ee4bbf7bd547fc8f8696e52503ac34
2023-09-03T13:15:33Z
eng
Nordic Africa Research Network
Nordic Journal of African Studies
1459-9465
2008-06-01
Volume 17, Issue 2
10.53228/njas.v17i2.237
Experimental Bootstrapping of Morphological Analysers for Nguni Languages
Sonja Bosch (University of South Africa)
Laurette Pretorius (University of South Africa and Meraka Institute, CSIR)
Axel Fleisch (University of South Africa and University of Helsinki)
https://www.njas.fi/njas/article/view/237
title Experimental Bootstrapping of Morphological Analysers for Nguni Languages
url https://www.njas.fi/njas/article/view/237