Experimental Bootstrapping of Morphological Analysers for Nguni Languages
This paper addresses the experimental bootstrapping of the development of broad-coverage finite-state morphological analysers for Xhosa, Swati and (Southern) Ndebele by using an existing prototype of a morphological analyser for Zulu. These languages are both morphologically complex and resource-sc...
Main Authors: | Sonja Bosch, Laurette Pretorius, Axel Fleisch |
---|---|
Format: | Article |
Language: | English |
Published: | Nordic Africa Research Network, 2008-06-01 |
Series: | Nordic Journal of African Studies |
Online Access: | https://www.njas.fi/njas/article/view/237 |
_version_ | 1827827632277815296 |
---|---|
author | Sonja Bosch Laurette Pretorius Axel Fleisch |
author_facet | Sonja Bosch Laurette Pretorius Axel Fleisch |
author_sort | Sonja Bosch |
collection | DOAJ |
description |
This paper addresses the experimental bootstrapping of the development of broad-coverage finite-state morphological analysers for Xhosa, Swati and (Southern) Ndebele by using an existing prototype of a morphological analyser for Zulu. These languages are both morphologically complex and resource-scarce. The research question is whether bootstrapping is feasible across the language boundaries between these closely related varieties. The objective is an assessment of the recognition rates yielded by the Zulu morphological analyser for the three related languages. The strategy is to use bootstrapping techniques that consist of the following steps: applying the analyser to corpus data from all languages, identifying (types of) failures, and implementing the respective changes in the analyser. The results show that the high degree of shared typological properties and formal similarities among the Nguni varieties warrants a modular bootstrapping approach. Word forms in these languages that were recognized by the Zulu analyser were mostly adequately analysed. Therefore, the focus lies on providing the necessary adaptations based on an analysis of the failure output for each language. As a result, the development of analysers for Xhosa, Swati and Ndebele is considerably faster than the creation of the Zulu prototype. The paper concludes with comments on the feasibility of the experiment, and the results of the evaluation.
|
first_indexed | 2024-03-12T03:35:50Z |
format | Article |
id | doaj.art-87ee4bbf7bd547fc8f8696e52503ac34 |
institution | Directory Open Access Journal |
issn | 1459-9465 |
language | English |
last_indexed | 2024-03-12T03:35:50Z |
publishDate | 2008-06-01 |
publisher | Nordic Africa Research Network |
record_format | Article |
series | Nordic Journal of African Studies |
spelling | doaj.art-87ee4bbf7bd547fc8f8696e52503ac34; 2023-09-03T13:15:33Z; eng; Nordic Africa Research Network; Nordic Journal of African Studies; ISSN 1459-9465; 2008-06-01; vol. 17, no. 2; 10.53228/njas.v17i2.237; Experimental Bootstrapping of Morphological Analysers for Nguni Languages; Sonja Bosch (University of South Africa); Laurette Pretorius (University of South Africa and Meraka Institute, CSIR); Axel Fleisch (University of South Africa and University of Helsinki); https://www.njas.fi/njas/article/view/237 |
spellingShingle | Sonja Bosch Laurette Pretorius Axel Fleisch Experimental Bootstrapping of Morphological Analysers for Nguni Languages Nordic Journal of African Studies |
title | Experimental Bootstrapping of Morphological Analysers for Nguni Languages |
title_full | Experimental Bootstrapping of Morphological Analysers for Nguni Languages |
title_fullStr | Experimental Bootstrapping of Morphological Analysers for Nguni Languages |
title_full_unstemmed | Experimental Bootstrapping of Morphological Analysers for Nguni Languages |
title_short | Experimental Bootstrapping of Morphological Analysers for Nguni Languages |
title_sort | experimental bootstrapping of morphological analysers for nguni languages |
url | https://www.njas.fi/njas/article/view/237 |
work_keys_str_mv | AT sonjabosch experimentalbootstrappingofmorphologicalanalysersforngunilanguages AT laurettepretorius experimentalbootstrappingofmorphologicalanalysersforngunilanguages AT axelfleisch experimentalbootstrappingofmorphologicalanalysersforngunilanguages |
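
The bootstrapping strategy summarised in the description (apply the analyser to corpus data, identify failures, implement changes) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the real analysers are finite-state networks, whereas here a plain lookup table stands in for the analyser. The names `make_analyser`, `apply_analyser`, `recognition_rate` and `TOY_ZULU_LEXICON`, as well as the segmentations shown, are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from typing import Callable, Iterable

# An analyser maps a word form to zero or more analyses (assumed interface).
Analyser = Callable[[str], list[str]]

# Hypothetical toy lexicon standing in for the Zulu prototype.
# The segmentations are illustrative, not output of the real analyser.
TOY_ZULU_LEXICON = {
    "umuntu": ["umu+ntu"],
    "abantu": ["aba+ntu"],
}


def make_analyser(lexicon: dict[str, list[str]]) -> Analyser:
    """Wrap a lookup table as an analyser; unknown words get no analyses."""
    def analyse(word: str) -> list[str]:
        return lexicon.get(word, [])
    return analyse


def apply_analyser(analyse: Analyser, corpus: Iterable[str]):
    """Step 1 and 2: run the analyser and collect the failure output."""
    recognised: dict[str, list[str]] = {}
    failures: Counter = Counter()
    for token in corpus:
        analyses = analyse(token)
        if analyses:
            recognised[token] = analyses
        else:
            failures[token] += 1  # count how often each form fails
    return recognised, failures


def recognition_rate(analyse: Analyser, corpus: Iterable[str]) -> float:
    """Share of corpus tokens that receive at least one analysis."""
    tokens = list(corpus)
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if analyse(t)) / len(tokens)


if __name__ == "__main__":
    # Tiny made-up corpus mixing forms the toy lexicon knows and one it does not.
    corpus = ["umuntu", "umntu", "abantu", "umntu"]
    zulu = make_analyser(TOY_ZULU_LEXICON)
    print(recognition_rate(zulu, corpus))
    _, failures = apply_analyser(zulu, corpus)
    print(failures.most_common())

    # Step 3: implement a change based on the failure output, then re-run.
    adapted = make_analyser({**TOY_ZULU_LEXICON, "umntu": ["um+ntu"]})
    print(recognition_rate(adapted, corpus))
```

Sorting the failures by frequency is one plausible way to decide which adaptations to implement first; the record itself does not say how the authors prioritised them.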