Computational morphology systems for Zulu – a comparison
Main Author: | Sonja Bosch |
---|---|
Format: | Article |
Language: | English |
Published: | Nordic Africa Research Network, 2020-10-01 |
Series: | Nordic Journal of African Studies |
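Subjects: | computational morphology systems; morphological analyser; morphological decomposer; segmentation; Zulu morphology |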
Online Access: | https://www.njas.fi/njas/article/view/548 |
_version_ | 1827827384532860928 |
---|---|
author | Sonja Bosch |
author_facet | Sonja Bosch |
author_sort | Sonja Bosch |
collection | DOAJ |
description | The morphological analysis of Bantu languages, particularly for those with a conjunctive orthography such as Zulu, is crucial not only for the purposes of accurate corpus searches for Bantu linguists, but also as a basic enabling application that facilitates the development of more advanced tools and practical language processing applications, such as tokenising, disambiguation, part-of-speech tagging, parsing and machine translation. In this article, a comparison is made between four freely available computational morphology systems for Zulu, namely isiZulu.net, a Zulu–English online dictionary that also offers morphological analysis; ZulMorph, a finite-state morphological analyser for Zulu, currently available as a finite-state morphology demo; an open source morphological decomposer (available as modules and data) listed as the NCHLT (National Centre for HLT) IsiZulu Morphological Decomposer; and CHIPMUNK, a morphological segmenter and stemmer that contains components for modelling Zulu morphotactics. Criteria that are considered for the purposes of this comparison are, among others, accessibility and lookup capacity, embedded lexicons, degree of granularity of morphological analysis or decomposition, and also the documentation of tagsets used for purposes of analysis. Furthermore, the results of an evaluation based on recall and precision are presented. Against this background, this first comparison of four available Zulu computational morphology systems will be presented, based on output examples of a broad range of word categories with varying morphological complexity extracted by means of random sampling from the freely available Leipzig Wortschatz Collection corpus. |
first_indexed | 2024-03-12T03:26:03Z |
format | Article |
id | doaj.art-86d3f69f3f3d40c98affd7b6817fcf51 |
institution | Directory Open Access Journal |
issn | 1459-9465 |
language | English |
last_indexed | 2024-03-12T03:26:03Z |
publishDate | 2020-10-01 |
publisher | Nordic Africa Research Network |
record_format | Article |
series | Nordic Journal of African Studies |
spelling | doaj.art-86d3f69f3f3d40c98affd7b6817fcf51; 2023-09-03T13:37:56Z; eng; Nordic Africa Research Network; Nordic Journal of African Studies; 1459-9465; 2020-10-01; 29(3); 10.53228/njas.v29i3.548; Computational morphology systems for Zulu – a comparison; Sonja Bosch (Department of African Languages, University of South Africa (UNISA)); https://www.njas.fi/njas/article/view/548; computational morphology systems; morphological analyser; morphological decomposer; segmentation; Zulu morphology |
spellingShingle | Sonja Bosch; Computational morphology systems for Zulu – a comparison; Nordic Journal of African Studies; computational morphology systems; morphological analyser; morphological decomposer; segmentation; Zulu morphology |
title | Computational morphology systems for Zulu – a comparison |
title_full | Computational morphology systems for Zulu – a comparison |
title_fullStr | Computational morphology systems for Zulu – a comparison |
title_full_unstemmed | Computational morphology systems for Zulu – a comparison |
title_short | Computational morphology systems for Zulu – a comparison |
title_sort | computational morphology systems for zulu a comparison |
topic | computational morphology systems; morphological analyser; morphological decomposer; segmentation; Zulu morphology |
url | https://www.njas.fi/njas/article/view/548 |
work_keys_str_mv | AT sonjabosch computationalmorphologysystemsforzuluacomparison |