Computational morphology systems for Zulu – a comparison

Bibliographic Details
Main Author: Sonja Bosch
Format: Article
Language: English
Published: Nordic Africa Research Network 2020-10-01
Series: Nordic Journal of African Studies
Online Access: https://www.njas.fi/njas/article/view/548
Description: The morphological analysis of Bantu languages, particularly those with a conjunctive orthography such as Zulu, is crucial not only for accurate corpus searches by Bantu linguists, but also as a basic enabling application that facilitates the development of more advanced tools and practical language processing applications, such as tokenising, disambiguation, part-of-speech tagging, parsing and machine translation. This article compares four freely available computational morphology systems for Zulu: isiZulu.net, a Zulu–English online dictionary that also offers morphological analysis; ZulMorph, a finite-state morphological analyser for Zulu, currently available as a finite-state morphology demo; the NCHLT (National Centre for HLT) IsiZulu Morphological Decomposer, an open source morphological decomposer available as modules and data; and CHIPMUNK, a morphological segmenter and stemmer that contains components for modelling Zulu morphotactics. The criteria considered include accessibility and lookup capacity, embedded lexicons, the degree of granularity of morphological analysis or decomposition, and the documentation of the tagsets used for analysis. The results of an evaluation based on recall and precision are also presented. This first comparison of the four systems is based on output examples covering a broad range of word categories of varying morphological complexity, extracted by random sampling from the freely available Leipzig Wortschatz Collection corpus.
ISSN: 1459-9465
Volume: 29, Issue: 3
DOI: 10.53228/njas.v29i3.548
Author affiliation: Department of African Languages, University of South Africa (UNISA)
Keywords: computational morphology systems; morphological analyser; morphological decomposer; segmentation; Zulu morphology