A method for measuring verb similarity for two closely related languages with application to Zulu and Xhosa

Computational resources for Nguni languages are limited, and when improving availability for one of the languages, bootstrapping from a related language’s resources may be a cost-saving approach. Doing so requires the ability to quantify the similarity between any two closely related languages so as to...


Bibliographic Details
Main Authors: Zola Mahlaza, Catharina Maria Keet
Format: Article
Language: English
Published: South African Institute of Computer Scientists and Information Technologists 2019-12-01
Series: South African Computer Journal
Online Access: https://sacj.cs.uct.ac.za/index.php/sacj/article/view/698
_version_ 1818311393123762176
author Zola Mahlaza
Catharina Maria Keet
collection DOAJ
description Computational resources for Nguni languages are limited, and when improving availability for one of the languages, bootstrapping from a related language’s resources may be a cost-saving approach. Doing so requires the ability to quantify the similarity between any two closely related languages in order to make informed decisions, but it is unclear how to measure that similarity. We devised a method for quantifying similarity by adapting four existing similarity measures, and present a method for quantifying the ratio of verbs that would need phonological conditioning due to consecutive vowels. The verbs selected are those relevant to weather forecasts in Xhosa and Zulu, newly specified as computational grammar rules. The 52 Xhosa rules and 49 Zulu rules have 42 rules in common, supporting informal impressions of their similarity. Morphosyntactic similarity reached 59.5% overall on the adapted Driver-Kroeber metric, and 99.5% when only the past tense rules are considered. This similarity score results mainly from variation in the terminals for the prefix of the verb.
first_indexed 2024-12-13T08:01:14Z
format Article
id doaj.art-2e2c9eef8b814ae6b200cb90a02558fb
institution Directory Open Access Journal
issn 1015-7999
2313-7835
language English
last_indexed 2024-12-13T08:01:14Z
publishDate 2019-12-01
publisher South African Institute of Computer Scientists and Information Technologists
record_format Article
series South African Computer Journal
spelling doaj.art-2e2c9eef8b814ae6b200cb90a02558fb; 2022-12-21T23:54:26Z; eng; South African Institute of Computer Scientists and Information Technologists; South African Computer Journal; 1015-7999; 2313-7835; 2019-12-01; 31; 2; 34–56; 10.18489/sacj.v31i2.698; 698; A method for measuring verb similarity for two closely related languages with application to Zulu and Xhosa; Zola Mahlaza; 0; Catharina Maria Keet; 1; University of Cape Town; University of Cape Town; https://sacj.cs.uct.ac.za/index.php/sacj/article/view/698
title A method for measuring verb similarity for two closely related languages with application to Zulu and Xhosa
url https://sacj.cs.uct.ac.za/index.php/sacj/article/view/698
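The rule counts in the description (52 Xhosa rules, 49 Zulu rules, 42 in common) can be put through the standard Driver-Kroeber coefficient, which divides the number of shared items by the geometric mean of the two set sizes. The sketch below is only an illustration of that textbook formula applied to the rule-set overlap. It is not the adapted metric from the article, which measures morphosyntactic similarity and reports 59.5% overall, so the two figures are not comparable. The function name `driver_kroeber` and the constant names are assumptions made for this example.

```python
import math


def driver_kroeber(shared: int, total_a: int, total_b: int) -> float:
    """Standard Driver-Kroeber coefficient: shared / sqrt(total_a * total_b).

    `shared` is the number of items common to both sets; `total_a` and
    `total_b` are the full sizes of each set. (Hypothetical helper, not
    taken from the article.)
    """
    if shared < 0 or shared > min(total_a, total_b):
        raise ValueError("shared must lie between 0 and the smaller set size")
    if total_a == 0 or total_b == 0:
        return 0.0  # an empty set shares nothing with any other set
    return shared / math.sqrt(total_a * total_b)


# Counts quoted in the record's description field.
XHOSA_RULES = 52
ZULU_RULES = 49
SHARED_RULES = 42

score = driver_kroeber(SHARED_RULES, XHOSA_RULES, ZULU_RULES)
print(f"Rule-set overlap (standard Driver-Kroeber): {score:.3f}")
```

The coefficient is 1.0 when the two sets are identical and 0.0 when they share nothing, so it rewards overlap relative to the size of both inventories rather than only the smaller one.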