Challenges to Issues of Balance and Representativeness in African Lexicography

Abstract: Modern dictionaries depend on corpora of different sizes and types for frequency listings, concordances and collocations, illustrative sentences and grammatical information. With the help of computer software, retrieving such information has increasingly become relatively...

Bibliographic Details
Main Author: Thapelo Joseph Otlogetswe
Format: Article
Language: Afrikaans
Published: Woordeboek van die Afrikaanse Taal-WAT 2011-10-01
Series: Lexikos
Subjects:
Online Access: http://lexikos.journals.ac.za/pub/article/view/653
_version_ 1811241817604096000
author Thapelo Joseph Otlogetswe
collection DOAJ
description <p>Abstract: Modern dictionaries depend on corpora of different sizes and types for frequency listings, concordances and collocations, illustrative sentences and grammatical information. With the help of computer software, retrieving such information has become increasingly easy. However, the quality of retrieved information for lexicographic purposes depends on the information input at the stage of corpus construction. If corpora are not representative of the different language usages of a speech community, they may prove to be unreliable sources of lexicographic information. There are, however, issues in African languages which make many African corpora questionable. These issues include a lack of texts of different genres, the unavailability of balanced and representative written texts, a complete absence of spoken texts, and literacy problems in African societies. This article therefore explores the different challenges to the construction of reliable corpora in African languages. It argues that African languages face peculiar challenges and that corpus research may require a different treatment compared to European and American corpus research. It concludes that balance and representativeness appear theoretically impossible in light of sociolinguistic research on the different existing language varieties, which are difficult to represent accurately in a corpus.</p>
<p>Keywords: AFRICAN LANGUAGES, BALANCE, BANK OF ENGLISH, BORROWING, BRITISH NATIONAL CORPUS, COBUILD, CODE-SWITCHING, COMPUTERS, CORPORA, DIALECT, DICTIONARIES, FREQUENCY, LANGUAGE VARIETY, REPRESENTATIVENESS, SETSWANA, SOCIOLINGUISTICS, SPEECH, TEXT</p>
<p>Opsomming (Summary): Challenges concerning issues of balance and representativeness in African lexicography. Modern dictionaries rely on corpora of different sizes and types for frequency lists, concordances and collocations, example sentences and grammatical information. With the help of computer software, retrieving such information has become increasingly easy. The quality of the retrieved information for lexicographic purposes, however, depends on the information input at the corpus-building stage. If corpora are not representative of the different language usages of a speech community, they may prove to be unreliable sources of lexicographic information. There are, however, issues in African languages that make many African corpora problematic. These issues include the shortage of texts of different genres, the unavailability of balanced and representative written texts, the complete absence of spoken texts, and literacy problems in African communities. This article therefore examines the different challenges concerning the building of reliable African-language corpora. It argues that African languages face particular challenges and that corpus research may require a different treatment compared with European and American corpus research. In conclusion, it finds that issues of balance and representativeness appear theoretically impossible when one looks at the results of sociolinguistic research on the different existing language varieties, which are difficult to represent precisely in a corpus.</p>
<p>Sleutelwoorde (Keywords): AFRICAN LANGUAGES, BALANCE, BANK OF ENGLISH, BRITISH NATIONAL CORPUS, COBUILD, DIALECT, FREQUENCY, CODE-SWITCHING, CORPORA, BORROWING, COMPUTERS, SETSWANA, SOCIOLINGUISTICS, SPEECH, LANGUAGE VARIETY, TEXT, REPRESENTATIVENESS, DICTIONARIES</p>
first_indexed 2024-04-12T13:41:53Z
format Article
id doaj.art-39d307243d2b4cedba418625b25c737f
institution Directory Open Access Journal
issn 1684-4904
2224-0039
language Afrikaans
last_indexed 2024-04-12T13:41:53Z
publishDate 2011-10-01
publisher Woordeboek van die Afrikaanse Taal-WAT
record_format Article
series Lexikos
spelling doaj.art-39d307243d2b4cedba418625b25c737f 2022-12-22T03:30:49Z afr Woordeboek van die Afrikaanse Taal-WAT Lexikos 1684-4904 2224-0039 2011-10-01 16 10.5788/16--653 Challenges to Issues of Balance and Representativeness in African Lexicography Thapelo Joseph Otlogetswe http://lexikos.journals.ac.za/pub/article/view/653
title Challenges to Issues of Balance and Representativeness in African Lexicography
topic AFRICAN LANGUAGES
BALANCE
BANK OF ENGLISH
BORROWING
BRITISH NATIONAL CORPUS
COBUILD
CODE-SWITCHING
COMPUTERS
CORPORA
DIALECT
DICTIONARIES
FREQUENCY
LANGUAGE VARIETY
REPRESENTATIVENESS
SETSWANA
SOCIOLINGUISTICS
SPEECH
TEXT
url http://lexikos.journals.ac.za/pub/article/view/653