Semi-Automatic Retrieval of Definitional Information: A Northern Sotho Case Study*

Abstract: Corpus-based terminology is currently gaining ground on the international front. It is therefore important that terminologists working on the South African Bantu languages not only take note of this development, but that they should also follow this trend, even if they do no...


Bibliographic Details
Main Author: Elsabé Taljard
Format: Article
Language: Afrikaans
Published: Woordeboek van die Afrikaanse Taal-WAT 2011-10-01
Series: Lexikos
Subjects:
Online Access: http://lexikos.journals.ac.za/pub/article/view/689
_version_ 1818196754581946368
author Elsabé Taljard
collection DOAJ
description Abstract: Corpus-based terminology is currently gaining ground on the international front. It is therefore important that terminologists working on the South African Bantu languages not only take note of this development, but that they should also follow this trend, even if they do not have the same measure of access to highly sophisticated software. The aim of this article is therefore to establish whether it is possible to retrieve definitional information on key concepts from untagged, running text by making use of affordable and easily accessible software such as WordSmith Tools. In order to answer this question, a case study is done in Northern Sotho, using textual material on linguistics as basis for a special field corpus. Syntactic and lexical patterns serving as textual markers of definitional information are identified and the success rate of the computational retrieval of definitional information is analysed and evaluated. Attention is also paid to the retrieval of specifically conceptual information, which turned out to be a fortunate by-product of semi-automatic retrieval of definitional information. Finally, it is illustrated how definitional information retrieved can be utilised in the writing of a formal terminological definition.
Keywords: TERMINOLOGY, SOUTH AFRICAN BANTU LANGUAGES, DEFINITIONAL INFORMATION, SEMI-AUTOMATIC INFORMATION RETRIEVAL, TERMINOLOGICAL DEFINITIONS, CONCEPTUAL RELATIONSHIPS, LEXICAL PATTERNS, SYNTACTIC PATTERNS, TEXTUAL MARKERS, KEYWORD-IN-CONTEXT (KWIC), WORDSMITH TOOLS
Summary (translated from the Afrikaans Opsomming): Semi-automatic retrieval of definitional information: a Northern Sotho case study. Corpus-based terminology is currently gaining ground on the international front. It is therefore important that terminologists working within the South African Bantu languages not only take note of this development, but also follow this trend, even if they do not have the same measure of access to sophisticated computer software. The aim of this article is therefore to establish whether it is possible to retrieve definitional information on key concepts from untagged, running text by using affordable and accessible software such as WordSmith Tools. In order to answer this question, a case study was done in Northern Sotho, using textual material on linguistics as the basis for a specialised corpus. Syntactic and lexical patterns that serve as textual markers of definitional information are identified, and the success rate of the computational retrieval of definitional information is analysed and evaluated. Attention is also given to the retrieval of specifically conceptual information, which is an unexpected by-product of the semi-automatic retrieval of definitional information. Finally, it is illustrated how definitional information can be applied in the writing of a formal terminological definition.
first_indexed 2024-12-12T01:39:06Z
format Article
id doaj.art-a4d8050fd8e542f0b22ea2153363d111
institution Directory Open Access Journal
issn 1684-4904
2224-0039
language Afrikaans
last_indexed 2024-12-12T01:39:06Z
publishDate 2011-10-01
publisher Woordeboek van die Afrikaanse Taal-WAT
record_format Article
series Lexikos
spelling doaj.art-a4d8050fd8e542f0b22ea2153363d111 2022-12-22T00:42:46Z afr Woordeboek van die Afrikaanse Taal-WAT Lexikos 1684-4904 2224-0039 2011-10-01 14 10.5788/14--689 Semi-Automatic Retrieval of Definitional Information: A Northern Sotho Case Study* Elsabé Taljard http://lexikos.journals.ac.za/pub/article/view/689
title Semi-Automatic Retrieval of Definitional Information: A Northern Sotho Case Study*
topic TERMINOLOGY
SOUTH AFRICAN BANTU LANGUAGES
DEFINITIONAL INFORMATION
SEMI-AUTOMATIC INFORMATION RETRIEVAL
TERMINOLOGICAL DEFINITIONS
CONCEPTUAL RELATIONSHIPS
LEXICAL PATTERNS
SYNTACTIC PATTERNS
TEXTUAL MARKERS
KEYWORD-IN-CONTEXT (KWIC)
WORDSMITH TOOLS
url http://lexikos.journals.ac.za/pub/article/view/689