The Corpus of the Danish Dictionary
A Danish corpus, holding 40 million words of general language from the period 1983-92, was designed and compiled by DSL (The Society for Danish Language and Literature) to serve as a major source for a new six-volume dictionary of contemporary Danish. The corpus includes written and spoken,...
Main Authors: | Ole Norling-Christensen, Jørg Asmussen |
---|---|
Format: | Article |
Language: | Afrikaans |
Published: | Woordeboek van die Afrikaanse Taal-WAT, 2012-09-01 |
Series: | Lexikos |
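Subjects: | concordance, copyright, corpus, Danish, dictionary, frequency, language engineering, mutual information, SGML, statistics, subcorpus, t-score, text typology, word distribution |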
Online Access: | http://lexikos.journals.ac.za/pub/article/view/955 |
_version_ | 1828275625047097344 |
---|---|
author | Ole Norling-Christensen; Jørg Asmussen |
author_facet | Ole Norling-Christensen; Jørg Asmussen |
author_sort | Ole Norling-Christensen |
collection | DOAJ |
description | A Danish corpus, holding 40 million words of general language from the period 1983-92, was designed and compiled by DSL (The Society for Danish Language and Literature) to serve as a major source for a new six-volume dictionary of contemporary Danish. The corpus includes written and spoken, private and professional, general and specialised language, and each of the 44 000 text samples is annotated with formalized information on these and other features of linguistic and sociological importance. The resulting multidimensional text type specification is useful for the extraction of (virtual or real) subcorpora and for statistical analyses. Specialized software has been developed for flexible interactive concordancing and analysis. The corpus is currently accessible only at the site of DSL; nevertheless, several scholars and students have been using it in their research. The experience gained by the staff of DSL is being reused in co-operative language engineering projects within the European Union, and in 1998 a publicly available corpus will be released as an outcome of the PAROLE project. (Afrikaans title: Die korpus van die Deense Woordeboek.) |
first_indexed | 2024-04-13T06:48:47Z |
format | Article |
id | doaj.art-dc731752e2584c48984af4a6d1b2f00a |
institution | Directory of Open Access Journals |
issn | 1684-4904; 2224-0039 |
language | Afrikaans |
last_indexed | 2024-04-13T06:48:47Z |
publishDate | 2012-09-01 |
publisher | Woordeboek van die Afrikaanse Taal-WAT |
record_format | Article |
series | Lexikos |
title | The Corpus of the Danish Dictionary |
topic | concordance; copyright; corpus; danish; dictionary; frequency; language engineering; mutual information; sgml; statistics; subcorpus; t-score; text typology; word distribution |
url | http://lexikos.journals.ac.za/pub/article/view/955 |