Das Internet als linguistisches Korpus

This article discusses whether the Internet can be used as a linguistic corpus. It is based on experiences in connection with the Variantenwörterbuch des Deutschen (Dictionary of Standard German Variants), which was compiled between 1997 and 2004. In order to identify national and regional variants of the Germa...


Bibliographic Details
Main Author: Hans Bickel
Format: Article
Language: German (deu)
Published: Bern Open Publishing 2006-07-01
Series: Linguistik Online
Online Access: https://bop.unibe.ch/linguistik-online/article/view/612
collection DOAJ
description This article discusses whether the Internet can be used as a linguistic corpus. It is based on experiences in connection with the Variantenwörterbuch des Deutschen (Dictionary of Standard German Variants), which was compiled between 1997 and 2004. To identify national and regional variants of the German language in Germany, Austria and Switzerland, it was necessary to work with a large linguistic corpus that could also provide frequency data for rather rare words. The question was: Is the Internet suitable as a corpus for linguistic frequency analysis? The WWW is suitable as a corpus only if (1) reliable and reproducible results can be obtained and (2) the results are closely related to the language as it is actually used. The test showed that the Internet is an extremely useful corpus for obtaining information on word frequency. Its enormous size and large number of different text types make it an extremely versatile corpus that has a systematic connection to the reality of written language.
first_indexed 2024-12-14T05:49:10Z
id doaj.art-4ada537a3a5e455cb48a612e1ce2af3e
institution Directory Open Access Journal
issn 1615-3014
last_indexed 2024-12-14T05:49:10Z
spelling doaj.art-4ada537a3a5e455cb48a612e1ce2af3e | 2022-12-21T23:14:46Z | deu | Bern Open Publishing | Linguistik Online | 1615-3014 | 2006-07-01 | 28 | 3 | 10.13092/lo.28.612 | Das Internet als linguistisches Korpus | Hans Bickel | https://bop.unibe.ch/linguistik-online/article/view/612