Schweizer Text Korpus – Theoretische Grundlagen, Korpusdesign und Abfragemöglichkeiten

Bibliographic Details
Main Authors: Bickel, Hans; Gasser, Markus; Häcki Buhofer, Annelies; Hofer, Lorenz; Schön, Christoph
Format: Article
Language: German (deu)
Published: Bern Open Publishing 2009-01-01
Series: Linguistik Online
Online Access: http://www.linguistik-online.de/39_09/bickelEtAl.pdf
Description: The SWISS TEXT CORPUS (CHTK) aims to document the German language of 20th-century Switzerland comprehensively. In this role, and in its parallel function as a sub-corpus of the Corpus C4, which will consist of 20 million text words (tokens) each from Germany, Austria, Italy/South Tyrol and Switzerland, it serves as a classical reference corpus both for Standard German in Switzerland and for the entire German-speaking area of Western Europe. A reference corpus should comprehensively represent the central repertoire of a language, i.e. its generally used vocabulary, which is why questions of corpus structure and general planning (corpus design) play a decisive role (cf. Lemnitzer/Zinsmeister 2006: 106, where the reference corpus is contrasted with the special corpus). Four and a half years after the start of the project, the SWISS TEXT CORPUS was made available to the general public as a research instrument in April 2009. This article briefly outlines the history of the research project, discusses the fundamental and specific decisions that had to be made in designing such a reference corpus, and describes how the CHTK is compiled. It concludes with an overview of some of the retrieval and analysis options the CHTK offers, thereby indicating the potential of this new research instrument and supplying the background knowledge required to work with it. For reasons of space, working methods and corpus-driven approaches cannot be discussed here (cf. Bubenhofer 2006, 2008).
ISSN: 1615-3014