Budowa i zastosowania korpusu monitorującego MoncoPL [The construction and applications of the MoncoPL monitor corpus]

Bibliographic Details
Main Author:	Piotr Pęzik
Format:	Article
Language:	pol
Published:	Wydawnictwo Uniwersytetu Śląskiego / University of Silesia Press 2020-11-01
Series:	Forum Lingwistyczne
Subjects:	MoncoPL; monitor corpus; Polish; diachronic corpora
Online Access:	https://www.journals.us.edu.pl/index.php/FL/article/view/10335

Description
Summary:	This paper introduces the methodology of compiling and maintaining MoncoPL, a large monitor corpus of web-based Polish. It also gives an overview of the search engine of the same name to show how the size and composition of the corpus, which currently exceeds 5.6 billion word tokens, facilitate research on the distributional properties of rare words, neologisms and phraseological units. Finally, the article illustrates some advantages of using a densely sampled diachronic corpus to observe frequency trends and cycles of various constructions in online media discourse.
ISSN:	2449-9587, 2450-2758
