Līdzsvarots mūsdienu latviešu valodas tekstu korpuss un tā tekstu atlases kritēriji

Bibliographic Details
Main Author: Kristīne Levāne-Petrova
Format: Article
Language: deu
Published: Vilnius University 2012-04-01
Series: Baltistica
Subjects: The Balanced Corpus of Modern Latvian; computer linguistics
Online Access: http://www.baltistica.lt/index.php/baltistica/article/view/2113
author Kristīne Levāne-Petrova
collection DOAJ
description <p><strong>THE BALANCED CORPUS OF MODERN LATVIAN AND THE TEXT SELECTION CRITERIA</strong></p><p><em>Summary</em></p><p><em>The Balanced Corpus of Modern Latvian</em> (~3.5 million running words) was recently created at the Institute of Mathematics and Computer Science (IMCS) (see <a href="http://www.korpuss.lv" target="_blank">http://www.korpuss.lv</a>). The Corpus has been compiled from printed and electronic materials created after 1990. It is morphologically tagged automatically: for each token, all the syntactically valid interpretations are stored.</p><p>Texts for the Corpus were chosen according to several text selection criteria, for instance time, media, and domain. This article discusses the selection criteria chosen for the Corpus, the problems that arose in Corpus design and text selection, the solutions found for these problems, and future plans for the Corpus.</p>
first_indexed 2024-12-11T14:39:13Z
format Article
id doaj.art-1706349189e7497a8c40c25a0a9c622a
institution Directory Open Access Journal
issn 0132-6503
2345-0045
language deu
last_indexed 2024-12-11T14:39:13Z
publishDate 2012-04-01
publisher Vilnius University
record_format Article
series Baltistica
title Līdzsvarots mūsdienu latviešu valodas tekstu korpuss un tā tekstu atlases kritēriji
topic The Balanced Corpus of Modern Latvian
computer linguistics
url http://www.baltistica.lt/index.php/baltistica/article/view/2113