Sociolinguistic Variability of Russian Everyday Speech: A Corpus-Based Study
The paper presents recent results of a multilevel analysis of representative corpus data, conducted in order to identify key speech parameters (lexical, morphological and syntactic) that can diagnose some social/biological characteristics of a speaker or, more broadly, a modern Russian urban sociole...
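Main Authors: | Natalia Bogdanova-Beglarian, Olga Blinova, Tatiana Sherstinova, Ekaterina Baeva, Daria Gorbunova, Tatiana Popova |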
---|---|
Format: | Article |
Language: | English |
Published: | FRUCT 2020-09-01 |
Series: | Proceedings of the XXth Conference of Open Innovations Association FRUCT |
Subjects: | nlp, russian, everyday speech, pragmatics, sociolinguistics, speech corpus |
Online Access: | https://www.fruct.org/publications/acm27/files/Bog.pdf |
_version_ | 1818436547380248576 |
---|---|
author | Natalia Bogdanova-Beglarian, Olga Blinova, Tatiana Sherstinova, Ekaterina Baeva, Daria Gorbunova, Tatiana Popova |
author_facet | Natalia Bogdanova-Beglarian, Olga Blinova, Tatiana Sherstinova, Ekaterina Baeva, Daria Gorbunova, Tatiana Popova |
author_sort | Natalia Bogdanova-Beglarian |
collection | DOAJ |
description | The paper presents recent results of a multilevel analysis of representative corpus data, conducted to identify key speech parameters (lexical, morphological, and syntactic) that can diagnose certain social or biological characteristics of a speaker or, more broadly, characterize a modern Russian urban sociolect. The study is based on the corpus of everyday Russian speech One Speaker's Day. The data come from an annotated subcorpus of 289,205 tokens, which includes the recorded speech days of 57 men and 48 women, who were the research participants, as well as speech fragments of 87 men and 139 women, who were their interlocutors. Thus, the subsample comprises 144 male and 187 female speakers in total. The article also raises the question of whether a data mining approach is applicable to the subcorpus and considers the possibilities for further research using machine learning. The results are important for optimizing speech technology systems, for the theoretical understanding of linguistic processes, and for monitoring the social processes taking place in a modern Russian metropolis. |
first_indexed | 2024-12-14T17:10:31Z |
format | Article |
id | doaj.art-3125bdbaa8b2482aa7a0c171b8ff3805 |
institution | Directory of Open Access Journals |
issn | 2305-7254 2343-0737 |
language | English |
last_indexed | 2024-12-14T17:10:31Z |
publishDate | 2020-09-01 |
publisher | FRUCT |
record_format | Article |
series | Proceedings of the XXth Conference of Open Innovations Association FRUCT |
spelling | doaj.art-3125bdbaa8b2482aa7a0c171b8ff38052022-12-21T22:53:34ZengFRUCTProceedings of the XXth Conference of Open Innovations Association FRUCT2305-72542343-07372020-09-0127228829310.5281/zenodo.4026188Sociolinguistic Variability of Russian Everyday Speech: A Corpus-Based StudyNatalia Bogdanova-Beglarian0Olga Blinova1Tatiana Sherstinova2Ekaterina Baeva3Daria Gorbunova4Tatiana Popova5Saint-Petersburg State University, RussiaSaint Petersburg State University, RussiaSt. Petersburg State University, RussiaSt Petersburg state university, RussiaSPBU, RussiaSPBU, RussiaThe paper presents recent results of a multilevel analysis of representative corpus data, conducted in order to identify key speech parameters (lexical, morphological and syntactic) that can diagnose some social/biological characteristics of a speaker or, more broadly, a modern Russian urban sociolect. The study is based on the everyday Russian speech corpus One Speakers Day. Specific data were obtained on the analysis of the annotated subcorpus of 289,205 tokens, which includes recorded speech days of 57 men and 48 women, which were the research participants, as well as speech fragments of 87 men and 139 women, which were their interlocutors. Thus, the total number of speakers in the subsample amounts to 144 men and 187 women. The article also begs the question of Data Mining approach usability to the subcorpus and possibilities of further research using machine learning. The results obtained are important for the optimization of speech technologies systems, for theoretical understanding of linguistic processes, as well as for monitoring various social processes taking place in modern Russian metropolis.https://www.fruct.org/publications/acm27/files/Bog.pdfnlprussianeveryday speechpragmatcssociolinguisticsspeech corpus |
spellingShingle | Natalia Bogdanova-Beglarian Olga Blinova Tatiana Sherstinova Ekaterina Baeva Daria Gorbunova Tatiana Popova Sociolinguistic Variability of Russian Everyday Speech: A Corpus-Based Study Proceedings of the XXth Conference of Open Innovations Association FRUCT nlp russian everyday speech pragmatics sociolinguistics speech corpus |
title | Sociolinguistic Variability of Russian Everyday Speech: A Corpus-Based Study |
title_full | Sociolinguistic Variability of Russian Everyday Speech: A Corpus-Based Study |
title_fullStr | Sociolinguistic Variability of Russian Everyday Speech: A Corpus-Based Study |
title_full_unstemmed | Sociolinguistic Variability of Russian Everyday Speech: A Corpus-Based Study |
title_short | Sociolinguistic Variability of Russian Everyday Speech: A Corpus-Based Study |
title_sort | sociolinguistic variability of russian everyday speech a corpus based study |
topic | nlp, russian, everyday speech, pragmatics, sociolinguistics, speech corpus |
url | https://www.fruct.org/publications/acm27/files/Bog.pdf |
work_keys_str_mv | AT nataliabogdanovabeglarian sociolinguisticvariabilityofrussianeverydayspeechacorpusbasedstudy AT olgablinova sociolinguisticvariabilityofrussianeverydayspeechacorpusbasedstudy AT tatianasherstinova sociolinguisticvariabilityofrussianeverydayspeechacorpusbasedstudy AT ekaterinabaeva sociolinguisticvariabilityofrussianeverydayspeechacorpusbasedstudy AT dariagorbunova sociolinguisticvariabilityofrussianeverydayspeechacorpusbasedstudy AT tatianapopova sociolinguisticvariabilityofrussianeverydayspeechacorpusbasedstudy |