Sociolinguistic Variability of Russian Everyday Speech: A Corpus-Based Study

The paper presents recent results of a multilevel analysis of representative corpus data, conducted in order to identify key speech parameters (lexical, morphological and syntactic) that can diagnose some social/biological characteristics of a speaker or, more broadly, a modern Russian urban sociole...


Bibliographic Details
Main Authors: Natalia Bogdanova-Beglarian, Olga Blinova, Tatiana Sherstinova, Ekaterina Baeva, Daria Gorbunova, Tatiana Popova
Format: Article
Language: English
Published: FRUCT 2020-09-01
Series: Proceedings of the XXth Conference of Open Innovations Association FRUCT
Subjects:
Online Access: https://www.fruct.org/publications/acm27/files/Bog.pdf
_version_ 1818436547380248576
author Natalia Bogdanova-Beglarian
Olga Blinova
Tatiana Sherstinova
Ekaterina Baeva
Daria Gorbunova
Tatiana Popova
author_facet Natalia Bogdanova-Beglarian
Olga Blinova
Tatiana Sherstinova
Ekaterina Baeva
Daria Gorbunova
Tatiana Popova
author_sort Natalia Bogdanova-Beglarian
collection DOAJ
description The paper presents recent results of a multilevel analysis of representative corpus data, conducted in order to identify key speech parameters (lexical, morphological and syntactic) that can diagnose some social/biological characteristics of a speaker or, more broadly, of a modern Russian urban sociolect. The study is based on One Speaker's Day, a corpus of everyday Russian speech. Specific data were obtained from the analysis of an annotated subcorpus of 289,205 tokens, which includes the recorded "speech days" of 57 men and 48 women, who were the research participants, as well as speech fragments of 87 men and 139 women, who were their interlocutors. Thus, the total number of speakers in the subsample amounts to 144 men and 187 women. The article also raises the question of whether a data mining approach is applicable to the subcorpus and considers possibilities for further research using machine learning. The results obtained are important for the optimization of speech technology systems, for the theoretical understanding of linguistic processes, and for monitoring various social processes taking place in the modern Russian metropolis.
first_indexed 2024-12-14T17:10:31Z
format Article
id doaj.art-3125bdbaa8b2482aa7a0c171b8ff3805
institution Directory Open Access Journal
issn 2305-7254
2343-0737
language English
last_indexed 2024-12-14T17:10:31Z
publishDate 2020-09-01
publisher FRUCT
record_format Article
series Proceedings of the XXth Conference of Open Innovations Association FRUCT
spelling doaj.art-3125bdbaa8b2482aa7a0c171b8ff3805
2022-12-21T22:53:34Z
eng
FRUCT
Proceedings of the XXth Conference of Open Innovations Association FRUCT
2305-7254
2343-0737
2020-09-01
volume 27, issue 2, pages 288-293
10.5281/zenodo.4026188
Sociolinguistic Variability of Russian Everyday Speech: A Corpus-Based Study
Natalia Bogdanova-Beglarian (Saint-Petersburg State University, Russia)
Olga Blinova (Saint Petersburg State University, Russia)
Tatiana Sherstinova (St. Petersburg State University, Russia)
Ekaterina Baeva (St Petersburg state university, Russia)
Daria Gorbunova (SPBU, Russia)
Tatiana Popova (SPBU, Russia)
https://www.fruct.org/publications/acm27/files/Bog.pdf
spellingShingle Natalia Bogdanova-Beglarian
Olga Blinova
Tatiana Sherstinova
Ekaterina Baeva
Daria Gorbunova
Tatiana Popova
Sociolinguistic Variability of Russian Everyday Speech: A Corpus-Based Study
Proceedings of the XXth Conference of Open Innovations Association FRUCT
nlp
russian
everyday speech
pragmatics
sociolinguistics
speech corpus
title Sociolinguistic Variability of Russian Everyday Speech: A Corpus-Based Study
title_full Sociolinguistic Variability of Russian Everyday Speech: A Corpus-Based Study
title_fullStr Sociolinguistic Variability of Russian Everyday Speech: A Corpus-Based Study
title_full_unstemmed Sociolinguistic Variability of Russian Everyday Speech: A Corpus-Based Study
title_short Sociolinguistic Variability of Russian Everyday Speech: A Corpus-Based Study
title_sort sociolinguistic variability of russian everyday speech a corpus based study
topic nlp
russian
everyday speech
pragmatics
sociolinguistics
speech corpus
url https://www.fruct.org/publications/acm27/files/Bog.pdf