TEXT ANALYSIS SUBSYSTEM IN A SEARCH ENGINE FOR THE NATIONAL CORPUS OF THE CHUVASH LANGUAGE

This paper discusses the text analysis subsystem of a search engine. At this stage, the subsystem consists of the following components: a text tokenization component; a component that separates the text into sentences; and a component for morphological analysis of sentences. The following specia...

Bibliographic Details
Main Authors: Zheltov, P.V., Zheltov, V.P., Gubanov, A.R.
Format: Article
Language: English
Published: Marina Sokolova Publishings 2016-09-01
Series: Russian Linguistic Bulletin
Subjects: indexing; query; text markup; text corpora; search engine
Online Access: http://rulb.org/wp-content/uploads/wpem/pdf_compilations/3(7)/3(7).pdf#page=62
author Zheltov, P.V.
Zheltov, V.P.
Gubanov, A.R.
collection DOAJ
description This paper discusses the text analysis subsystem of a search engine. At this stage, the subsystem consists of the following components: a text tokenization component; a component that separates the text into sentences; and a component for morphological analysis of sentences. Special data structures, in the form of a set of classes, are described for the results produced by the search engine components. The text tokenization component converts the text into a set of tokens. To define the rules of tokenization, the configuration ...
first_indexed 2024-12-17T07:16:03Z
format Article
id doaj.art-525f64d779834d8282c62ddf7c1e25b0
institution Directory Open Access Journal
issn 2313-0288
2411-2968
language English
last_indexed 2024-12-17T07:16:03Z
publishDate 2016-09-01
publisher Marina Sokolova Publishings
record_format Article
series Russian Linguistic Bulletin
spelling doaj.art-525f64d779834d8282c62ddf7c1e25b0 | 2022-12-21T21:58:54Z | eng | Marina Sokolova Publishings | Russian Linguistic Bulletin | 2313-0288 | 2411-2968 | 2016-09-01 | 2016 | 3 (7) | 61 | 63 | 10.18454/RULB.7.36 | 7.36 | Text Analysis Subsystem in a Search Engine for the National Corpus of the Chuvash Language | Zheltov, P.V. (0) | Zheltov, V.P. (1) | Gubanov, A.R. (2) | Chuvash State University named after I.N. Ulyanov | Chuvash State University named after I.N. Ulyanov | Chuvash State University named after I.N. Ulyanov | This paper discusses the text analysis subsystem of a search engine. At this stage, the subsystem consists of the following components: a text tokenization component; a component that separates the text into sentences; and a component for morphological analysis of sentences. Special data structures, in the form of a set of classes, are described for the results produced by the search engine components. The text tokenization component converts the text into a set of tokens. To define the rules of tokenization, the configuration ... | http://rulb.org/wp-content/uploads/wpem/pdf_compilations/3(7)/3(7).pdf#page=62 | indexing | query | text markup | text corpora | search engine | индексирование | запрос | разметка текста | текстовый корпус | поисковик
title Text Analysis Subsystem in a Search Engine for the National Corpus of the Chuvash Language
topic indexing
query
text markup
text corpora
search engine
индексирование
запрос
разметка текста
текстовый корпус
поисковик
url http://rulb.org/wp-content/uploads/wpem/pdf_compilations/3(7)/3(7).pdf#page=62