Aspect term extraction based on word embedding


Bibliographic Details
Main Authors: D. O. Mashkin, E. V. Kotelnikov
Format: Article
Language: English
Published: Ivannikov Institute for System Programming of the Russian Academy of Sciences 2018-10-01
Series: Труды Института системного программирования РАН (Proceedings of the Institute for System Programming of the RAS)
Online Access: https://ispranproceedings.elpub.ru/jour/article/view/215
_version_ 1819025063352991744
author D. O. Mashkin
E. V. Kotelnikov
collection DOAJ
description Many websites let users share opinions and write reviews about all kinds of goods and services. These opinions are useful not only to other users but also to companies that want to track their reputation and receive timely feedback on their products and services. The most detailed formulation of the problem in this area is aspect-based sentiment analysis, which determines the user's attitude not only toward the object as a whole but also toward its individual aspects. This paper addresses the subtask of aspect term extraction in aspect-based sentiment analysis. A review of research in this area is given. Aspect term extraction is treated as a sequence labeling problem and solved with a conditional random field (CRF) model. The feature description of each sequence is built from distributed word representations produced by neural network models for the Russian language and from the parts of speech of the analyzed words. The stages of the aspect term extraction software system are shown. Experiments with the developed system were carried out on the corpus of labeled restaurant reviews created for the International Workshop on Semantic Evaluation (SemEval-2016). We describe how the quality of aspect term extraction depends on the neural network model and on the variant of the feature description. The best result (F1-measure = 69%) is achieved by the version of the system that takes into account both the context and the parts of speech. The paper contains a detailed analysis of the errors made by the system, together with suggestions for correcting them. Finally, future research directions are presented.
first_indexed 2024-12-21T05:04:43Z
format Article
id doaj.art-22890a289ac04f2e9d0e6bf5baffbdde
institution Directory of Open Access Journals
issn 2079-8156
2220-6426
language English
last_indexed 2024-12-21T05:04:43Z
publishDate 2018-10-01
publisher Ivannikov Institute for System Programming of the Russian Academy of Sciences
record_format Article
series Труды Института системного программирования РАН (Proceedings of the Institute for System Programming of the RAS)
spelling doaj.art-22890a289ac04f2e9d0e6bf5baffbdde
2022-12-21T19:15:09Z
eng
Ivannikov Institute for System Programming of the Russian Academy of Sciences
Труды Института системного программирования РАН (Proceedings of the Institute for System Programming of the RAS)
ISSN 2079-8156, 2220-6426
2018-10-01
Volume 28, issue 6, pages 223-240
DOI 10.15514/ISPRAS-2016-28(6)-16
Article 215
Aspect term extraction based on word embedding
D. O. Mashkin (Вятский государственный университет, Vyatka State University)
E. V. Kotelnikov (Вятский государственный университет, Vyatka State University)
https://ispranproceedings.elpub.ru/jour/article/view/215
Keywords: aspect-based sentiment analysis; aspect term extraction; machine learning; word sequence labeling; word vector representation; word2vec; SemEval-2016
title Aspect term extraction based on word embedding
topic aspect-based sentiment analysis
aspect term extraction
machine learning
word sequence labeling
word vector representation
word2vec
SemEval-2016
url https://ispranproceedings.elpub.ru/jour/article/view/215
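
The description field above treats aspect term extraction as sequence labeling, with features built from word embeddings, parts of speech, and the surrounding context. The following is a minimal sketch of how such per-token feature dictionaries might be assembled before being passed to a CRF. It is not the authors' implementation: the function names, the toy two-dimensional vectors, the English example sentence, the tag set, and the window size are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def token_features(
    tokens: Sequence[str],
    pos_tags: Sequence[str],
    embeddings: Dict[str, List[float]],
    index: int,
    window: int = 1,
) -> Dict[str, float]:
    """Build a feature dict for one token from the embedding components and
    part of speech of the token and of its neighbours within `window`."""
    features: Dict[str, float] = {"bias": 1.0}
    for offset in range(-window, window + 1):
        position = index + offset
        if position < 0 or position >= len(tokens):
            # Outside the sentence: emit a padding marker, not a fake vector.
            features[f"{offset}:pad"] = 1.0
            continue
        features[f"{offset}:pos={pos_tags[position]}"] = 1.0
        vector = embeddings.get(tokens[position].lower())
        if vector is None:
            # Word missing from the embedding vocabulary.
            features[f"{offset}:oov"] = 1.0
            continue
        for dim, value in enumerate(vector):
            features[f"{offset}:emb{dim}"] = value
    return features


def sentence_features(
    tokens: Sequence[str],
    pos_tags: Sequence[str],
    embeddings: Dict[str, List[float]],
    window: int = 1,
) -> List[Dict[str, float]]:
    """Feature dicts for every token of one sentence."""
    return [
        token_features(tokens, pos_tags, embeddings, i, window)
        for i in range(len(tokens))
    ]


# Hypothetical toy data; real vectors would come from a pretrained model.
TOY_EMBEDDINGS = {
    "the": [0.1, 0.0],
    "pasta": [0.7, 0.3],
    "was": [0.0, 0.2],
    "great": [0.4, 0.9],
}
tokens = ["The", "pasta", "was", "great"]
pos_tags = ["DET", "NOUN", "VERB", "ADJ"]
labels = ["O", "B-ASPECT", "O", "O"]  # BIO tags a CRF would learn to predict

X = sentence_features(tokens, pos_tags, TOY_EMBEDDINGS, window=1)
```

A list of real-valued feature dictionaries per sentence, paired with a list of tags, is the kind of input that CRF toolkits such as sklearn-crfsuite typically accept. Setting `window=0` drops the context features, which gives one way to compare variants with and without context.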