Aspect term extraction based on word embedding
Many websites allow users to share opinions and write reviews about all kinds of goods and services. These opinions are useful not only to other users but also to companies that want to track their reputation and receive timely feedback on their products...
Main Authors: | D. O. Mashkin, E. V. Kotelnikov |
---|---|
Format: | Article |
Language: | English |
Published: | Ivannikov Institute for System Programming of the Russian Academy of Sciences, 2018-10-01 |
Series: | Труды Института системного программирования РАН |
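Subjects: | aspect-based sentiment analysis; aspect term extraction; machine learning; word sequence labeling; word vector representation; word2vec; SemEval 2016 |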
Online Access: | https://ispranproceedings.elpub.ru/jour/article/view/215 |
_version_ | 1819025063352991744 |
---|---|
author | D. O. Mashkin; E. V. Kotelnikov |
author_facet | D. O. Mashkin; E. V. Kotelnikov |
author_sort | D. O. Mashkin |
collection | DOAJ |
description | Many websites allow users to share opinions and write reviews about all kinds of goods and services. These opinions are useful not only to other users but also to companies that want to track their reputation and receive timely feedback on their products and services. The most detailed formulation of the problem in this area is aspect-based sentiment analysis, which determines the user's attitude not only toward the object as a whole but also toward its individual aspects. This paper addresses the aspect term extraction subtask of aspect-based sentiment analysis. A review of research in this area is given. Aspect term extraction is treated as a sequence labeling problem and solved with a conditional random field (CRF) model. The sequence features are built from distributed word representations derived from neural network models for the Russian language and from the parts of speech of the analyzed words. The stages of the aspect term extraction software system are shown. Experiments with the developed system were carried out on the corpus of labeled restaurant reviews created for the International Workshop on Semantic Evaluation (SemEval-2016). We describe how the quality of aspect term extraction depends on the neural network model and on the variant of the feature description. The best result (F1-measure = 69%) is achieved by the version of the system that takes into account the context and the parts of speech. The paper contains a detailed analysis of the errors made by the system and suggests possible ways to correct them. Finally, future research directions are presented. |
first_indexed | 2024-12-21T05:04:43Z |
format | Article |
id | doaj.art-22890a289ac04f2e9d0e6bf5baffbdde |
institution | Directory of Open Access Journals |
issn | 2079-8156 2220-6426 |
language | English |
last_indexed | 2024-12-21T05:04:43Z |
publishDate | 2018-10-01 |
publisher | Ivannikov Institute for System Programming of the Russian Academy of Sciences |
record_format | Article |
series | Труды Института системного программирования РАН |
spelling | doaj.art-22890a289ac04f2e9d0e6bf5baffbdde; 2022-12-21T19:15:09Z; eng; Ivannikov Institute for System Programming of the Russian Academy of Sciences; Труды Института системного программирования РАН; ISSN 2079-8156, 2220-6426; 2018-10-01; vol. 28, no. 6, pp. 223-240; doi:10.15514/ISPRAS-2016-28(6)-16; 215; Aspect term extraction based on word embedding; D. O. Mashkin, E. V. Kotelnikov (both Вятский государственный университет); https://ispranproceedings.elpub.ru/jour/article/view/215; aspect-based sentiment analysis; aspect term extraction; machine learning; word sequence labeling; word vector representation; word2vec; semeval 2016 |
spellingShingle | D. O. Mashkin; E. V. Kotelnikov; Aspect term extraction based on word embedding; Труды Института системного программирования РАН; aspect-based sentiment analysis; aspect term extraction; machine learning; word sequence labeling; word vector representation; word2vec; semeval 2016 |
title | Aspect term extraction based on word embedding |
title_full | Aspect term extraction based on word embedding |
title_fullStr | Aspect term extraction based on word embedding |
title_full_unstemmed | Aspect term extraction based on word embedding |
title_short | Aspect term extraction based on word embedding |
title_sort | aspect term extraction based on word embedding |
topic | aspect-based sentiment analysis; aspect term extraction; machine learning; word sequence labeling; word vector representation; word2vec; semeval 2016 |
url | https://ispranproceedings.elpub.ru/jour/article/view/215 |
work_keys_str_mv | AT domashkin aspecttermextractionbasedonwordembedding AT evkotelnikov aspecttermextractionbasedonwordembedding |
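
The abstract in this record describes aspect term extraction as sequence labeling with a CRF, using features built from word embeddings, parts of speech, and the context of each word. The following is a minimal sketch of how such a feature description could be assembled, not the authors' implementation. The B/I/O label scheme, the function names `bio_labels` and `token_features`, the window size, the two-dimensional toy vectors, and the English example sentence are all assumptions made for illustration; the paper itself works with Russian reviews and real neural embedding models.

```python
from typing import Dict, List, Mapping, Sequence, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag); the tag set is an assumption


def bio_labels(n_tokens: int, term_spans: Sequence[Tuple[int, int]]) -> List[str]:
    """Encode aspect-term spans (start inclusive, end exclusive) as B/I/O labels."""
    labels = ["O"] * n_tokens
    for start, end in term_spans:
        labels[start] = "B"              # first token of an aspect term
        for i in range(start + 1, end):
            labels[i] = "I"              # continuation of the same term
    return labels


def token_features(
    tokens: Sequence[Token],
    embeddings: Mapping[str, Sequence[float]],
    i: int,
    window: int = 1,
) -> Dict[str, float]:
    """Build a feature dict for token i from its neighbours' POS tags and vectors."""
    feats: Dict[str, float] = {"bias": 1.0}
    for offset in range(-window, window + 1):
        j = i + offset
        if j < 0:
            feats[f"{offset}:BOS"] = 1.0   # position before the sentence start
            continue
        if j >= len(tokens):
            feats[f"{offset}:EOS"] = 1.0   # position after the sentence end
            continue
        word, pos = tokens[j]
        feats[f"{offset}:pos={pos}"] = 1.0
        vector = embeddings.get(word.lower())
        if vector is None:
            feats[f"{offset}:oov"] = 1.0   # word has no embedding
        else:
            for k, value in enumerate(vector):
                feats[f"{offset}:emb{k}"] = float(value)
    return feats


# Hypothetical toy data: a real system would load trained word vectors instead.
toy_embeddings = {"pasta": [0.9, 0.1], "great": [0.2, 0.8]}
tokens = [("The", "DET"), ("pasta", "NOUN"), ("was", "VERB"), ("great", "ADJ")]

labels = bio_labels(len(tokens), [(1, 2)])  # "pasta" is the only aspect term
features = [token_features(tokens, toy_embeddings, i) for i in range(len(tokens))]
```

Each sentence then becomes a list of feature dicts paired with a list of labels, which is the input shape that common CRF toolkits accept. Widening `window` adds more context at the cost of more features per token.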