Application of the text wave model to the sentiment analysis problem

Bibliographic Details
Main Authors: Anastasia S. Gruzdeva, Rodion N. Iurev, Igor A. Bessmertny
Format: Article
Language: English
Published: Saint Petersburg National Research University of Information Technologies, Mechanics and Optics (ITMO University) 2022-12-01
Series: Naučno-tehničeskij Vestnik Informacionnyh Tehnologij, Mehaniki i Optiki
Online Access: https://ntv.ifmo.ru/file/article/21662.pdf
Description
Summary: The authors investigate the wave model of text representation, one of the implementations of distributional semantics. The model takes into account not only how often words occur in a text but also their positions relative to one another. The aim of the study is to improve the accuracy of sentiment analysis of short texts using the wave model. The relationship between a text and a term is determined by calculating the probability amplitude of their proximity within the wave model; the term with the highest probability amplitude is considered to correspond most closely to the meaning of the text. The wave model also makes it possible to account for the fact that well-known methods treat antonyms as semantically close lexical units. The technique was evaluated experimentally on a sentiment analysis task, namely assigning user product reviews to the classes “positive” and “negative”. The resulting sentiment classification accuracy reached up to 76.4 %, which exceeds the accuracy of the classical approach as well as that of well-known sentiment analysis methods for the Russian language. The authors also found that classification accuracy is significantly affected by model parameters such as the choice of the base distributional semantic model, the choice of the control point for calculating wave numbers, and whether the influence of antonyms is taken into account. The presented model shows high accuracy in identifying relationships between a text and concepts that are not explicitly present in it, and it can serve as a mathematical basis for solving sentiment analysis problems. The results further indicate that the wave model could be applied in other areas that require classifying texts by indirect features, for example, determining elements of an author's psychological portrait.
ISSN: 2226-1494, 2500-0373
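
The summary describes scoring a text against candidate terms by a probability amplitude and picking the term with the highest amplitude. The record does not give the formulas, so the following is only a toy sketch under stated assumptions, not the authors' method: each known word contributes a complex amplitude whose magnitude is its cosine similarity to the term and whose phase grows with its distance from a control point. The vectors in `TOY_VECTORS`, the function names, and the `wave_number` and `control_point` parameters are all hypothetical, and antonym handling is not modeled.

```python
import cmath
import math

# Hypothetical 2-D word vectors standing in for a real distributional model.
TOY_VECTORS = {
    "good": (0.9, 0.1),
    "great": (0.8, 0.2),
    "bad": (0.1, 0.9),
    "awful": (0.2, 0.8),
    "phone": (0.5, 0.5),
    "positive": (1.0, 0.0),
    "negative": (0.0, 1.0),
}


def cosine(u, v):
    """Cosine similarity of two 2-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def amplitude(tokens, term, wave_number=0.1, control_point=0):
    """Sum per-word complex contributions (assumed form, not from the paper).

    Magnitude: similarity of the word to the term.
    Phase: wave_number times the word's offset from the control point.
    Words missing from TOY_VECTORS are skipped.
    """
    total = 0j
    for position, token in enumerate(tokens):
        vec = TOY_VECTORS.get(token)
        if vec is None:
            continue
        weight = cosine(vec, TOY_VECTORS[term])
        phase = wave_number * (position - control_point)
        total += weight * cmath.exp(1j * phase)
    return total


def classify(tokens, terms=("positive", "negative")):
    """Return the term whose amplitude has the largest magnitude."""
    return max(terms, key=lambda term: abs(amplitude(tokens, term)))


print(classify(["great", "phone", "good"]))
print(classify(["awful", "bad", "phone"]))
```

Because the phase depends on word position, the same words in a different order or at different distances can reinforce or partly cancel one another, which is the sense in which such a model is sensitive to word location and not only to word frequency.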