On the Statistical and Temporal Dynamics of Sentiment Analysis
Despite the broad interest and use of sentiment analysis nowadays, most of the conclusions in current literature are driven by simple statistical representations of sentiment scores. On that basis, the generated sentiment evaluation consists nowadays of encoding and aggregating emotional information...
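Main Authors: | Margarita Rodriguez-Ibanez, Francisco-Javier Gimeno-Blanes, Pedro Manuel Cuenca-Jimenez, Sergio Munoz-Romero, Cristina Soguero, Jose Luis Rojo-Alvarez |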
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2020-01-01 |
Series: | IEEE Access |
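Subjects: | Sentiment analysis; machine learning techniques; sentiment dictionaries; social networking; public opinions |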
Online Access: | https://ieeexplore.ieee.org/document/9063439/ |
_version_ | 1818664046479540224 |
---|---|
author | Margarita Rodriguez-Ibanez; Francisco-Javier Gimeno-Blanes; Pedro Manuel Cuenca-Jimenez; Sergio Munoz-Romero; Cristina Soguero; Jose Luis Rojo-Alvarez |
collection | DOAJ |
description | Despite the broad interest in and use of sentiment analysis, most conclusions in the current literature are driven by simple statistical representations of sentiment scores. On that basis, sentiment evaluation currently consists of encoding and aggregating emotional information from a number of individuals and their populational trends. We hypothesized that the stochastic processes that sentiment analysis systems aim to measure will exhibit nontrivial statistical and temporal properties. We established an experimental setup that analyzed the short text messages (tweets) of six user groups of different nature (universities, politics, musicians, communication media, technological companies, and financial companies), each group including ten high-intensity users in their regular generation of traffic on social networks. Statistical descriptors were checked to converge at about 2000 messages for each user, for which messages from the last two weeks were compiled using a custom-made tool. The messages were subsequently processed for sentiment scoring with different lexicons that are currently available and widely used. We scrutinized not only the temporal dynamics of the resulting score time series per user, but also its statistical description as given by the score histogram, the temporal autocorrelation, the entropy, and the mutual information. Our results showed that the actual dynamic range of lexicons is in general moderate, and hence not much resolution is given within their end-of-scales. We found that seasonal patterns were more present in the time evolution of the number of tweets, but to a much lesser extent in the sentiment intensity. Additionally, we found that the presence of retweets added negligible effects over standard statistical modes, while it hindered informational and temporal patterns. The innovative Compounded Aggregated Positivity Index developed in this work proved to be characteristic for industries and at the same time an interesting way to identify singularities among peers. We conclude that temporal properties of messages provide information about the sentiment dynamics, which differs across lexicons and users, but commonalities can be exploited in this field using appropriate temporal digital processing tools. |
first_indexed | 2024-12-17T05:26:31Z |
format | Article |
id | doaj.art-a5cfa69b754f4d648ced8ddcc0c58aef |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-17T05:26:31Z |
publishDate | 2020-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-a5cfa69b754f4d648ced8ddcc0c58aef; 2022-12-21T22:01:51Z; eng; IEEE; IEEE Access; 2169-3536; 2020-01-01; 8; 87994; 88013; 10.1109/ACCESS.2020.2987207; 9063439; On the Statistical and Temporal Dynamics of Sentiment Analysis; Margarita Rodriguez-Ibanez (0, https://orcid.org/0000-0002-6310-7126); Francisco-Javier Gimeno-Blanes (1); Pedro Manuel Cuenca-Jimenez (2, https://orcid.org/0000-0002-0198-0659); Sergio Munoz-Romero (3); Cristina Soguero (4); Jose Luis Rojo-Alvarez (5, https://orcid.org/0000-0003-0426-8912); Department of Business, Universidad Rey Juan Carlos, Madrid, Spain; Department of Communication Engineering, Universidad Miguel Hernández, Elche, Spain; Department of Signal Theory and Communications, Telematics, and Computing Systems, Universidad Rey Juan Carlos, Madrid, Spain; Department of Signal Theory and Communications, Telematics, and Computing Systems, Universidad Rey Juan Carlos, Madrid, Spain; Department of Signal Theory and Communications, Telematics, and Computing Systems, Universidad Rey Juan Carlos, Madrid, Spain; Department of Signal Theory and Communications, Telematics, and Computing Systems, Universidad Rey Juan Carlos, Madrid, Spain; https://ieeexplore.ieee.org/document/9063439/; Sentiment analysis; machine learning techniques; sentiment dictionaries; social networking; public opinions; Twitter |
title | On the Statistical and Temporal Dynamics of Sentiment Analysis |
topic | Sentiment analysis; machine learning techniques; sentiment dictionaries; social networking; public opinions |
url | https://ieeexplore.ieee.org/document/9063439/ |
work_keys_str_mv | AT margaritarodriguezibanez onthestatisticalandtemporaldynamicsofsentimentanalysis AT franciscojaviergimenoblanes onthestatisticalandtemporaldynamicsofsentimentanalysis AT pedromanuelcuencajimenez onthestatisticalandtemporaldynamicsofsentimentanalysis AT sergiomunozromero onthestatisticalandtemporaldynamicsofsentimentanalysis AT cristinasoguero onthestatisticalandtemporaldynamicsofsentimentanalysis AT joseluisrojoalvarez onthestatisticalandtemporaldynamicsofsentimentanalysis |
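
The abstract names four descriptors of a per-user sentiment score series: the score histogram, the temporal autocorrelation, the entropy, and the mutual information. The following is a minimal sketch of how such descriptors could be computed, not the authors' implementation. The function names, the choice of 16 histogram bins, the lag range, and the synthetic uniform scores in [-1, 1] are all assumptions made for illustration only.

```python
import numpy as np


def autocorrelation(x, max_lag):
    """Biased sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )


def shannon_entropy(x, bins):
    """Shannon entropy (in bits) of the histogram of x."""
    counts, _ = np.histogram(x, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())


def mutual_information(x, y, bins):
    """Mutual information (in bits) estimated from a 2-D histogram."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)  # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)  # marginal of y, shape (1, bins)
    mask = pxy > 0
    return float((pxy[mask] * np.log2(pxy[mask] / (px @ py)[mask])).sum())


# Synthetic stand-in for one user's score series (hypothetical data,
# sized to match the roughly 2000 messages per user mentioned above).
rng = np.random.default_rng(0)
scores = rng.uniform(-1.0, 1.0, size=2000)

acf = autocorrelation(scores, max_lag=5)
entropy_bits = shannon_entropy(scores, bins=16)
# Mutual information between the series and its one-step-delayed copy.
mi_lag1 = mutual_information(scores[:-1], scores[1:], bins=16)
```

Histogram-based entropy and mutual information estimates depend on the bin count, so any comparison across users or lexicons would need the same binning throughout.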