Early and Late Fusion of Emojis and Text to Enhance Opinion Mining

Bibliographic Details
Main Authors: Sadam Al-Azani, El-Sayed M. El-Alfy
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/9525070/
Description
Summary: Opinion mining has gained increasing importance as a way to draw insights from social media content in support of decision making. Despite the explosive growth of efforts on linguistic analysis to detect and track people’s opinions, more specifically when dealing with aspect-level opinion mining, the results are still far from generalizing to real-world applications. Emojis are becoming extremely popular in social media communication as a complementary way to quickly express opinions and ideas visually. Two emoji-related issues are highlighted: ambiguity and misinterpretation of emojis’ sentiment, and the tendency of people to use emojis more in positive cases. This paper investigates to what extent the usage of emojis can contribute to the automated detection of the sentiment polarity of text messages, with a focus on Twitter posts in the Arabic language, a widely spoken language that has complex morphology and limited reliable resources for sentiment analysis. For this purpose, after an extensive review of state-of-the-art emoji-related work, a dataset is composed and several feature extraction methods are applied to both the text and emoji modalities. Moreover, various early and late fusion techniques are proposed to combine both modalities at different levels, including feature, score, decision and hybrid. The experimental results revealed that emoji features can significantly improve the classification results, especially when integrated with text at the score level.
ISSN: 2169-3536
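
Illustration: The summary distinguishes early fusion (at the feature level) from late fusion (at the score or decision level). The sketch below shows the general idea only. It is not the authors' implementation: the function names, the equal weighting, the class labels and the example numbers are all hypothetical, since the record does not give the paper's actual features, classifiers or fusion rules.

```python
from typing import List, Sequence


def early_fusion(text_features: Sequence[float],
                 emoji_features: Sequence[float]) -> List[float]:
    """Feature-level (early) fusion: concatenate both feature vectors
    so that a single classifier can be trained on the joint vector."""
    return list(text_features) + list(emoji_features)


def score_fusion(text_scores: Sequence[float],
                 emoji_scores: Sequence[float],
                 text_weight: float = 0.5) -> List[float]:
    """Score-level (late) fusion: weighted average of the per-class
    scores produced by two separately trained classifiers.
    The default weight of 0.5 is an assumption for illustration."""
    if len(text_scores) != len(emoji_scores):
        raise ValueError("both classifiers must score the same classes")
    w = text_weight
    return [w * t + (1.0 - w) * e for t, e in zip(text_scores, emoji_scores)]


def decide(scores: Sequence[float], labels: Sequence[str]) -> str:
    """Pick the label with the highest fused score."""
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best_index]


# Hypothetical example: the text classifier leans negative, while the
# emoji classifier is confidently positive.
labels = ["negative", "positive"]
joint_vector = early_fusion([0.2, 0.7], [1.0])
fused_scores = score_fusion([0.6, 0.4], [0.1, 0.9])
prediction = decide(fused_scores, labels)
```

In this made-up case the fused scores favour the positive class, because the emoji scores outweigh the weaker negative signal from the text. Decision-level fusion would instead combine the labels each classifier predicts, for example by voting.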