A hybrid approach for real-time sentiment analysis & visualization of tweets in Singlish

Bibliographic Details
Main Author: Lim, Michelle Shi Hui
Other Authors: Erik Cambria
Format: Final Year Project (FYP)
Language: English
Published: 2018
Online Access: http://hdl.handle.net/10356/76158
Description
Summary: In today's Social Age, the number of people turning to social media sites to voice their opinions is increasing at a staggering rate. The large volume of sentiment-filled text on Twitter makes it a valuable platform for understanding the public’s attitudes on particular topics. Such information can generate insights in business, social and political contexts and support informed decisions. This project therefore aims to create a real-time, interactive dashboard that gives users easy-to-read charts reflecting the public’s sentiments on any topic they are interested in, without requiring prior knowledge of sentiment analysis. This is achieved with a web stack comprising the Twitter Streaming API, a MongoDB database, the Flask framework, the DC.js charting tool, and a backend sentiment analysis module that consists of NLP cleaning techniques and the SenticEmoRNTN hybrid sentiment analysis model. This paper also proposes SenticEmoRNTN, a hybrid sentiment analysis model that combines three sentiment models: the knowledge-based SenticNet, the deep-learning-based RNTN, and a lexicon-based Emoticon Sentiment Analyzer. In this model, textual sentiment analysis is performed by the SenticNet framework by default, and the RNTN model serves as a backup when no concepts are matched. Emoticon sentiment analysis is conducted using lexicons and then integrated with the textual sentiment values to obtain an overall value. We find that this achieves a significant 9.2% improvement in F-measure over the original SenticNet framework. Proposed improvements include training POS taggers, dependency parsers and RNTN models using Singlish word embeddings.
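
The control flow described in the summary (concept matching by default, a fallback model when no concepts match, then integration with an emoticon score) can be sketched roughly as follows. This is an illustrative sketch only, not the project's code: the tiny lexicons, their polarity values, the whitespace tokenisation, the neutral `fallback_score` stand-in for the RNTN, and the weighted-average integration controlled by `emoticon_weight` are all assumptions, since the summary does not specify how the values are combined.

```python
from typing import Callable, List, Optional

# Hypothetical toy lexicons with made-up polarity values in [-1, 1].
# The real SenticNet and emoticon lexicons are far larger.
CONCEPT_POLARITY = {"love": 0.8, "shiok": 0.9, "hate": -0.8, "sian": -0.6}
EMOTICON_POLARITY = {":)": 0.5, ":D": 0.8, ":(": -0.5, ":'(": -0.8}


def concept_score(tokens: List[str]) -> Optional[float]:
    """Mean polarity of matched concepts, or None when nothing matches."""
    hits = [CONCEPT_POLARITY[t.lower()] for t in tokens
            if t.lower() in CONCEPT_POLARITY]
    return sum(hits) / len(hits) if hits else None


def emoticon_score(tokens: List[str]) -> Optional[float]:
    """Mean polarity of matched emoticons, or None when there are none."""
    hits = [EMOTICON_POLARITY[t] for t in tokens if t in EMOTICON_POLARITY]
    return sum(hits) / len(hits) if hits else None


def fallback_score(tokens: List[str]) -> float:
    """Stand-in for the RNTN back-up model; always returns neutral here."""
    return 0.0


def hybrid_score(
    text: str,
    fallback: Callable[[List[str]], float] = fallback_score,
    emoticon_weight: float = 0.5,  # assumed weighting, not from the source
) -> float:
    """Combine textual and emoticon sentiment into one overall value."""
    tokens = text.split()  # simplification: whitespace tokenisation only

    # Textual sentiment: concept lookup first, fallback model otherwise.
    textual = concept_score(tokens)
    if textual is None:
        textual = fallback(tokens)

    # Emoticon sentiment: blended in only when an emoticon is present.
    emo = emoticon_score(tokens)
    if emo is None:
        return textual
    return (1 - emoticon_weight) * textual + emoticon_weight * emo
```

Under these assumptions, a tweet containing a matched concept and an emoticon gets a blend of the two scores, while a tweet with no matched concepts is scored by whatever model is passed as `fallback`. Keeping the fallback as a parameter makes it easy to swap the neutral stub for a trained model without changing the surrounding logic.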