Visualizing the reputation of Malaysian communication service providers through twitter sentiment analysis using naïve bayes / Aidil Amirul Safwan Abdullah Sani

A text classifier model optimized for short snippets like tweets is developed to make bilingual sentiment analysis possible. The two languages explored are Bahasa Malaysia and English, since they are the two most commonly spoken languages in Malaysia. The classifier model is trained and tested on a...


Bibliographic Details
Main Author: Abdullah Sani, Aidil Amirul Safwan
Format: Thesis
Language: English
Published: 2020
Subjects: Social groups. Group dynamics; Twitter; Communication. Mass media
Online Access: https://ir.uitm.edu.my/id/eprint/31488/1/31488.pdf
collection UITM
description A text classifier model optimized for short text snippets such as tweets is developed to make bilingual sentiment analysis possible. The two languages explored are Bahasa Malaysia and English, the two most commonly spoken languages in Malaysia. The classifier is trained and tested on a large multi-domain dataset pre-labelled “0” and “1”, representing “positive” and “negative” respectively. The Naïve Bayes machine learning (ML) technique forms the core of the classifier. All data are pre-processed, and once the classifier has been developed, it is run on real-time data: public tweets that directly or indirectly mention the three biggest communication service providers (CSPs) in Malaysia, namely Celcom, Maxis and Digi, in 2018. The results of the analysis are incorporated into a web application, built with Bootstrap on top of Python’s Flask, that allows interactive data visualization. An Agile methodology is used throughout development to ensure that the project follows the guidelines prepared in the design phase. Functionality testing is also carried out to ensure that no significant error renders the application unusable. In conclusion, the findings show that Naïve Bayes is fairly suitable for natural language processing (NLP) problems. Future work could improve the corpus to include different Bahasa Malaysia slang and commonly used short forms, and add an extra class for texts that are neither “positive” nor “negative”.
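The classification step described above can be illustrated with a minimal sketch. This is not the thesis's implementation: the tokenizer, the Laplace (add-one) smoothing, the class name `NaiveBayes`, and the four toy training tweets are all assumptions made for illustration. Only the label convention (0 for positive, 1 for negative) comes from the description.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and split it into word tokens (assumed, simplistic pre-processing)."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative sketch only)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)          # number of documents per class
        self.word_counts = defaultdict(Counter)      # word frequencies per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)    # log prior
            for word in tokenize(text):
                if word not in self.vocab:
                    continue                         # ignore words never seen in training
                # log likelihood with add-one smoothing
                score += math.log(
                    (self.word_counts[label][word] + 1)
                    / (total_words + len(self.vocab))
                )
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical bilingual toy data; 0 = positive, 1 = negative, as in the description.
texts = [
    "great fast internet",
    "internet laju dan bagus",
    "slow connection again",
    "line teruk sangat lambat",
]
labels = [0, 0, 1, 1]

model = NaiveBayes().fit(texts, labels)
print(model.predict("internet laju"))
print(model.predict("line lambat teruk"))
```

A real corpus would need far more data and more careful pre-processing (for example, handling slang and short forms, which the description names as future work).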
id oai:ir.uitm.edu.my:31488
institution Universiti Teknologi MARA
Citation: Abdullah Sani, Aidil Amirul Safwan (2020) Visualizing the reputation of Malaysian communication service providers through twitter sentiment analysis using naïve bayes. Degree thesis, Universiti Teknologi MARA, Cawangan Melaka. Not peer reviewed. Record: https://ir.uitm.edu.my/id/eprint/31488/ (record dated 2020-06-26T04:18:05Z). Alternate link: http://terminalib.uitm.edu.my/31488.pdf
topic Social groups. Group dynamics
Twitter
Communication. Mass media