Emotion Detection on Indonesian Tweets Using CNN and Contextualized Word Embedding

Twitter is a popular social media platform for sharing information through short texts, called tweets, limited to 280 characters. Many tweets express their users' emotions, and many studies have been conducted to detect emotions in tweets. In this paper, we train machine learning models to classify I...


Bibliographic Details
Main Authors: Heldiansyah, Muhammad Fikri, Winarko, Edi
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers Inc. 2022
Subjects:
Online Access: https://repository.ugm.ac.id/278888/1/Heldiansyah_TK.pdf
author Heldiansyah, Muhammad Fikri
Winarko, Edi
collection UGM
description Twitter is a popular social media platform for sharing information through short texts, called tweets, limited to 280 characters. Many tweets express their users' emotions, and many studies have been conducted to detect emotions in tweets. In this paper, we train machine learning models to classify Indonesian tweets into five emotion class labels: happy, angry, fear, sadness, and love. The models are based on a Convolutional Neural Network (CNN) combined with ELMo (Embeddings from Language Models), BERT (Bidirectional Encoder Representations from Transformers), or Word2Vec. Based on the experimental results, we conclude that the best CNN model on tweet data without stemming is BERT-CNN, with a macro-averaged F1-score of 72.83%, compared with 55.69% for the ELMo-CNN model and 65.57% for the Word2Vec-CNN model.
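The description reports macro-averaged F1-scores over five emotion classes. As a minimal sketch of how such a score is computed (the function name macro_f1 and the toy labels below are assumptions made for illustration; they are not the authors' code or data), the per-class F1 values are averaged with equal weight, so a rare class counts as much as a frequent one:

```python
from collections import Counter

# The five emotion labels named in the description.
LABELS = ["happy", "angry", "fear", "sadness", "love"]


def macro_f1(y_true, y_pred, labels=LABELS):
    """Return the unweighted mean of the per-class F1 scores."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Toy data invented for illustration only (not taken from the paper).
y_true = ["happy", "happy", "angry", "fear", "sadness", "love"]
y_pred = ["happy", "love", "angry", "fear", "sadness", "love"]
score = macro_f1(y_true, y_pred)
print(f"macro-averaged F1: {score:.4f}")
```

In this toy case a single "happy" tweet misread as "love" lowers the F1 of both classes, which is why the macro average is sensitive to confusion between any pair of labels.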
first_indexed 2024-03-14T00:02:26Z
format Article
id oai:generic.eprints.org:278888
institution Universitas Gadjah Mada
language English
last_indexed 2024-03-14T00:02:26Z
publishDate 2022
publisher Institute of Electrical and Electronics Engineers Inc.
record_format dspace
spelling oai:generic.eprints.org:278888 2023-10-19T00:56:32Z https://repository.ugm.ac.id/278888/ Emotion Detection on Indonesian Tweets Using CNN and Contextualized Word Embedding Heldiansyah, Muhammad Fikri Winarko, Edi Engineering Twitter is a popular social media platform for sharing information through short texts, called tweets, limited to 280 characters. Many tweets express their users' emotions, and many studies have been conducted to detect emotions in tweets. In this paper, we train machine learning models to classify Indonesian tweets into five emotion class labels: happy, angry, fear, sadness, and love. The models are based on a Convolutional Neural Network (CNN) combined with ELMo (Embeddings from Language Models), BERT (Bidirectional Encoder Representations from Transformers), or Word2Vec. Based on the experimental results, we conclude that the best CNN model on tweet data without stemming is BERT-CNN, with a macro-averaged F1-score of 72.83%, compared with 55.69% for the ELMo-CNN model and 65.57% for the Word2Vec-CNN model. Institute of Electrical and Electronics Engineers Inc. 2022-11-03 Article PeerReviewed application/pdf en https://repository.ugm.ac.id/278888/1/Heldiansyah_TK.pdf Heldiansyah, Muhammad Fikri and Winarko, Edi (2022) Emotion Detection on Indonesian Tweets Using CNN and Contextualized Word Embedding. International Conference on Data and Software Engineering (ICoDSE), 2022 (185186). pp. 53-58. ISSN 979-835039705-5 https://ieeexplore.ieee.org/document/9972229 https://doi.org/10.1109/ICODSE56892.2022.9972229
title Emotion Detection on Indonesian Tweets Using CNN and Contextualized Word Embedding
topic Engineering
url https://repository.ugm.ac.id/278888/1/Heldiansyah_TK.pdf