Emotion Detection on Indonesian Tweets Using CNN and Contextualized Word Embedding
Twitter is one of the popular social media to share information through text with a limit of 280 characters called tweets. Many tweets express the emotions of their users, and many studies have been conducted to detect emotions in tweets. In this paper, we train machine learning models to classify I...
Main Authors: | Heldiansyah, Muhammad Fikri; Winarko, Edi |
---|---|
Format: | Article |
Language: | English |
Published: | Institute of Electrical and Electronics Engineers Inc. 2022 |
Subjects: | Engineering |
Online Access: | https://repository.ugm.ac.id/278888/1/Heldiansyah_TK.pdf |
_version_ | 1826050341113692160 |
---|---|
author | Heldiansyah, Muhammad Fikri; Winarko, Edi |
author_facet | Heldiansyah, Muhammad Fikri; Winarko, Edi |
author_sort | Heldiansyah, Muhammad Fikri |
collection | UGM |
description | Twitter is a popular social media platform for sharing information through short texts, called tweets, limited to 280 characters. Many tweets express the emotions of their users, and many studies have been conducted to detect emotions in tweets. In this paper, we train machine learning models to classify Indonesian tweets into five emotion classes: happy, angry, fear, sadness, and love. The models are based on a Convolutional Neural Network (CNN) combined with ELMo (Embeddings from Language Models), BERT (Bidirectional Encoder Representations from Transformers), or Word2Vec. Based on the experimental results, we conclude that the best CNN model on tweet data without stemming is BERT-CNN, with a macro-averaged F1-score of 72.83%, compared with 65.57% for Word2Vec-CNN and 55.69% for ELMo-CNN. |
first_indexed | 2024-03-14T00:02:26Z |
format | Article |
id | oai:generic.eprints.org:278888 |
institution | Universitas Gadjah Mada |
language | English |
last_indexed | 2024-03-14T00:02:26Z |
publishDate | 2022 |
publisher | Institute of Electrical and Electronics Engineers Inc. |
record_format | dspace |
spelling | oai:generic.eprints.org:2788882023-10-19T00:56:32Z https://repository.ugm.ac.id/278888/ Emotion Detection on Indonesian Tweets Using CNN and Contextualized Word Embedding Heldiansyah, Muhammad Fikri Winarko, Edi Engineering Twitter is one of the popular social media to share information through text with a limit of 280 characters called tweets. Many tweets express the emotions of their users, and many studies have been conducted to detect emotions in tweets. In this paper, we train machine learning models to classify Indonesian tweets into five emotion class labels: happy, angry, fear, sadness, and love. The models are based on Convolutional Neural Network (CNN) combined with ELMo (Embeddings from Language Models), BERT (Bidirectional Encoder Representations from Transformers), or Word2Vec. Based on the experiment result, we conclude that the best CNN model on tweet data without stemming is BERT-CNN, with a macro-averaged F1-score of 72.83%, compared with the ELMo-CNN model, which has a macro-averaged F1-score of 55.69% and Word2Vec-CNN, which has a macro-averaged F1-score of 65.57%. Institute of Electrical and Electronics Engineers Inc. 2022-11-03 Article PeerReviewed application/pdf en https://repository.ugm.ac.id/278888/1/Heldiansyah_TK.pdf Heldiansyah, Muhammad Fikri and Winarko, Edi (2022) Emotion Detection on Indonesian Tweets Using CNN and Contextualized Word Embedding. International Conference on Data and Software Engineering (ICoDSE), 2022 (185186). pp. 53-58. ISSN 979-835039705-5 https://ieeexplore.ieee.org/document/9972229 https://doi.org/10.1109/ICODSE56892.2022.9972229 |
spellingShingle | Engineering Heldiansyah, Muhammad Fikri Winarko, Edi Emotion Detection on Indonesian Tweets Using CNN and Contextualized Word Embedding |
title | Emotion Detection on Indonesian Tweets Using CNN and Contextualized Word Embedding |
topic | Engineering |
url | https://repository.ugm.ac.id/278888/1/Heldiansyah_TK.pdf |
work_keys_str_mv | AT heldiansyahmuhammadfikri emotiondetectiononindonesiantweetsusingcnnandcontextualizedwordembedding AT winarkoedi emotiondetectiononindonesiantweetsusingcnnandcontextualizedwordembedding |
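
The description above compares models by macro-averaged F1-score over five emotion classes. As a minimal sketch of how that metric is computed (the per-class F1 is averaged with equal weight per class, regardless of class size), the snippet below uses plain Python. The function name `macro_f1` and the toy label lists are hypothetical illustrations, not data or code from the paper.

```python
from collections import Counter

# The five emotion classes named in the record's description.
LABELS = ["happy", "angry", "fear", "sadness", "love"]


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy labels for illustration only; not taken from the paper's dataset.
    y_true = ["happy", "happy", "angry", "fear", "sadness", "love"]
    y_pred = ["happy", "angry", "angry", "fear", "love", "love"]
    print(f"macro-F1 = {macro_f1(y_true, y_pred):.2f}")
```

Because every class contributes equally, a class that is never predicted correctly (here, `sadness`) pulls the macro average down even when overall accuracy looks acceptable.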