Crypto-sentiment Detection in Malay Text Using Language Models with an Attention Mechanism
Background: Growing interest in cryptocurrencies has led to opinions on cryptocurrency-related topics being shared widely in news and on social media. The large volume of frequently published sentiment data makes processing and analysing it more challenging. In addition,...
Main Authors: | Nur Azmina Mohamad Zamani, Norhaslinda Kamaruddin |
---|---|
Format: | Article |
Language: | English |
Published: | Universitas Airlangga, 2023-11-01 |
Series: | Journal of Information Systems Engineering and Business Intelligence |
Online Access: | https://e-journal.unair.ac.id/JISEBI/article/view/45778 |
_version_ | 1797401020955885568 |
---|---|
author | Nur Azmina Mohamad Zamani; Norhaslinda Kamaruddin |
author_facet | Nur Azmina Mohamad Zamani; Norhaslinda Kamaruddin |
author_sort | Nur Azmina Mohamad Zamani |
collection | DOAJ |
description | Background: Growing interest in cryptocurrencies has led to opinions on cryptocurrency-related topics being shared widely in news and on social media. The large volume of frequently published sentiment data makes processing and analysing it more challenging. In addition, existing sentiment models in the cryptocurrency domain focus primarily on English, with minimal work on the Malay language, which further complicates the problem.
Objective: This study examines the performance of a sentiment regression model in forecasting sentiment scores for Malay news and tweets.
Methods: Malay news headlines and tweets on Bitcoin and Ethereum are used as the input. The proposed sentiment regression implementation is a hybrid that combines the Generalized Autoregressive Pretraining for Language Understanding (XLNet) language model with a Bidirectional Gated Recurrent Unit (Bi-GRU) deep learning model. The effectiveness of the proposed sentiment regression model is also investigated with the multi-head self-attention mechanism. A comparative analysis against Bidirectional Encoder Representations from Transformers (BERT) is then carried out.
Results: The experimental results demonstrate that the number of attention heads is vital in improving the performance of the XLNet-GRU sentiment model. There are slight improvements of 0.03 in the adjusted R² values, with an average MAE of 0.163 (Malay news) and 0.174 (Malay tweets). In addition, average RMSE values of 0.267 and 0.255 were obtained for Malay news and tweets, respectively, which shows that the proposed XLNet-GRU sentiment model outperforms the BERT sentiment model with lower prediction errors.
Conclusion: The proposed model contributes to predicting sentiment on cryptocurrency. Moreover, this study introduces two carefully curated Malay corpora, CryptoSentiNews-Malay and CryptoSentiTweets-Malay, which are extracted from news and tweets, respectively. Further work will expand the Malay news and tweet corpora on cryptocurrency-related issues and apply the proposed XLNet Bi-GRU deep learning model for greater financial insight.
Keywords: Cryptocurrency, Deep learning model, Malay text, Sentiment analysis, Sentiment regression model |
first_indexed | 2024-03-09T02:02:54Z |
format | Article |
id | doaj.art-7bb6bade272747e18dfc3d577000e376 |
institution | Directory of Open Access Journals |
issn | 2598-6333 2443-2555 |
language | English |
last_indexed | 2024-03-09T02:02:54Z |
publishDate | 2023-11-01 |
publisher | Universitas Airlangga |
record_format | Article |
series | Journal of Information Systems Engineering and Business Intelligence |
spelling | doaj.art-7bb6bade272747e18dfc3d577000e376; 2023-12-08T02:07:43Z; eng; Universitas Airlangga; Journal of Information Systems Engineering and Business Intelligence; ISSN 2598-6333, 2443-2555; 2023-11-01; vol. 9, no. 2, pp. 147-160; DOI 10.20473/jisebi.9.2.147-160; 43863; Crypto-sentiment Detection in Malay Text Using Language Models with an Attention Mechanism; Nur Azmina Mohamad Zamani (https://orcid.org/0000-0001-7376-5379); Norhaslinda Kamaruddin (https://orcid.org/0000-0003-0827-2417); Affiliations: College of Computing, Informatics and Mathematics, Universiti Teknologi MARA (UiTM), Selangor, Malaysia; Institute for Big Data Analytics and Artificial Intelligence (IBDAAI), Universiti Teknologi MARA (UiTM), Selangor, Malaysia; https://e-journal.unair.ac.id/JISEBI/article/view/45778 |
spellingShingle | Nur Azmina Mohamad Zamani Norhaslinda Kamaruddin Crypto-sentiment Detection in Malay Text Using Language Models with an Attention Mechanism Journal of Information Systems Engineering and Business Intelligence |
title | Crypto-sentiment Detection in Malay Text Using Language Models with an Attention Mechanism |
title_full | Crypto-sentiment Detection in Malay Text Using Language Models with an Attention Mechanism |
title_fullStr | Crypto-sentiment Detection in Malay Text Using Language Models with an Attention Mechanism |
title_full_unstemmed | Crypto-sentiment Detection in Malay Text Using Language Models with an Attention Mechanism |
title_short | Crypto-sentiment Detection in Malay Text Using Language Models with an Attention Mechanism |
title_sort | crypto sentiment detection in malay text using language models with an attention mechanism |
url | https://e-journal.unair.ac.id/JISEBI/article/view/45778 |
work_keys_str_mv | AT nurazminamohamadzamani cryptosentimentdetectioninmalaytextusinglanguagemodelswithanattentionmechanism AT norhaslindakamaruddin cryptosentimentdetectioninmalaytextusinglanguagemodelswithanattentionmechanism |
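
The description field reports MAE, RMSE and adjusted R² for the sentiment regression model. As a reading aid, the minimal Python sketch below shows how these three metrics are conventionally computed from true and predicted sentiment scores. It is not the authors' evaluation code: the function name `regression_metrics`, the sample scores and the `n_predictors` argument (the number of predictors used in the adjusted R² correction) are illustrative assumptions.

```python
import math


def regression_metrics(y_true, y_pred, n_predictors):
    """Return MAE, RMSE, R^2 and adjusted R^2 for paired score lists.

    n_predictors is an assumption of this sketch: the record does not say
    which value the study used in the adjusted R^2 correction.
    """
    n = len(y_true)
    if n != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if n - n_predictors - 1 <= 0:
        raise ValueError("need more samples than predictors + 1")

    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Mean absolute error: average magnitude of the prediction errors.
    mae = sum(abs(e) for e in errors) / n

    # Root mean squared error: penalises large errors more heavily.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # Coefficient of determination relative to a mean-only baseline.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    r2 = 1 - ss_res / ss_tot

    # Adjusted R^2 corrects R^2 for the number of predictors.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)

    return {"mae": mae, "rmse": rmse, "r2": r2, "adj_r2": adj_r2}


# Hypothetical sentiment scores in [-1, 1]; not data from the study.
sample_true = [0.0, 0.5, 1.0, -0.5, -1.0]
sample_pred = [0.1, 0.4, 0.8, -0.5, -0.9]
sample_metrics = regression_metrics(sample_true, sample_pred, n_predictors=1)
print(sample_metrics)
```

Lower MAE and RMSE mean smaller prediction errors, while a higher adjusted R² means the model explains more of the variance in the scores, which is the sense in which the abstract compares the XLNet-GRU and BERT models.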