Crypto-sentiment Detection in Malay Text Using Language Models with an Attention Mechanism

Bibliographic Details
Main Authors: Nur Azmina Mohamad Zamani, Norhaslinda Kamaruddin
Format: Article
Language: English
Published: Universitas Airlangga 2023-11-01
Series: Journal of Information Systems Engineering and Business Intelligence
Online Access: https://e-journal.unair.ac.id/JISEBI/article/view/45778
author Nur Azmina Mohamad Zamani
Norhaslinda Kamaruddin
collection DOAJ
description Background: Growing interest in cryptocurrencies has led to opinions on cryptocurrency-related topics being widely shared in news and on social media. The large volume of sentiment data released at high frequency makes processing and analyzing it challenging. In addition, existing sentiment models in the cryptocurrency domain focus primarily on English, with minimal work on the Malay language, which compounds the problem. Objective: This study examines the performance of a sentiment regression model that predicts sentiment scores for Malay news and tweets. Methods: Malay news headlines and tweets on Bitcoin and Ethereum are used as input. The proposed sentiment regression model is a hybrid that combines the Generalized Autoregressive Pretraining for Language Understanding (XLNet) language model with a Bidirectional Gated Recurrent Unit (Bi-GRU) deep learning model. The effect of a multi-head self-attention mechanism on the proposed model is also investigated, and a comparative analysis against Bidirectional Encoder Representations from Transformers (BERT) is carried out. Results: The experimental results demonstrate that the number of attention heads is vital in improving the performance of the XLNet-GRU sentiment model. The adjusted R² values improve slightly, by 0.03, with an average MAE of 0.163 (Malay news) and 0.174 (Malay tweets). In addition, average RMSEs of 0.267 and 0.255 were obtained for Malay news and tweets, respectively, showing that the proposed XLNet-GRU sentiment model outperforms the BERT sentiment model with lower prediction errors. Conclusion: The proposed model contributes to predicting sentiment on cryptocurrency. This study also introduces two carefully curated Malay corpora, CryptoSentiNews-Malay and CryptoSentiTweets-Malay, extracted from news and tweets, respectively. Future work will expand the Malay news and tweet corpora on cryptocurrency-related issues and apply the proposed XLNet Bi-GRU deep learning model to them for greater financial insight. Keywords: Cryptocurrency, Deep learning model, Malay text, Sentiment analysis, Sentiment regression model
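The description reports MAE, RMSE, and adjusted R² for the sentiment regression model. As a reading aid, the following is a minimal sketch of how these three metrics are conventionally computed. It is not the authors' evaluation code. The function name `regression_metrics`, the parameter `n_predictors` (the predictor count used in the adjusted R² correction), and the sample scores are all hypothetical.

```python
import math


def regression_metrics(y_true, y_pred, n_predictors=1):
    """Return MAE, RMSE, R^2 and adjusted R^2 for two equal-length score lists.

    n_predictors is an illustrative assumption: the record does not say
    which predictor count the study used for the adjusted R^2 correction.
    """
    n = len(y_true)
    if n != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if n - n_predictors - 1 <= 0:
        raise ValueError("need more samples than predictors + 1")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true scores are equal")

    mae = sum(abs(e) for e in errors) / n         # mean absolute error
    rmse = math.sqrt(ss_res / n)                  # root mean squared error
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R^2 penalises R^2 for the number of predictors.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    return {"mae": mae, "rmse": rmse, "r2": r2, "adj_r2": adj_r2}


# Hypothetical sentiment scores in [-1, 1]; not data from the study.
gold = [0.5, -0.2, 0.1, 0.8, -0.6]
predicted = [0.4, -0.1, 0.3, 0.6, -0.5]
metrics = regression_metrics(gold, predicted, n_predictors=1)
print(metrics)
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same predictions, which is consistent with the reported RMSE values (0.267, 0.255) exceeding the reported MAE values (0.163, 0.174).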
first_indexed 2024-03-09T02:02:54Z
format Article
id doaj.art-7bb6bade272747e18dfc3d577000e376
institution Directory Open Access Journal
issn 2598-6333
2443-2555
language English
last_indexed 2024-03-09T02:02:54Z
publishDate 2023-11-01
publisher Universitas Airlangga
record_format Article
series Journal of Information Systems Engineering and Business Intelligence
spelling doaj.art-7bb6bade272747e18dfc3d577000e376
2023-12-08T02:07:43Z
eng
Universitas Airlangga
Journal of Information Systems Engineering and Business Intelligence
2598-6333
2443-2555
2023-11-01
9 2 147 160
10.20473/jisebi.9.2.147-160
43863
Crypto-sentiment Detection in Malay Text Using Language Models with an Attention Mechanism
Nur Azmina Mohamad Zamani 0 https://orcid.org/0000-0001-7376-5379
Norhaslinda Kamaruddin 1 https://orcid.org/0000-0003-0827-2417
College of Computing, Informatics and Mathematics, Universiti Teknologi MARA (UiTM), Selangor, Malaysia
Institute for Big Data Analytics and Artificial Intelligence (IBDAAI), Universiti Teknologi MARA (UiTM), Selangor, Malaysia
https://e-journal.unair.ac.id/JISEBI/article/view/45778
title Crypto-sentiment Detection in Malay Text Using Language Models with an Attention Mechanism
url https://e-journal.unair.ac.id/JISEBI/article/view/45778