DOES GOOGLE TRANSLATE AFFECT LEXICON-BASED SENTIMENT ANALYSIS OF MALAY SOCIAL MEDIA TEXT?

Many sentiment resources exist for English, but resources are limited for a resource-poor language such as Malay. One approach to improving sentiment analysis is to translate the focus-language text into a resource-rich language such as English using Machine Translation...

Bibliographic Details
Main Authors: Vanessa Enjop, Rosanita Adnan, Nursuriati Jamil, Sanizah Ahmad, Zarina Zainol, Siti Arpah Ahmad
Format: Article
Language: English
Published: UiTM Press 2022-10-01
Series: Malaysian Journal of Computing
Subjects:
Online Access: https://mjoc.uitm.edu.my/main/images/journal/vol7-2-2022/13_Adnan_et_al.pdf
collection DOAJ
description Many sentiment resources exist for English, but resources are limited for a resource-poor language such as Malay. One approach to improving sentiment analysis is to translate the focus-language text into a resource-rich language such as English using Machine Translation (MT). However, when text is translated from one language into another, sentiment is preserved to varying degrees. The objective of this paper is to assess the effect of MT by Google Translate on sentiment analysis of Malay social media text from the Facebook pages of a caregiver of a person with autism. A total of 3,525 Facebook comments in the Malay language were gathered from May to October 2020. The comments were manually translated into English to create dataset_manual. Google Translate was used to automatically translate the Malay comments into English, creating dataset_auto. The sentiment polarity of each comment was labeled to form a ground truth dataset. A lexicon-based approach was used to extract sentiment from both dataset_manual and dataset_auto to determine the sentiment polarity. The results report a figure of 65.9% for dataset_auto and indicate that machine translation significantly reduces sentiment analysis performance. Sentiment expressions are often mistranslated into neutral expressions. Meanwhile, sentiment analysis using dataset_manual was still able to capture the sentiment of the Facebook comments without taking them out of context, with 92.5% showing positive sentiment towards comments related to autism spectrum disorder.
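The comparison described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the lexicon, the example comments, and the helper names `polarity` and `accuracy` are all hypothetical, chosen only to show how a lexicon-based scorer can be run on a manual and an automatic translation and checked against ground-truth labels.

```python
import re

# Toy lexicon invented for illustration; NOT the lexicon used in the paper.
TOY_LEXICON = {
    "good": 1, "great": 1, "strong": 1, "love": 1,
    "bad": -1, "sad": -1, "hate": -1,
}

def polarity(text, lexicon=TOY_LEXICON):
    """Sum the scores of known words and map the total to a polarity label."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(lexicon.get(tok, 0) for tok in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def accuracy(texts, gold_labels, lexicon=TOY_LEXICON):
    """Fraction of texts whose lexicon polarity matches the ground-truth label."""
    hits = sum(polarity(t, lexicon) == g for t, g in zip(texts, gold_labels))
    return hits / len(gold_labels)

# Hypothetical English renderings of the same two comments.
dataset_manual = ["You are a great and strong mother", "So sad to hear this"]
dataset_auto = ["You are a mother who is steadfast", "So sad to hear this"]
ground_truth = ["positive", "negative"]

print("manual:", accuracy(dataset_manual, ground_truth))
print("auto:  ", accuracy(dataset_auto, ground_truth))
```

In this toy case the automatic rendering of the first comment contains no word from the lexicon, so it is scored as neutral, which mirrors the kind of sentiment loss the abstract describes.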
first_indexed 2024-03-09T15:33:44Z
format Article
id doaj.art-0c5d59f556d64b1489e28d82bf4298fe
institution Directory Open Access Journal
issn 2600-8238
language English
last_indexed 2024-03-09T15:33:44Z
publishDate 2022-10-01
publisher UiTM Press
record_format Article
series Malaysian Journal of Computing
timestamp 2023-11-26T10:27:23Z
volume 7
issue 2
pages 1236-1249
doi 10.24191/mjoc.v7i2.19486
affiliation National Autism Resource Centre, Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, 40450 Shah Alam Selangor, Malaysia (all six authors)
topic facebook
google translate
lexicon-based
machine translation
sentiment analysis