An automatic non-English sentiment lexicon builder using unannotated corpus

Sentiment lexicons in the English language are widely accessible while in many other languages, these resources are extremely deficient. Current techniques and methods for sentiment analysis focus mainly on the English language, whereas other languages are neglected due to lack of resources. In orde...

Bibliographic Details
Main Authors:	Kaity, Mohammed; Balakrishnan, Vimala
Format:	Article
Published:	Springer Verlag 2019
Subjects:	QA75 Electronic computers. Computer science

collection	UM
description	Sentiment lexicons in the English language are widely accessible, while in many other languages these resources are extremely deficient. Current techniques and methods for sentiment analysis focus mainly on the English language, whereas other languages are neglected due to a lack of resources. In order to overcome the challenges faced in building non-English lexicons, we propose a language-independent method that automatically builds non-English sentiment lexicons based on currently available English lexicons and an unannotated corpus from the target language. The proposed method automatically recognizes and extracts new polarity words from the unannotated corpus, based on initial seed lexicons that are developed by translating three reliable English lexicons. The experimental results from the test datasets confirmed that the developed non-English sentiment lexicon could significantly enhance the performance of non-English sentiment classification, compared with other methods and lexicons. The developed lexicon in the Arabic language outperformed other commonly used methods for developing non-English lexicons, with an F-measure of 0.74. The approach adopted in this study was shown to be language-independent and can be implemented in other languages as well. This paper also contributes to understanding the approaches to developing sentiment resources. © 2019, Springer Science+Business Media, LLC, part of Springer Nature.
id	um.eprints-24153
institution	Universiti Malaya
spelling	um.eprints-24153 2020-04-06T15:27:57Z http://eprints.um.edu.my/24153/ Springer Verlag 2019 Article PeerReviewed Kaity, Mohammed and Balakrishnan, Vimala (2019) An automatic non-English sentiment lexicon builder using unannotated corpus. The Journal of Supercomputing, 75 (4). pp. 2243-2268. ISSN 0920-8542, DOI https://doi.org/10.1007/s11227-019-02755-3
