TEmoX: Classification of Textual Emotion Using Ensemble of Transformers

Textual emotion classification (TxtEC) refers to the classification of emotion expressed by individuals in textual form. The widespread use of the Internet and numerous Web 2.0 applications has led to rapid growth in textual interactions. However, determining emotion from texts is chall...


Bibliographic Details
Main Authors: Avishek Das, Mohammed Moshiul Hoque, Omar Sharif, M. Ali Akber Dewan, Nazmul Siddique
Format: Article
Language: English
Published: IEEE 2023-01-01
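Series: IEEE Access
Subjects: Natural language processing; text classification; textual emotion classification; Bengali emotion text corpus; ensemble of transformers
Online Access: https://ieeexplore.ieee.org/document/10264097/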
author Avishek Das
Mohammed Moshiul Hoque
Omar Sharif
M. Ali Akber Dewan
Nazmul Siddique
collection DOAJ
description Textual emotion classification (TxtEC) refers to the classification of emotion expressed by individuals in textual form. The widespread use of the Internet and numerous Web 2.0 applications has led to rapid growth in textual interactions. However, determining emotion from texts is challenging due to their unorganized, unstructured, and disordered forms. While research in textual emotion classification has made considerable breakthroughs for high-resource languages, it remains challenging for low-resource languages like Bengali. This work presents a transformer-based ensemble approach (called TEmoX) to categorize Bengali textual data into six integral emotions: joy, anger, disgust, fear, sadness, and surprise. This research investigates 38 classifier models developed using four machine learning techniques (LR, RF, MNB, SVM), three deep learning techniques (CNN, BiLSTM, CNN+BiLSTM), and five transformer-based techniques (m-BERT, XLM-R, Bangla-BERT-1, Bangla-BERT-2, and Indic-DistilBERT), together with two ensemble strategies and three embedding techniques. The developed models are trained, tuned, and tested on three versions of the Bengali emotion text corpus (BEmoC-v1, BEmoC-v2, BEmoC-v3). The experimental outcomes reveal that the weighted ensemble of four transformer models (En-22: Bangla-BERT-2, XLM-R, Indic-DistilBERT, Bangla-BERT-1) outperforms the baseline models and existing methods, achieving the highest weighted $F1$-score (80.24%) on BEmoC-v3. The dataset, models, and portions of the code are available at https://github.com/avishek-018/TEmoX.
first_indexed 2024-03-11T18:36:14Z
format Article
id doaj.art-cd447683baf24827bca6d4cd7c5a3e0b
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-03-11T18:36:14Z
publishDate 2023-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-cd447683baf24827bca6d4cd7c5a3e0b
2023-10-12T23:00:41Z
eng
IEEE
IEEE Access
2169-3536
2023-01-01
11
109803
109818
10.1109/ACCESS.2023.3319455
10264097
TEmoX: Classification of Textual Emotion Using Ensemble of Transformers
Avishek Das (0) https://orcid.org/0000-0002-1589-8322
Mohammed Moshiul Hoque (1) https://orcid.org/0000-0001-8806-708X
Omar Sharif (2)
M. Ali Akber Dewan (3) https://orcid.org/0000-0001-6347-7509
Nazmul Siddique (4) https://orcid.org/0000-0002-0642-2357
(0) Department of Computer Science and Engineering, Chittagong University of Engineering and Technology, Chittagong, Bangladesh
(1) Department of Computer Science and Engineering, Chittagong University of Engineering and Technology, Chittagong, Bangladesh
(2) Department of Computer Science and Engineering, Chittagong University of Engineering and Technology, Chittagong, Bangladesh
(3) School of Computing and Information Systems, Faculty of Science and Technology, Athabasca University, Athabasca, Canada
(4) School of Computing, Engineering and Intelligent Systems, Ulster University, Londonderry, U.K.
https://ieeexplore.ieee.org/document/10264097/
Natural language processing
text classification
textual emotion classification
Bengali emotion text corpus
ensemble of transformers
title TEmoX: Classification of Textual Emotion Using Ensemble of Transformers
topic Natural language processing
text classification
textual emotion classification
Bengali emotion text corpus
ensemble of transformers
url https://ieeexplore.ieee.org/document/10264097/
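
The description mentions a weighted ensemble of transformer models over six emotion classes. The sketch below shows one common way such an ensemble can be formed: weighted soft voting over per-model class probabilities. It is only an illustration under stated assumptions. This record does not say how TEmoX derives its weights or combines outputs, and the function names, model names, weights, and probabilities here are hypothetical.

```python
from typing import Dict, List, Sequence

# Emotion labels as listed in the record's description.
EMOTIONS = ["joy", "anger", "disgust", "fear", "sadness", "surprise"]


def weighted_soft_vote(
    probs_by_model: Dict[str, Sequence[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Combine per-model class probabilities with normalized weights.

    Assumption: each model outputs one probability per emotion class.
    """
    total = sum(weights[name] for name in probs_by_model)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    combined = [0.0] * len(EMOTIONS)
    for name, probs in probs_by_model.items():
        if len(probs) != len(EMOTIONS):
            raise ValueError(f"{name}: expected {len(EMOTIONS)} probabilities")
        w = weights[name] / total  # normalize so the weights sum to 1
        for i, p in enumerate(probs):
            combined[i] += w * p
    return combined


def predict_emotion(
    probs_by_model: Dict[str, Sequence[float]],
    weights: Dict[str, float],
) -> str:
    """Return the label with the highest combined probability."""
    combined = weighted_soft_vote(probs_by_model, weights)
    best = max(range(len(combined)), key=combined.__getitem__)
    return EMOTIONS[best]


# Hypothetical outputs from two placeholder models (not real results).
example_probs = {
    "model_a": [0.6, 0.1, 0.1, 0.1, 0.05, 0.05],
    "model_b": [0.2, 0.5, 0.1, 0.1, 0.05, 0.05],
}
example_weights = {"model_a": 0.75, "model_b": 0.25}

if __name__ == "__main__":
    print(predict_emotion(example_probs, example_weights))
```

In this sketch the weights are fixed by hand. In practice they are often set from each model's validation score, but that choice is an assumption here rather than something stated in the record.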