Text-Based Emotion Recognition in English and Polish for Therapeutic Chatbot

In this article, we present the results of our experiments on sentiment and emotion recognition for English and Polish texts, aiming to work in the context of a therapeutic chatbot. We created a dedicated dataset by adding samples of neutral texts to an existing English-language emotion-labeled corp...

Bibliographic Details
Main Authors: Artur Zygadło, Marek Kozłowski, Artur Janicki
Format: Article
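Language: English
Published: MDPI AG 2021-10-01
Series: Applied Sciences
Subjects: human-machine interaction; chatbot; sentiment recognition; emotion recognition; Polish language; parallel text corpus
Online Access: https://www.mdpi.com/2076-3417/11/21/10146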
author Artur Zygadło
Marek Kozłowski
Artur Janicki
collection DOAJ
description In this article, we present the results of our experiments on sentiment and emotion recognition for English and Polish texts, aiming to work in the context of a therapeutic chatbot. We created a dedicated dataset by adding samples of neutral texts to an existing English-language emotion-labeled corpus. Next, using neural machine translation, we developed a Polish version of the English database. A bilingual, parallel corpus created in this way, named CORTEX (CORpus of Translated Emotional teXts), labeled with three sentiment polarity classes and nine emotion classes, was used for experiments on classification. We employed various classifiers: Naïve Bayes, Support Vector Machines, fastText, and BERT. The results obtained were satisfactory: we achieved the best scores for the BERT-based models, which yielded accuracy of over 90% for sentiment (3-class) classification and almost 80% for emotion (9-class) classification. We compared the results for both languages and discussed the differences. Both the accuracy and the F1-scores for Polish turned out to be slightly inferior to those for English, with the highest difference visible for BERT.
first_indexed 2024-03-10T06:06:15Z
format Article
id doaj.art-2367ffc32566422687116e31ad3d0359
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-10T06:06:15Z
publishDate 2021-10-01
publisher MDPI AG
record_format Article
series Applied Sciences
title Text-Based Emotion Recognition in English and Polish for Therapeutic Chatbot
topic human-machine interaction
chatbot
sentiment recognition
emotion recognition
Polish language
parallel text corpus
url https://www.mdpi.com/2076-3417/11/21/10146