On Enhancement of Text Classification and Analysis of Text Emotions Using Graph Machine Learning and Ensemble Learning Methods on Non-English Datasets

Bibliographic Details
Main Authors: Fatemeh Gholami, Zahed Rahmati, Alireza Mofidi, Mostafa Abbaszadeh
Format: Article
Language: English
Published: MDPI AG 2023-10-01
Series: Algorithms
Subjects: non-English text classification; graph machine learning; ensemble learning method; (Pars) BERT
Online Access: https://www.mdpi.com/1999-4893/16/10/470
_version_ 1797575018687758336
author Fatemeh Gholami
Zahed Rahmati
Alireza Mofidi
Mostafa Abbaszadeh
collection DOAJ
description In recent years, machine learning approaches, in particular graph learning methods, have achieved strong results in natural language processing, especially on text classification tasks. However, many such models have shown limited generalization to datasets in other languages. In this research, we investigate and elaborate graph machine learning methods on non-English datasets (such as the Persian Digikala dataset, which consists of users’ opinions) for the task of text classification. More specifically, we investigate different combinations of (Pars) BERT with various graph neural network (GNN) architectures (such as GCN, GAT, and GIN), as well as ensemble learning methods, to tackle the text classification task on certain well-known non-English datasets. Our analysis and results demonstrate how applying GNN models helps achieve good scores on text classification by better capturing the topological information between textual data. Additionally, our experiments show how models employing language-specific pre-trained models (like ParsBERT instead of BERT) capture better information about the data, resulting in better accuracy.
first_indexed 2024-03-10T21:30:24Z
format Article
id doaj.art-75da903f6eb54fad85b81a4c0f2f7620
institution Directory Open Access Journal
issn 1999-4893
language English
last_indexed 2024-03-10T21:30:24Z
publishDate 2023-10-01
publisher MDPI AG
record_format Article
series Algorithms
spelling doaj.art-75da903f6eb54fad85b81a4c0f2f7620
2023-11-19T15:23:37Z
eng
MDPI AG
Algorithms
1999-4893
2023-10-01
Volume 16, Issue 10, Article 470
10.3390/a16100470
On Enhancement of Text Classification and Analysis of Text Emotions Using Graph Machine Learning and Ensemble Learning Methods on Non-English Datasets
Fatemeh Gholami, Zahed Rahmati, Alireza Mofidi, Mostafa Abbaszadeh
Department of Mathematics and Computer Science, Amirkabir University of Technology (Tehran Polytechnic), Tehran 15916-39675, Iran (all four authors)
https://www.mdpi.com/1999-4893/16/10/470
non-English text classification
graph machine learning
ensemble learning method
(Pars) BERT
title On Enhancement of Text Classification and Analysis of Text Emotions Using Graph Machine Learning and Ensemble Learning Methods on Non-English Datasets
topic non-English text classification
graph machine learning
ensemble learning method
(Pars) BERT
url https://www.mdpi.com/1999-4893/16/10/470