On Enhancement of Text Classification and Analysis of Text Emotions Using Graph Machine Learning and Ensemble Learning Methods on Non-English Datasets
In recent years, machine learning approaches, and graph learning methods in particular, have achieved strong results in natural language processing, especially on text classification tasks. However, many such models have shown limited generalization on datasets in other languages. In...
Main Authors: | Fatemeh Gholami, Zahed Rahmati, Alireza Mofidi, Mostafa Abbaszadeh |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-10-01 |
Series: | Algorithms |
Subjects: | non-English text classification; graph machine learning; ensemble learning method; (Pars) BERT |
Online Access: | https://www.mdpi.com/1999-4893/16/10/470 |
_version_ | 1797575018687758336 |
---|---|
author | Fatemeh Gholami, Zahed Rahmati, Alireza Mofidi, Mostafa Abbaszadeh |
author_facet | Fatemeh Gholami, Zahed Rahmati, Alireza Mofidi, Mostafa Abbaszadeh |
author_sort | Fatemeh Gholami |
collection | DOAJ |
description | In recent years, machine learning approaches, and graph learning methods in particular, have achieved strong results in natural language processing, especially on text classification tasks. However, many such models have shown limited generalization on datasets in other languages. In this research, we investigate graph machine learning methods for text classification on non-English datasets, such as the Persian Digikala dataset, which consists of users’ opinions. More specifically, we investigate different combinations of (Pars) BERT with various graph neural network (GNN) architectures (such as GCN, GAT, and GIN), and we also use ensemble learning methods, to tackle text classification on several well-known non-English datasets. Our analysis and results demonstrate that applying GNN models helps achieve good scores on text classification by better capturing the topological information between textual data. Additionally, our experiments show that models employing language-specific pre-trained models (such as ParsBERT instead of BERT) capture better information about the data, resulting in higher accuracy. |
first_indexed | 2024-03-10T21:30:24Z |
format | Article |
id | doaj.art-75da903f6eb54fad85b81a4c0f2f7620 |
institution | Directory Open Access Journal |
issn | 1999-4893 |
language | English |
last_indexed | 2024-03-10T21:30:24Z |
publishDate | 2023-10-01 |
publisher | MDPI AG |
record_format | Article |
series | Algorithms |
spelling | doaj.art-75da903f6eb54fad85b81a4c0f2f7620; 2023-11-19T15:23:37Z; eng; MDPI AG; Algorithms; 1999-4893; 2023-10-01; 16; 10; 470; 10.3390/a16100470; On Enhancement of Text Classification and Analysis of Text Emotions Using Graph Machine Learning and Ensemble Learning Methods on Non-English Datasets; Fatemeh Gholami, Zahed Rahmati, Alireza Mofidi, Mostafa Abbaszadeh (all authors: Department of Mathematics and Computer Science, Amirkabir University of Technology (Tehran Polytechnic), Tehran 15916-39675, Iran); https://www.mdpi.com/1999-4893/16/10/470; non-English text classification; graph machine learning; ensemble learning method; (Pars) BERT |
spellingShingle | Fatemeh Gholami; Zahed Rahmati; Alireza Mofidi; Mostafa Abbaszadeh; On Enhancement of Text Classification and Analysis of Text Emotions Using Graph Machine Learning and Ensemble Learning Methods on Non-English Datasets; Algorithms; non-English text classification; graph machine learning; ensemble learning method; (Pars) BERT |
title | On Enhancement of Text Classification and Analysis of Text Emotions Using Graph Machine Learning and Ensemble Learning Methods on Non-English Datasets |
title_sort | on enhancement of text classification and analysis of text emotions using graph machine learning and ensemble learning methods on non english datasets |
topic | non-English text classification; graph machine learning; ensemble learning method; (Pars) BERT |
url | https://www.mdpi.com/1999-4893/16/10/470 |