Data Augmentation Methods for Enhancing Robustness in Text Classification Tasks
Text classification is widely studied in natural language processing (NLP). Deep learning models, including large pre-trained models like BERT and DistilBERT, have achieved impressive results in text classification tasks. However, these models’ robustness against adversarial attacks remains an area...
Main Authors: | Huidong Tang, Sayaka Kamei, Yasuhiko Morimoto |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2023-01-01 |
Series: | Algorithms |
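Subjects: | artificial intelligence, natural language processing, text classification, data augmentation, robustness improvement |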
Online Access: | https://www.mdpi.com/1999-4893/16/1/59 |
_version_ | 1797447026874515456 |
---|---|
author | Huidong Tang Sayaka Kamei Yasuhiko Morimoto |
author_facet | Huidong Tang Sayaka Kamei Yasuhiko Morimoto |
author_sort | Huidong Tang |
collection | DOAJ |
description | Text classification is widely studied in natural language processing (NLP). Deep learning models, including large pre-trained models like BERT and DistilBERT, have achieved impressive results in text classification tasks. However, these models’ robustness against adversarial attacks remains an area of concern. To address this concern, we propose three data augmentation methods to improve the robustness of such pre-trained models. We evaluated our methods on four text classification datasets by fine-tuning DistilBERT on the augmented datasets and exposing the resulting models to adversarial attacks to evaluate their robustness. In addition to enhancing the robustness, our proposed methods can improve the accuracy and F1-score on three datasets. We also conducted comparison experiments with two existing data augmentation methods. We found that one of our proposed methods demonstrates a similar improvement in terms of performance, but all demonstrate a superior robustness improvement. |
first_indexed | 2024-03-09T13:49:55Z |
format | Article |
id | doaj.art-37238516e2ff434b84d904a3e9a821b3 |
institution | Directory of Open Access Journals |
issn | 1999-4893 |
language | English |
last_indexed | 2024-03-09T13:49:55Z |
publishDate | 2023-01-01 |
publisher | MDPI AG |
record_format | Article |
series | Algorithms |
spelling | doaj.art-37238516e2ff434b84d904a3e9a821b3; 2023-11-30T20:51:52Z; eng; MDPI AG; Algorithms; 1999-4893; 2023-01-01; 16; 1; 59; 10.3390/a16010059; Data Augmentation Methods for Enhancing Robustness in Text Classification Tasks; Huidong Tang (Graduate School of Advanced Science and Engineering, Hiroshima University, Kagamiyama 1-7-1, Higashi-Hiroshima 739-8521, Japan); Sayaka Kamei (Graduate School of Advanced Science and Engineering, Hiroshima University, Kagamiyama 1-7-1, Higashi-Hiroshima 739-8521, Japan); Yasuhiko Morimoto (Graduate School of Advanced Science and Engineering, Hiroshima University, Kagamiyama 1-7-1, Higashi-Hiroshima 739-8521, Japan); Text classification is widely studied in natural language processing (NLP). Deep learning models, including large pre-trained models like BERT and DistilBERT, have achieved impressive results in text classification tasks. However, these models’ robustness against adversarial attacks remains an area of concern. To address this concern, we propose three data augmentation methods to improve the robustness of such pre-trained models. We evaluated our methods on four text classification datasets by fine-tuning DistilBERT on the augmented datasets and exposing the resulting models to adversarial attacks to evaluate their robustness. In addition to enhancing the robustness, our proposed methods can improve the accuracy and F1-score on three datasets. We also conducted comparison experiments with two existing data augmentation methods. We found that one of our proposed methods demonstrates a similar improvement in terms of performance, but all demonstrate a superior robustness improvement.; https://www.mdpi.com/1999-4893/16/1/59; artificial intelligence; natural language processing; text classification; data augmentation; robustness improvement |
spellingShingle | Huidong Tang Sayaka Kamei Yasuhiko Morimoto Data Augmentation Methods for Enhancing Robustness in Text Classification Tasks Algorithms artificial intelligence natural language processing text classification data augmentation robustness improvement |
title | Data Augmentation Methods for Enhancing Robustness in Text Classification Tasks |
title_full | Data Augmentation Methods for Enhancing Robustness in Text Classification Tasks |
title_fullStr | Data Augmentation Methods for Enhancing Robustness in Text Classification Tasks |
title_full_unstemmed | Data Augmentation Methods for Enhancing Robustness in Text Classification Tasks |
title_short | Data Augmentation Methods for Enhancing Robustness in Text Classification Tasks |
title_sort | data augmentation methods for enhancing robustness in text classification tasks |
topic | artificial intelligence natural language processing text classification data augmentation robustness improvement |
url | https://www.mdpi.com/1999-4893/16/1/59 |
work_keys_str_mv | AT huidongtang dataaugmentationmethodsforenhancingrobustnessintextclassificationtasks AT sayakakamei dataaugmentationmethodsforenhancingrobustnessintextclassificationtasks AT yasuhikomorimoto dataaugmentationmethodsforenhancingrobustnessintextclassificationtasks |