Between-Class Adversarial Training for Improving Adversarial Robustness of Image Classification

Deep neural networks (DNNs) have been known to be vulnerable to adversarial attacks. Adversarial training (AT) is, so far, the only method that can guarantee the robustness of DNNs to adversarial attacks. However, the robustness generalization accuracy gain of AT is still far lower than the standard...


Bibliographic Details
Main Authors:	Desheng Wang, Weidong Jin, Yunpu Wu
Format:	Article
Language:	English
Published:	MDPI AG 2023-03-01
Series:	Sensors
Subjects:	adversarial training; between-class learning; robustness; regularization
Online Access:	https://www.mdpi.com/1424-8220/23/6/3252

author	Desheng Wang, Weidong Jin, Yunpu Wu
collection	DOAJ
description	Deep neural networks (DNNs) are known to be vulnerable to adversarial attacks. Adversarial training (AT) is, so far, the only method that can guarantee the robustness of DNNs to adversarial attacks. However, the robustness generalization accuracy achieved by AT is still far lower than the standard generalization accuracy of an undefended model, and there is a known trade-off between the standard generalization accuracy and the robustness generalization accuracy of an adversarially trained model. To improve the trade-off between robustness generalization and standard generalization in AT, we propose a novel defense algorithm called Between-Class Adversarial Training (BCAT) that combines Between-Class learning (BC-learning) with standard AT. Specifically, BCAT mixes two adversarial examples from different classes and trains the model on these mixed between-class adversarial examples instead of the original adversarial examples. We further propose BCAT+, which adopts a more powerful mixing method. BCAT and BCAT+ impose effective regularization on the feature distribution of adversarial examples to enlarge the between-class distance, thus improving both the robustness generalization and the standard generalization performance of AT. The proposed algorithms do not introduce any hyperparameters into standard AT; therefore, no hyperparameter search is needed. We evaluate the proposed algorithms under both white-box and black-box attacks using a spectrum of perturbation values on the CIFAR-10, CIFAR-100, and SVHN datasets. The results indicate that our algorithms achieve better global robustness generalization performance than state-of-the-art adversarial defense methods.
first_indexed	2024-03-11T05:55:26Z
format	Article
id	doaj.art-86558422308a46baa95f5f3f2f56691e
institution	Directory Open Access Journal
issn	1424-8220
language	English
last_indexed	2024-03-11T05:55:26Z
publishDate	2023-03-01
publisher	MDPI AG
record_format	Article
series	Sensors
spelling	doaj.art-86558422308a46baa95f5f3f2f56691e; 2023-11-17T13:48:05Z; eng; MDPI AG; Sensors; 1424-8220; 2023-03-01; 23; 6; 3252; 10.3390/s23063252; Between-Class Adversarial Training for Improving Adversarial Robustness of Image Classification; Desheng Wang (0), Weidong Jin (1), Yunpu Wu (2); (0) School of Electrical Engineering, Southwest Jiaotong University, Chengdu 611756, China; (1) School of Electrical Engineering, Southwest Jiaotong University, Chengdu 611756, China; (2) School of Electrical Engineering and Electronic Information, Xihua University, Chengdu 610039, China; https://www.mdpi.com/1424-8220/23/6/3252; adversarial training; between-class learning; robustness; regularization
title	Between-Class Adversarial Training for Improving Adversarial Robustness of Image Classification
topic	adversarial training; between-class learning; robustness; regularization
url	https://www.mdpi.com/1424-8220/23/6/3252


Similar Items