Optimized Adversarial Example With Classification Score Pattern Vulnerability Removed

Bibliographic Details
Main Authors: Hyun Kwon, Kyoungmin Ko, Sunghwan Kim
Format: Article
Language: English
Published: IEEE 2022-01-01
Series: IEEE Access
Subjects: Neural network, evasion attack, classification score, optimization
Online Access: https://ieeexplore.ieee.org/document/9579036/
_version_ 1811342171780939776
author Hyun Kwon
Kyoungmin Ko
Sunghwan Kim
author_facet Hyun Kwon
Kyoungmin Ko
Sunghwan Kim
author_sort Hyun Kwon
collection DOAJ
description Neural networks perform very well on recognition tasks such as image recognition and speech recognition, as well as on pattern analysis and other tasks in fields related to artificial intelligence. However, neural networks are vulnerable to adversarial examples. An adversarial example is a sample, created by applying a minimal perturbation to a legitimate sample, that is designed to be misclassified by a target model even though it poses no problem for recognition by humans. Because the perturbation applied to the legitimate sample is optimized, the classification score for the target class tends to be similar to that for the legitimate class: the minimal perturbation is applied only until the target class's score is slightly higher than the legitimate class's score. Given this regularity in the classification scores, an optimized adversarial example is easy to detect by looking for the pattern. However, existing methods for generating optimized adversarial examples do not account for this weakness, namely detectability through the classification score pattern. To address it, we propose a generation method for optimized adversarial examples that removes the classification score pattern. In the proposed method, a minimal perturbation is applied to a legitimate sample until the classification score for the legitimate class is lower than the scores for several of the other classes, producing an optimized adversarial example with the pattern vulnerability removed. The results show that, using 500 iterations, the proposed method generates optimized adversarial examples with a 100% attack success rate and distortions of 2.81 and 2.23 on MNIST and Fashion-MNIST, respectively.
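
To make the loss design described above concrete, here is a minimal PyTorch sketch, not the authors' implementation: a standard optimized attack stops once the target class barely overtakes the legitimate class, whereas this variant also drives the legitimate class's score below the k-th highest of the other classes. The names `model` and `make_adversarial`, the rank margin `k`, and the 0.1 distortion weight are illustrative assumptions, not taken from the paper.

# Minimal sketch of the score-pattern-free optimized attack described above.
# Assumptions: `model` is a trained classifier returning logits for a batch
# of one image; k and the loss weights are illustrative hyperparameters.
import torch
import torch.nn.functional as F

def make_adversarial(model, x, legit_class, target_class, k=3, steps=500, lr=0.01):
    """Perturb x so that (1) it is classified as target_class and (2) the
    legitimate class's logit falls below at least k other classes, removing
    the 'target score slightly above legitimate score' detection pattern."""
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):  # fixed iteration budget, as in the 500-iteration setup
        logits = model(x + delta)
        # Loss 1: make the target class the top-scoring class.
        target_loss = F.cross_entropy(logits, torch.tensor([target_class]))
        # Loss 2: push the legitimate class's logit below the k-th highest
        # logit among the other classes, so it is no longer the runner-up.
        others = torch.cat([logits[0, :legit_class], logits[0, legit_class + 1:]])
        kth = torch.topk(others, k).values[-1]
        rank_loss = F.relu(logits[0, legit_class] - kth)
        # Loss 3: keep the perturbation small (L2 distortion).
        dist_loss = delta.norm()
        loss = target_loss + rank_loss + 0.1 * dist_loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    # Clipping to the valid pixel range is omitted for brevity.
    return (x + delta).detach()

A detector that flags samples whose top two scores are suspiciously close (target just above legitimate) would no longer match such an example, since the legitimate class has been pushed out of the runner-up position.
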
first_indexed 2024-04-13T19:06:47Z
format Article
id doaj.art-3ddef268e7414cc09740b23be9f19310
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-04-13T19:06:47Z
publishDate 2022-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling Optimized Adversarial Example With Classification Score Pattern Vulnerability Removed / Hyun Kwon (https://orcid.org/0000-0003-1169-9892; Department of Artificial Intelligence and Data Science, Korea Military Academy, Seoul, South Korea), Kyoungmin Ko (https://orcid.org/0000-0003-1666-6977; Department of Applied Statistics, Konkuk University, Seoul, South Korea), Sunghwan Kim (Department of Applied Statistics, Konkuk University, Seoul, South Korea). IEEE Access, vol. 10, pp. 35804-35813, 2022-01-01. ISSN: 2169-3536. DOI: 10.1109/ACCESS.2021.3110473. Article no. 9579036. Language: eng. Publisher: IEEE. Record: doaj.art-3ddef268e7414cc09740b23be9f19310 (updated 2022-12-22T02:33:57Z). Online access: https://ieeexplore.ieee.org/document/9579036/. Topics: Neural network; evasion attack; classification score; optimization.
spellingShingle Hyun Kwon
Kyoungmin Ko
Sunghwan Kim
Optimized Adversarial Example With Classification Score Pattern Vulnerability Removed
IEEE Access
Neural network
evasion attack
classification score
optimization
title Optimized Adversarial Example With Classification Score Pattern Vulnerability Removed
title_full Optimized Adversarial Example With Classification Score Pattern Vulnerability Removed
title_fullStr Optimized Adversarial Example With Classification Score Pattern Vulnerability Removed
title_full_unstemmed Optimized Adversarial Example With Classification Score Pattern Vulnerability Removed
title_short Optimized Adversarial Example With Classification Score Pattern Vulnerability Removed
title_sort optimized adversarial example with classification score pattern vulnerability removed
topic Neural network
evasion attack
classification score
optimization
url https://ieeexplore.ieee.org/document/9579036/
work_keys_str_mv AT hyunkwon optimizedadversarialexamplewithclassificationscorepatternvulnerabilityremoved
AT kyoungminko optimizedadversarialexamplewithclassificationscorepatternvulnerabilityremoved
AT sunghwankim optimizedadversarialexamplewithclassificationscorepatternvulnerabilityremoved