Optimized Adversarial Example With Classification Score Pattern Vulnerability Removed
Neural networks perform exceptionally well on recognition tasks such as image recognition and speech recognition, as well as on pattern analysis and other tasks in fields related to artificial intelligence. However, neural networks are vulnerable to adversarial examples. An adversarial example is a sample, created by applying a minimal perturbation to a legitimate sample, that is designed to be misclassified by a target model while posing no problem for recognition by humans.
Main Authors: | Hyun Kwon, Kyoungmin Ko, Sunghwan Kim |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2022-01-01 |
Series: | IEEE Access |
Subjects: | Neural network; evasion attack; classification score; optimization |
Online Access: | https://ieeexplore.ieee.org/document/9579036/ |
_version_ | 1811342171780939776 |
author | Hyun Kwon; Kyoungmin Ko; Sunghwan Kim |
author_facet | Hyun Kwon; Kyoungmin Ko; Sunghwan Kim |
author_sort | Hyun Kwon |
collection | DOAJ |
description | Neural networks perform exceptionally well on recognition tasks such as image recognition and speech recognition, as well as on pattern analysis and other tasks in fields related to artificial intelligence. However, neural networks are vulnerable to adversarial examples. An adversarial example is a sample, created by applying a minimal perturbation to a legitimate sample, that is designed to be misclassified by a target model while posing no problem for recognition by humans. Because the perturbation applied to the legitimate sample is optimized, the classification score for the target class tends to be similar to that for the legitimate class: minimal perturbations are applied only until the target class's score is slightly higher than the legitimate class's. Given this regularity, an optimized adversarial example is easy to detect by looking for the near-tie pattern in the classification scores. However, existing methods for generating optimized adversarial examples do not account for this weakness and so remain detectable through the score pattern. To address this, we propose an optimized adversarial example generation method that removes the classification score pattern. In the proposed method, a minimal perturbation is applied to a legitimate sample such that the classification score for the legitimate class falls below the scores of some of the other classes, yielding an optimized adversarial example with the pattern vulnerability removed. The results show that, within 500 iterations, the proposed method generates optimized adversarial examples with a 100% attack success rate and distortions of 2.81 and 2.23 on MNIST and Fashion-MNIST, respectively. |
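The stopping criterion described in the abstract can be sketched in code. The snippet below is a minimal illustration only, not the paper's implementation: it substitutes a toy softmax-over-random-linear-logits "model" for a trained network, and the names (`scores`, `margin_grad`, `attack`) and step sizes are hypothetical. A conventional optimized attack stops as soon as the target score edges past the legitimate score, leaving two nearly tied top scores; the variant here keeps perturbing until the legitimate class also falls below at least `k` other classes, removing that tell-tale pattern.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained classifier: softmax over fixed random
# linear logits (the paper attacks CNNs on MNIST / Fashion-MNIST).
W = rng.normal(size=(10, 784)) * 0.05

def scores(x):
    """Classification scores (softmax probabilities) for input x."""
    z = W @ x
    e = np.exp(z - z.max())
    return e / e.sum()

def margin_grad(x, target, legit):
    """Gradient of scores[target] - scores[legit] w.r.t. x.
    For softmax over linear logits: d p_k / d x = p_k * (W_k - sum_j p_j W_j)."""
    p = scores(x)
    mean_w = p @ W
    return p[target] * (W[target] - mean_w) - p[legit] * (W[legit] - mean_w)

def attack(x, legit, target, step=1.0, max_iter=500, k=3):
    """Perturb x toward the target class by gradient ascent on the
    score margin. A conventional optimized attack would stop as soon as
    scores[target] > scores[legit] (a near-tie); here we stop only when
    the model outputs the target class AND the legitimate class scores
    below at least k other classes, so the near-tie pattern is removed."""
    x_adv = x.copy()
    for _ in range(max_iter):
        p = scores(x_adv)
        if np.argmax(p) == target and np.sum(p > p[legit]) >= k:
            return x_adv, True  # pattern-free adversarial example found
        x_adv = x_adv + step * margin_grad(x_adv, target, legit)
    return x_adv, False
```

The perturbation here is plain gradient ascent on the score margin; the paper's method additionally minimizes distortion (reported as 2.81 on MNIST and 2.23 on Fashion-MNIST), which this sketch does not model.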
first_indexed | 2024-04-13T19:06:47Z |
format | Article |
id | doaj.art-3ddef268e7414cc09740b23be9f19310 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-13T19:06:47Z |
publishDate | 2022-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-3ddef268e7414cc09740b23be9f19310 | 2022-12-22T02:33:57Z | eng | IEEE | IEEE Access | 2169-3536 | 2022-01-01 | vol. 10, pp. 35804–35813 | 10.1109/ACCESS.2021.3110473 | 9579036 | Optimized Adversarial Example With Classification Score Pattern Vulnerability Removed | Hyun Kwon (https://orcid.org/0000-0003-1169-9892), Department of Artificial Intelligence and Data Science, Korea Military Academy, Seoul, South Korea; Kyoungmin Ko (https://orcid.org/0000-0003-1666-6977), Department of Applied Statistics, Konkuk University, Seoul, South Korea; Sunghwan Kim, Department of Applied Statistics, Konkuk University, Seoul, South Korea | https://ieeexplore.ieee.org/document/9579036/ | Neural network; evasion attack; classification score; optimization |
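The classification score pattern this record's abstract describes can be checked by a trivial detector: a conventional minimally perturbed attack leaves the top two scores nearly tied. The sketch below is a hypothetical illustration; the function name and threshold are not from the paper.

```python
import numpy as np

def near_tie(p, threshold=0.1):
    """Flag a score vector whose top two classes are nearly tied,
    the pattern left by a conventional optimized adversarial example."""
    top2 = np.sort(p)[-2:]
    return bool(top2[1] - top2[0] < threshold)

# Conventional optimized adversarial example: target barely beats legitimate.
print(near_tie(np.array([0.02, 0.46, 0.44, 0.08])))  # True
# Clean, confidently classified sample: no such tie.
print(near_tie(np.array([0.90, 0.04, 0.03, 0.03])))  # False
```

A sample generated by the proposed method would pass this check, since the legitimate class is pushed below several other classes rather than left in a near-tie with the target.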
spellingShingle | Hyun Kwon; Kyoungmin Ko; Sunghwan Kim | Optimized Adversarial Example With Classification Score Pattern Vulnerability Removed | IEEE Access | Neural network; evasion attack; classification score; optimization |
title | Optimized Adversarial Example With Classification Score Pattern Vulnerability Removed |
title_full | Optimized Adversarial Example With Classification Score Pattern Vulnerability Removed |
title_fullStr | Optimized Adversarial Example With Classification Score Pattern Vulnerability Removed |
title_full_unstemmed | Optimized Adversarial Example With Classification Score Pattern Vulnerability Removed |
title_short | Optimized Adversarial Example With Classification Score Pattern Vulnerability Removed |
title_sort | optimized adversarial example with classification score pattern vulnerability removed |
topic | Neural network; evasion attack; classification score; optimization |
url | https://ieeexplore.ieee.org/document/9579036/ |
work_keys_str_mv | AT hyunkwon optimizedadversarialexamplewithclassificationscorepatternvulnerabilityremoved AT kyoungminko optimizedadversarialexamplewithclassificationscorepatternvulnerabilityremoved AT sunghwankim optimizedadversarialexamplewithclassificationscorepatternvulnerabilityremoved |