CANARY: An Adversarial Robustness Evaluation Platform for Deep Learning Models on Image Classification

Bibliographic Details
Main Authors: Jiazheng Sun, Li Chen, Chenxiao Xia, Da Zhang, Rong Huang, Zhi Qiu, Wenqi Xiong, Jun Zheng, Yu-An Tan
Format: Article
Language: English
Published: MDPI AG 2023-08-01
Series: Electronics
Online Access: https://www.mdpi.com/2079-9292/12/17/3665
Description: The vulnerability of deep-learning-based image classification models to erroneous conclusions in the presence of small perturbations crafted by attackers has prompted attention to the question of the models’ robustness level. However, the question of how to comprehensively and fairly measure the adversarial robustness of models with different structures and defenses as well as the performance of different attack methods has never been accurately answered. In this work, we present the design, implementation, and evaluation of Canary, a platform that aims to answer this question. Canary uses a common scoring framework that includes 4 dimensions with 26 (sub)metrics for evaluation. First, Canary generates and selects valid adversarial examples and collects metrics data through a series of tests. Then it uses a two-way evaluation strategy to guide the data organization and finally integrates all the data to give the scores for model robustness and attack effectiveness. In this process, we use Item Response Theory (IRT) for the first time to ensure that all the metrics can be fairly calculated into a score that can visually measure the capability. In order to fully demonstrate the effectiveness of Canary, we conducted large-scale testing of 15 representative models trained on the ImageNet dataset using 12 white-box attacks and 12 black-box attacks and came up with a series of in-depth and interesting findings. This further illustrates the capabilities and strengths of Canary as a benchmarking platform. Our paper provides an open-source framework for model robustness evaluation, allowing researchers to perform comprehensive and rapid evaluations of models or attack/defense algorithms, thus inspiring further improvements and greatly benefiting future work.
ISSN: 2079-9292
Institution: Directory of Open Access Journals
DOI: 10.3390/electronics12173665
Citation: Electronics, vol. 12, no. 17, article 3665 (2023-08-01)
Author Affiliations:
School of Cyberspace Science & Technology, Beijing Institute of Technology, Beijing 100081, China: Jiazheng Sun, Da Zhang, Rong Huang, Zhi Qiu, Wenqi Xiong, Jun Zheng, Yu-An Tan
School of Computer Science & Technology, Beijing Institute of Technology, Beijing 100081, China: Li Chen, Chenxiao Xia
Keywords: AI security; adversarial robustness evaluation; adversarial attack; deep model