DNN Intellectual Property Extraction Using Composite Data

As state-of-the-art deep neural networks are being deployed at the core level of increasingly large numbers of AI-based products and services, the incentive for “copying them” (i.e., their intellectual property, manifested through the knowledge that is encapsulated in them) either by adversaries or...

Bibliographic Details
Main Authors:	Itay Mosafi, Eli (Omid) David, Yaniv Altshuler, Nathan S. Netanyahu
Format:	Article
Language:	English
Published:	MDPI AG 2022-02-01
Series:	Entropy
Subjects:	deep learning, cybersecurity, artificial intelligence, swarm intelligence, adversarial AI, information theory
Online Access:	https://www.mdpi.com/1099-4300/24/3/349

_version_	1797471704095653888
author	Itay Mosafi, Eli (Omid) David, Yaniv Altshuler, Nathan S. Netanyahu
author_facet	Itay Mosafi, Eli (Omid) David, Yaniv Altshuler, Nathan S. Netanyahu
author_sort	Itay Mosafi
collection	DOAJ
description	As state-of-the-art deep neural networks are being deployed at the core level of increasingly large numbers of AI-based products and services, the incentive for “copying them” (i.e., their intellectual property, manifested through the knowledge that is encapsulated in them) either by adversaries or commercial competitors is expected to considerably increase over time. The most efficient way to extract or steal knowledge from such networks is by querying them using a large dataset of random samples and recording their output, which is followed by the training of a student network, aiming to eventually mimic these outputs, without making any assumption about the original networks. The most effective way to protect against such a mimicking attack is to answer queries with the classification result only, omitting confidence values associated with the softmax layer. In this paper, we present a novel method for generating composite images for attacking a mentor neural network using a student model. Our method assumes no information regarding the mentor’s training dataset, architecture, or weights. Furthermore, assuming no information regarding the mentor’s softmax output values, our method successfully mimics the given neural network and is capable of stealing large portions (and sometimes all) of its encapsulated knowledge. Our student model achieved 99% relative accuracy to the protected mentor model on the CIFAR-10 test set. In addition, we demonstrate that our student network (which copies the mentor) is impervious to watermarking protection methods and thus would evade being detected as a stolen model by existing dedicated techniques. Our results imply that all current neural networks are vulnerable to mimicking attacks, even if they do not divulge anything but the most basic required output, and that the student model that mimics them cannot be easily detected using currently available techniques.
first_indexed	2024-03-09T19:51:59Z
format	Article
id	doaj.art-31e524d6a75e4b0a930395a8e040c5c9
institution	Directory of Open Access Journals
issn	1099-4300
language	English
last_indexed	2024-03-09T19:51:59Z
publishDate	2022-02-01
publisher	MDPI AG
record_format	Article
series	Entropy
spelling	doaj.art-31e524d6a75e4b0a930395a8e040c5c9; 2023-11-24T01:07:07Z; eng; MDPI AG; Entropy; ISSN 1099-4300; 2022-02-01; vol. 24, no. 3, art. 349; doi:10.3390/e24030349; DNN Intellectual Property Extraction Using Composite Data; Itay Mosafi (Department of Computer Science, Bar-Ilan University, Ramat-Gan 5290002, Israel); Eli (Omid) David (Department of Computer Science, Bar-Ilan University, Ramat-Gan 5290002, Israel); Yaniv Altshuler (MIT Media Lab, 77 Mass. Ave., E14/E15, Cambridge, MA 02139-4307, USA); Nathan S. Netanyahu (Department of Computer Science, Bar-Ilan University, Ramat-Gan 5290002, Israel); https://www.mdpi.com/1099-4300/24/3/349; deep learning; cybersecurity; artificial intelligence; swarm intelligence; adversarial AI; information theory
spellingShingle	Itay Mosafi; Eli (Omid) David; Yaniv Altshuler; Nathan S. Netanyahu; DNN Intellectual Property Extraction Using Composite Data; Entropy; deep learning; cybersecurity; artificial intelligence; swarm intelligence; adversarial AI; information theory
title	DNN Intellectual Property Extraction Using Composite Data
title_full	DNN Intellectual Property Extraction Using Composite Data
title_fullStr	DNN Intellectual Property Extraction Using Composite Data
title_full_unstemmed	DNN Intellectual Property Extraction Using Composite Data
title_short	DNN Intellectual Property Extraction Using Composite Data
title_sort	dnn intellectual property extraction using composite data
topic	deep learning, cybersecurity, artificial intelligence, swarm intelligence, adversarial AI, information theory
url	https://www.mdpi.com/1099-4300/24/3/349
work_keys_str_mv	AT itaymosafi dnnintellectualpropertyextractionusingcompositedata AT eliomiddavid dnnintellectualpropertyextractionusingcompositedata AT yanivaltshuler dnnintellectualpropertyextractionusingcompositedata AT nathansnetanyahu dnnintellectualpropertyextractionusingcompositedata

Similar Items