Certifiers make neural networks vulnerable to availability attacks

To achieve reliable, robust, and safe AI systems, it is vital to implement fallback strategies when AI predictions cannot be trusted. Certifiers for neural networks are a reliable way to check the robustness of these predictions: they guarantee for some predictions that a certain class of manipulations or attacks could not have changed the outcome. For the remaining predictions without such guarantees, the method abstains from making a prediction, and a fallback strategy needs to be invoked, which typically incurs additional costs, may require a human operator, or may even fail to provide any prediction. While this is a key concept towards safe and secure AI, we show for the first time that this approach comes with its own security risks, as such fallback strategies can be deliberately triggered by an adversary. In addition to naturally occurring abstentions for some inputs and perturbations, the adversary can use training-time attacks to deliberately trigger the fallback with high probability. This transfers the main system load onto the fallback, reducing the overall system's integrity and/or availability. We design two novel availability attacks, which show the practical relevance of these threats. For example, adding 1% poisoned data during training is sufficient to trigger the fallback and hence make the model unavailable for up to 100% of all inputs once the trigger is inserted. Our extensive experiments across multiple datasets, model architectures, and certifiers demonstrate the broad applicability of these attacks. An initial investigation into potential defenses shows that current approaches are insufficient to mitigate the issue, highlighting the need for new, specific solutions.
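The mechanism described above, certify-or-abstain with every abstention routed to a costly fallback, can be illustrated with a short sketch. The code below is purely hypothetical: certify, fallback, toy_model, the margin threshold, and the TRIGGER marker are all invented for illustration and are not the paper's attacks or any specific certifier; the sketch only shows how forcing abstentions shifts the serving load onto the fallback path.

```python
# Hypothetical sketch (not the paper's code): a certify-or-abstain pipeline
# where every abstention is handled by a costly fallback path.

def certify(x, model, radius=0.1):
    """Toy stand-in for a robustness certifier.

    A real certifier (e.g. randomized smoothing or interval bound propagation)
    would prove that no perturbation within `radius` changes the prediction.
    Here we approximate that with the model's confidence margin.
    """
    label, margin = model(x)
    return label, margin >= radius  # certified only if the margin is large enough

def fallback(x):
    """Stand-in for the expensive fallback (human review, slower model, ...)."""
    return "needs manual review"

def predict_with_fallback(x, model):
    label, certified = certify(x, model)
    return label if certified else fallback(x)

# Toy model: inputs carrying the trigger get a tiny margin, so the certifier
# abstains on them -- the effect a poisoning-based availability attack targets.
TRIGGER = "#"

def toy_model(x):
    margin = 0.01 if TRIGGER in x else 0.5
    return "cat", margin

clean = ["img_a", "img_b", "img_c"]
triggered = [s + TRIGGER for s in clean]

print([predict_with_fallback(x, toy_model) for x in clean])      # certified labels
print([predict_with_fallback(x, toy_model) for x in triggered])  # all fall back
```

Every trigger-carrying input is routed to the fallback, which is the load-shifting effect the attacks exploit; in the paper this abstain behaviour is induced at training time through data poisoning rather than hard-coded as in this sketch.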


Bibliographic Details
Main Authors: Lorenz, T; Kwiatkowska, M; Fritz, M
Format: Conference item
Language: English
Published: Association for Computing Machinery, 2023
Institution: University of Oxford