Interpreting neural network judgments via minimal, stable, and symbolic corrections
© 2018 Curran Associates Inc. All rights reserved. We present a new algorithm to generate minimal, stable, and symbolic corrections to an input that will cause a neural network with ReLU activations to change its output. We argue that such a correction is a useful way to provide feedback to a user w...
Main Authors: | Solar Lezama, Armando; Singh, Rishabh; Zhang, Xin |
---|---|
Other Authors: | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
Format: | Article |
Language: | English |
Published: | 2021 |
Online Access: | https://hdl.handle.net/1721.1/137906 |
_version_ | 1826197579135713280 |
---|---|
author | Solar Lezama, Armando; Singh, Rishabh; Zhang, Xin |
author2 | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
collection | MIT |
description | © 2018 Curran Associates Inc. All rights reserved. We present a new algorithm to generate minimal, stable, and symbolic corrections to an input that will cause a neural network with ReLU activations to change its output. We argue that such a correction is a useful way to provide feedback to a user when the network's output is different from a desired output. Our algorithm generates such a correction by solving a series of linear constraint satisfaction problems. The technique is evaluated on three neural network models: one predicting whether an applicant will pay a mortgage, one predicting whether a first-order theorem can be proved efficiently by a solver using certain heuristics, and the final one judging whether a drawing is an accurate rendition of a canonical drawing of a cat. |
first_indexed | 2024-09-23T10:49:48Z |
format | Article |
id | mit-1721.1/137906 |
institution | Massachusetts Institute of Technology |
language | English |
last_indexed | 2024-09-23T10:49:48Z |
publishDate | 2021 |
record_format | dspace |
spelling | mit-1721.1/137906; 2023-02-01T21:51:38Z; Interpreting neural network judgments via minimal, stable, and symbolic corrections; Solar Lezama, Armando; Singh, Rishabh; Zhang, Xin; Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; 2021-11-09T15:09:50Z; 2021-11-09T15:09:50Z; 2018; 2019-07-10T13:22:05Z; Article; http://purl.org/eprint/type/ConferencePaper; https://hdl.handle.net/1721.1/137906; Solar Lezama, Armando, Singh, Rishabh and Zhang, Xin. 2018. "Interpreting neural network judgments via minimal, stable, and symbolic corrections."; en; https://papers.nips.cc/paper/7736-interpreting-neural-network-judgments-via-minimal-stable-and-symbolic-corrections; Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.; application/pdf; Neural Information Processing Systems (NIPS) |
title | Interpreting neural network judgments via minimal, stable, and symbolic corrections |
url | https://hdl.handle.net/1721.1/137906 |