Interpreting Deep Visual Representations via Network Dissection

The success of recent deep convolutional neural networks (CNNs) depends on learning hidden representations that can summarize the important factors of variation behind the data. In this work, we describe Network Dissection, a method that interprets networks by providing meaningful labels to their individual units. The proposed method quantifies the interpretability of CNN representations by evaluating the alignment between individual hidden units and visual semantic concepts. By identifying the best alignments, units are given interpretable labels spanning colors, materials, textures, parts, objects, and scenes. The method reveals that deep representations are more transparent and interpretable than they would be under a random, equivalently powerful basis. We apply our approach to interpret and compare the latent representations of several network architectures trained to solve a wide range of supervised and self-supervised tasks. We then examine factors affecting network interpretability, such as the number of training iterations, regularization, initialization parameters, and network depth and width. Finally, we show that the interpreted units can be used to provide explicit explanations of a given CNN prediction for an image. Our results highlight that interpretability is an important property of deep neural networks that provides new insights into what hierarchical structures can learn.

Keywords: Convolutional neural networks; Network interpretability; Visual recognition; Interpretable machine learning; Visualization; Detectors; Training; Image color analysis; Task analysis; Image segmentation; Semantics
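The description above summarizes the method at a high level; its quantitative core is an intersection-over-union (IoU) score between a unit's thresholded, upsampled activation maps and pixel-level masks for each candidate concept (the densely annotated Broden dataset in the original work). The sketch below illustrates that scoring step with plain NumPy/SciPy. The function names (`unit_binary_masks`, `dissect_unit`), the array shapes, and the constants (a 0.005 activation quantile and a 0.04 IoU cutoff, as described in the paper) are illustrative assumptions; this is a minimal sketch, not the authors' released implementation.

```python
# Minimal sketch of the unit/concept alignment score behind Network Dissection.
# Names and shapes are illustrative; not the authors' released code.
import numpy as np
from scipy.ndimage import zoom  # upsample low-resolution activation maps


def unit_binary_masks(unit_activations, quantile=0.005, target_hw=(112, 112)):
    """Threshold one unit's activation maps at its top `quantile` level
    (computed over all spatial locations across the probe dataset) and
    upsample them to the resolution of the segmentation annotations."""
    acts = np.asarray(unit_activations, dtype=np.float32)      # (N, h, w)
    t_k = np.quantile(acts, 1.0 - quantile)                     # per-unit threshold T_k
    scale = (1, target_hw[0] / acts.shape[1], target_hw[1] / acts.shape[2])
    upsampled = zoom(acts, scale, order=1)                      # (N, H, W)
    return upsampled > t_k                                      # boolean activation masks


def dissect_unit(unit_activations, concept_masks, iou_threshold=0.04):
    """Return (best_concept, best_iou): the concept whose segmentation masks
    overlap the unit's thresholded activation masks with the highest IoU,
    or (None, best_iou) if no concept clears `iou_threshold`."""
    unit_masks = unit_binary_masks(unit_activations)
    best_concept, best_iou = None, 0.0
    for concept, masks in concept_masks.items():                # masks: (N, H, W) bool
        inter = np.logical_and(unit_masks, masks).sum()
        union = np.logical_or(unit_masks, masks).sum()
        iou = inter / union if union > 0 else 0.0
        if iou > best_iou:
            best_concept, best_iou = concept, iou
    return (best_concept, best_iou) if best_iou >= iou_threshold else (None, best_iou)
```

In practice the activations come from a forward pass of the probe dataset through the trained network; units whose best IoU falls below the cutoff are left unlabeled, and counting the units that do receive a concept label is one way the paper summarizes the interpretability of a layer.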

Bibliographic Details
Main Authors: Zhou, Bolei; Bau, David; Oliva, Aude; Torralba, Antonio
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers, 2019
Subjects: Computational Theory and Mathematics; Software; Applied Mathematics; Artificial Intelligence; Computer Vision and Pattern Recognition
Online Access: https://hdl.handle.net/1721.1/122817
Citation
Zhou, Bolei, et al. "Interpreting Deep Visual Representations via Network Dissection." IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 9 (September 2019): pp. 2131-2145. © 2019 Institute of Electrical and Electronics Engineers.

DOI: http://dx.doi.org/10.1109/tpami.2018.2858759
ISSN: 0162-8828; 2160-9292; 1939-3539
Source: arXiv
License: Creative Commons Attribution-Noncommercial-Share Alike, http://creativecommons.org/licenses/by-nc-sa/4.0/

Funding
United States. Defense Advanced Research Projects Agency (FA8750-18-C-0004); National Science Foundation (U.S.) (Grants 1524817 and 1532591); United States. Office of Naval Research (Grant N00014-16-1-3116); Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Google; Amazon.com; NVIDIA Corporation; Facebook.