Teacher Guided Architecture Search


Bibliographic Details
Main Authors: Bashivan, Pouya, Tensen, Mark, Dicarlo, James
Other Authors: Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences
Format: Article
Language: English
Published: IEEE 2021
Online Access: https://hdl.handle.net/1721.1/135814
Description: © 2019 IEEE. Much of the recent improvement in neural networks for computer vision has resulted from the discovery of new network architectures. Most prior work has used the performance of candidate models after limited training to guide the search automatically and feasibly. Could further gains in computational efficiency be achieved by guiding the search via measurements of a high-performing network with unknown detailed architecture (e.g. the primate visual system)? As one step toward this goal, we use representational similarity analysis to evaluate the similarity of the internal activations of candidate networks to those of a (fixed, high-performing) teacher network. We show that adopting this evaluation metric can improve search efficiency by up to an order of magnitude over performance-guided methods. Our approach finds a convolutional cell structure with performance similar to that previously found by other methods, but at a total computational cost two orders of magnitude lower than Neural Architecture Search (NAS) and more than four times lower than Progressive Neural Architecture Search (PNAS). We further show that measurements from only ∼300 neurons in the primate visual system provide enough signal to find a network with an ImageNet top-1 error significantly lower than that achieved by performance-guided architecture search alone. These results suggest that representational matching can be used to accelerate network architecture search in cases where one has access to some or all of the internal representations of a teacher network of interest, such as the brain's sensory processing networks.
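The abstract describes scoring candidate networks by representational similarity analysis (RSA) against a fixed teacher. As a minimal illustrative sketch only (not the authors' implementation; the function names, the correlation-distance RDM, and the Spearman comparison are assumptions reflecting common RSA practice), one way to compare two sets of activations is to build a representational dissimilarity matrix (RDM) for each and correlate their upper triangles:

```python
import numpy as np
from scipy.stats import spearmanr

def rdm(activations):
    """Representational dissimilarity matrix: 1 - Pearson correlation
    between the activation patterns of every pair of stimuli.
    activations: array of shape (n_stimuli, n_features)."""
    return 1.0 - np.corrcoef(activations)

def rsa_score(candidate_acts, teacher_acts):
    """Spearman rank correlation between the upper triangles of the two
    RDMs; higher means the candidate's representational geometry is
    closer to the teacher's."""
    iu = np.triu_indices(candidate_acts.shape[0], k=1)
    rho, _ = spearmanr(rdm(candidate_acts)[iu], rdm(teacher_acts)[iu])
    return rho

# Toy usage: a candidate whose geometry resembles the teacher's should
# outscore an unrelated random network on the same 20 stimuli.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(20, 300))             # e.g. ~300 recorded units
matched = teacher @ rng.normal(size=(300, 50))   # random projection, geometry roughly kept
random_net = rng.normal(size=(20, 50))           # unrelated representation
```

Because RSA compares dissimilarity structure rather than raw activations, the candidate and teacher need not have the same number of units, which is what makes comparison against a few hundred recorded neurons possible.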
Record ID: mit-1721.1/135814
Institution: Massachusetts Institute of Technology
Other Affiliation: McGovern Institute for Brain Research at MIT
Type: Conference Paper (http://purl.org/eprint/type/ConferencePaper)
DOI: 10.1109/ICCV.2019.00542
Published in: Proceedings of the IEEE International Conference on Computer Vision
Date Issued: 2019
Date Available: 2021-10-27
License: Creative Commons Attribution-Noncommercial-Share Alike (http://creativecommons.org/licenses/by-nc-sa/4.0/)
Physical Format: application/pdf
Source: arXiv