Teacher Guided Architecture Search


Bibliographic Details
Main Authors: Bashivan, Pouya, Tensen, Mark, Dicarlo, James
Other Authors: Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences
Format: Article
Language: English
Published: IEEE 2021
Online Access: https://hdl.handle.net/1721.1/135814
Description: © 2019 IEEE. Much of the recent improvement in neural networks for computer vision has resulted from the discovery of new network architectures. Most prior work has used the performance of candidate models after limited training to guide the search automatically and feasibly. Could further gains in computational efficiency be achieved by guiding the search via measurements of a high-performing network with unknown detailed architecture (e.g. the primate visual system)? As one step toward this goal, we use representational similarity analysis to evaluate the similarity of the internal activations of candidate networks to those of a (fixed, high-performing) teacher network. We show that adopting this evaluation metric can improve search efficiency by up to an order of magnitude over performance-guided methods. Our approach finds a convolutional cell structure with performance similar to that previously found by other methods, but at a total computational cost two orders of magnitude lower than Neural Architecture Search (NAS) and more than four times lower than Progressive Neural Architecture Search (PNAS). We further show that measurements from only ∼300 neurons in the primate visual system provide enough signal to find a network with an ImageNet top-1 error significantly lower than that achieved by performance-guided architecture search alone. These results suggest that representational matching can be used to accelerate network architecture search in cases where one has access to some or all of the internal representations of a teacher network of interest, such as the brain's sensory processing networks.
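The abstract describes scoring candidate networks by representational similarity analysis (RSA) against a fixed teacher. As a minimal illustrative sketch only (not the authors' implementation; the function names, the correlation-distance RDM, and the Spearman comparison are assumptions reflecting common RSA practice), one way to compare two sets of activations is to build a representational dissimilarity matrix (RDM) for each and correlate their upper triangles:

```python
import numpy as np
from scipy.stats import spearmanr

def rdm(activations):
    """Representational dissimilarity matrix: 1 - Pearson correlation
    between the activation patterns of every pair of stimuli.
    activations: array of shape (n_stimuli, n_features)."""
    return 1.0 - np.corrcoef(activations)

def rsa_score(candidate_acts, teacher_acts):
    """Spearman rank correlation between the upper triangles of the two
    RDMs; higher means the candidate's representational geometry is
    closer to the teacher's."""
    iu = np.triu_indices(candidate_acts.shape[0], k=1)
    rho, _ = spearmanr(rdm(candidate_acts)[iu], rdm(teacher_acts)[iu])
    return rho

# Toy usage: a candidate whose geometry resembles the teacher's should
# outscore an unrelated random network on the same 20 stimuli.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(20, 300))             # e.g. ~300 recorded units
matched = teacher @ rng.normal(size=(300, 50))   # random projection, geometry roughly kept
random_net = rng.normal(size=(20, 50))           # unrelated representation
```

Because RSA compares dissimilarity structure rather than raw activations, the candidate and teacher need not have the same number of units, which is what makes comparison against a few hundred recorded neurons possible.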
Record ID: mit-1721.1/135814
Institution: Massachusetts Institute of Technology
Other Affiliation: McGovern Institute for Brain Research at MIT
Type: Conference Paper (http://purl.org/eprint/type/ConferencePaper)
DOI: 10.1109/ICCV.2019.00542
Published in: Proceedings of the IEEE International Conference on Computer Vision
Date Issued: 2019
Date Available: 2021-10-27
License: Creative Commons Attribution-Noncommercial-Share Alike (http://creativecommons.org/licenses/by-nc-sa/4.0/)
Physical Format: application/pdf
Source: arXiv