Multi-Fidelity Neural Architecture Search With Knowledge Distillation
Neural architecture search (NAS) aims to find the optimal architecture of a neural network for a problem or a family of problems. Evaluations of neural architectures are very time-consuming. One way to mitigate this issue is to use low-fidelity evaluations, namely training on...
Main Authors: | Ilya Trofimov, Nikita Klyuchnikov, Mikhail Salnikov, Alexander Filippov, Evgeny Burnaev |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2023-01-01 |
Series: | IEEE Access |
Subjects: | Bayesian optimization; knowledge distillation; multi-fidelity optimization; neural architecture search |
Online Access: | https://ieeexplore.ieee.org/document/10007805/ |
_version_ | 1797799273360785408 |
author | Ilya Trofimov, Nikita Klyuchnikov, Mikhail Salnikov, Alexander Filippov, Evgeny Burnaev |
author_sort | Ilya Trofimov |
collection | DOAJ |
description | Neural architecture search (NAS) aims to find the optimal architecture of a neural network for a problem or a family of problems. Evaluations of neural architectures are very time-consuming. One way to mitigate this issue is to use low-fidelity evaluations, namely training on part of a dataset, for fewer epochs, with fewer channels, etc. In this paper, we propose a Bayesian multi-fidelity (MF) method for neural architecture search: MF-KD. The method relies on a new approach to low-fidelity evaluations of neural architectures: training for a few epochs using knowledge distillation (KD). Knowledge distillation adds to the loss function a term that forces the network to mimic a teacher network. We carry out experiments on CIFAR-10, CIFAR-100, and ImageNet-16-120. We show that training for a few epochs with such a modified loss function leads to a better selection of neural architectures than training for a few epochs with a logistic loss. The proposed method outperforms several state-of-the-art baselines. |
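For readers unfamiliar with the distillation term mentioned in the abstract, the sketch below shows one common formulation of a knowledge-distillation loss: cross-entropy on the true labels plus a KL-divergence term that pushes the student's softened outputs toward the teacher's. The temperature, weighting, and function names here are illustrative assumptions, not the exact MF-KD configuration described in the paper.

```python
# Minimal sketch of a knowledge-distillation loss of the kind the abstract
# describes (hard-label term plus a term forcing the student to mimic a
# teacher). The temperature and alpha weighting are illustrative assumptions,
# not the values used in MF-KD.
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, temperature=4.0, alpha=0.5):
    """Cross-entropy on true labels plus KL divergence to the teacher's softened outputs."""
    hard = F.cross_entropy(student_logits, labels)
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)
    return alpha * hard + (1.0 - alpha) * soft
```

In a low-fidelity evaluation of the kind the abstract discusses, a candidate architecture would be trained for only a few epochs with such a loss, and the resulting score used to rank architectures.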
first_indexed | 2024-03-13T04:17:27Z |
format | Article |
id | doaj.art-2aea4af24d8d4f0fa7042fffa9540e20 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-03-13T04:17:27Z |
publishDate | 2023-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | IEEE Access, vol. 11, pp. 59217-59225, 2023-01-01; DOI: 10.1109/ACCESS.2023.3234810; IEEE document 10007805; ISSN 2169-3536; record doaj.art-2aea4af24d8d4f0fa7042fffa9540e20, indexed 2023-06-20T23:00:27Z. Multi-Fidelity Neural Architecture Search With Knowledge Distillation. Authors: Ilya Trofimov (https://orcid.org/0000-0002-2961-7368), Nikita Klyuchnikov (https://orcid.org/0000-0001-5065-4000), Mikhail Salnikov, Alexander Filippov (https://orcid.org/0000-0002-9826-2425), Evgeny Burnaev (https://orcid.org/0000-0001-8424-0690). Affiliations: Skolkovo Institute of Science and Technology, Moscow, Russia (Trofimov, Klyuchnikov, Salnikov, Burnaev); Huawei, Moscow, Russia (Filippov). Subjects: Bayesian optimization; knowledge distillation; multi-fidelity optimization; neural architecture search. Online access: https://ieeexplore.ieee.org/document/10007805/ |
title | Multi-Fidelity Neural Architecture Search With Knowledge Distillation |
title_sort | multi fidelity neural architecture search with knowledge distillation |
topic | Bayesian optimization; knowledge distillation; multi-fidelity optimization; neural architecture search |
url | https://ieeexplore.ieee.org/document/10007805/ |