Multi-Fidelity Neural Architecture Search With Knowledge Distillation

Neural architecture search (NAS) aims to find the optimal architecture of a neural network for a problem or a family of problems. Evaluating neural architectures is very time-consuming. One possible way to mitigate this issue is to use low-fidelity evaluations: training on part of the dataset, for fewer epochs, with fewer channels, etc. In this paper, we propose a Bayesian multi-fidelity (MF) method for neural architecture search, MF-KD. The method relies on a new approach to low-fidelity evaluations of neural architectures: training for a few epochs using knowledge distillation (KD). Knowledge distillation adds to the loss function a term that forces the network to mimic a teacher network. We carry out experiments on CIFAR-10, CIFAR-100, and ImageNet-16-120 and show that training for a few epochs with this modified loss function leads to a better selection of neural architectures than training for a few epochs with a logistic loss. The proposed method outperforms several state-of-the-art baselines.
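As a rough illustration of the loss modification the abstract describes (a minimal sketch, not the authors' implementation), a standard knowledge-distillation objective combines the ordinary cross-entropy with a KL-divergence term that pulls the student's temperature-softened predictions toward the teacher's. The temperature and mixing weight below are assumed placeholder values, not hyperparameters from the paper.

import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, targets, temperature=4.0, alpha=0.9):
    # Hinton-style distillation objective, shown only to illustrate "mimicking a teacher".
    ce = F.cross_entropy(student_logits, targets)                # ordinary logistic / cross-entropy loss
    kl = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),      # student log-probabilities (softened)
        F.softmax(teacher_logits / temperature, dim=1),          # teacher probabilities (softened)
        reduction="batchmean",
    ) * (temperature ** 2)                                       # rescale so gradient magnitude stays comparable
    return alpha * kl + (1.0 - alpha) * ce                       # mix of teacher-mimicry and true-label terms

In the multi-fidelity setting the paper studies, such cheap few-epoch KD-trained evaluations would serve as the low-fidelity signal guiding the Bayesian search over architectures.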

Bibliographic Details
Main Authors: Ilya Trofimov, Nikita Klyuchnikov, Mikhail Salnikov, Alexander Filippov, Evgeny Burnaev
Format: Article
Language: English
Published: IEEE, 2023-01-01
Series: IEEE Access
Subjects: Bayesian optimization, knowledge distillation, multi-fidelity optimization, neural architecture search
Online Access: https://ieeexplore.ieee.org/document/10007805/
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2023.3234810
Published in: IEEE Access, vol. 11, pp. 59217-59225, 2023
Author Affiliations:
Ilya Trofimov (Skolkovo Institute of Science and Technology, Moscow, Russia), ORCID: https://orcid.org/0000-0002-2961-7368
Nikita Klyuchnikov (Skolkovo Institute of Science and Technology, Moscow, Russia), ORCID: https://orcid.org/0000-0001-5065-4000
Mikhail Salnikov (Skolkovo Institute of Science and Technology, Moscow, Russia)
Alexander Filippov (Huawei, Moscow, Russia), ORCID: https://orcid.org/0000-0002-9826-2425
Evgeny Burnaev (Skolkovo Institute of Science and Technology, Moscow, Russia), ORCID: https://orcid.org/0000-0001-8424-0690