Head Network Distillation: Splitting Distilled Deep Neural Networks for Resource-Constrained Edge Computing Systems

As the complexity of Deep Neural Network (DNN) models increases, their deployment on mobile devices becomes increasingly challenging, especially in complex vision tasks such as image classification. Many recent contributions aim either to produce compact models that match the limited computing capabilities of mobile devices, or to offload the execution of such burdensome models to a compute-capable device at the network edge, the edge server. In this paper, we propose to modify the structure and training process of DNN models for complex image classification tasks to achieve in-network compression in the early network layers. Our training process stems from knowledge distillation, a technique traditionally used to build small (student) models that mimic the output of larger (teacher) models. Here, we adopt this idea to obtain aggressive compression while preserving accuracy. Our results demonstrate that the approach is effective for state-of-the-art models trained on complex datasets, and that it extends the parameter region in which edge computing is a viable and advantageous option. Additionally, we demonstrate that in many settings of practical interest our approach reduces inference time with respect to specialized models such as MobileNet v2 executed at the mobile device, while improving accuracy.
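
To make the idea concrete, below is a minimal PyTorch sketch of head network distillation as described in the abstract: a small student head with a narrow bottleneck is trained to mimic the intermediate output of the original model's early layers, so that only a compressed tensor needs to travel from the mobile device to the edge server, where the unmodified tail completes inference. The layer shapes, bottleneck width, and training loop are illustrative assumptions, not the authors' implementation.

# Minimal sketch of head network distillation (illustrative, not the paper's code).
import torch
import torch.nn as nn

# Teacher: a pretrained model conceptually split into a head (early layers) and a tail.
teacher_head = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
)
teacher_tail = nn.Sequential(  # reused unchanged on the edge server at inference time
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, 1000),
)

# Student head: cheaper layers plus a narrow bottleneck, so the tensor sent from the
# mobile device to the edge server is small (in-network compression in early layers).
student_head = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 12, kernel_size=1),             # bottleneck: the compressed representation
    nn.Conv2d(12, 128, kernel_size=1), nn.ReLU(),
    nn.Upsample(scale_factor=2, mode="nearest"),  # restore the teacher head's output shape
)

# Head network distillation: train only the student head to reproduce the teacher head's
# intermediate feature map, rather than distilling the whole network on output logits.
optimizer = torch.optim.SGD(student_head.parameters(), lr=0.1)
loss_fn = nn.MSELoss()
teacher_head.eval()

for _ in range(10):                        # toy loop; real training iterates over a dataset
    x = torch.randn(8, 3, 32, 32)          # stand-in for a batch of images
    with torch.no_grad():
        target = teacher_head(x)           # teacher's intermediate representation
    loss = loss_fn(student_head(x), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Inference sketch: the student head runs on the mobile device, its small output is
# transmitted, and the unmodified teacher tail finishes the prediction on the edge server.
with torch.no_grad():
    features = student_head(torch.randn(1, 3, 32, 32))  # on-device computation
    logits = teacher_tail(features)                      # edge-server computation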


Bibliographic Details
Main Authors: Yoshitomo Matsubara, Davide Callegaro, Sabur Baidya, Marco Levorato, Sameer Singh
Author Affiliations: Yoshitomo Matsubara, Davide Callegaro, Marco Levorato, Sameer Singh (Department of Computer Science, University of California, Irvine, CA, USA); Sabur Baidya (Electrical and Computer Engineering Department, University of California, San Diego, CA, USA)
Format: Article
Language: English
Published: IEEE, 2020-01-01
Series: IEEE Access, vol. 8, pp. 212177-212193
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.3039714
Subjects: Deep neural networks; edge computing; head network distillation; knowledge distillation
Online Access: https://ieeexplore.ieee.org/document/9265295/