Head Network Distillation: Splitting Distilled Deep Neural Networks for Resource-Constrained Edge Computing Systems
As the complexity of Deep Neural Network (DNN) models increases, their deployment on mobile devices becomes increasingly challenging, especially in complex vision tasks such as image classification. Many recent contributions aim either to produce compact models matching the limited computing capa...
Main Authors: | Yoshitomo Matsubara, Davide Callegaro, Sabur Baidya, Marco Levorato, Sameer Singh |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2020-01-01 |
Series: | IEEE Access |
Subjects: | Deep neural networks; edge computing; head network distillation; knowledge distillation |
Online Access: | https://ieeexplore.ieee.org/document/9265295/ |
_version_ | 1818444622870872064 |
---|---|
author | Yoshitomo Matsubara; Davide Callegaro; Sabur Baidya; Marco Levorato; Sameer Singh |
author_facet | Yoshitomo Matsubara; Davide Callegaro; Sabur Baidya; Marco Levorato; Sameer Singh |
author_sort | Yoshitomo Matsubara |
collection | DOAJ |
description | As the complexity of Deep Neural Network (DNN) models increases, their deployment on mobile devices becomes increasingly challenging, especially in complex vision tasks such as image classification. Many recent contributions aim either to produce compact models matching the limited computing capabilities of mobile devices, or to offload the execution of such burdensome models to a compute-capable device at the network edge (an edge server). In this paper, we propose to modify the structure and training process of DNN models for complex image classification tasks to achieve in-network compression in the early network layers. Our training process stems from knowledge distillation, a technique that has traditionally been used to build small (student) models mimicking the output of larger (teacher) models. Here, we adopt this idea to obtain aggressive compression while preserving accuracy. Our results demonstrate that our approach is effective for state-of-the-art models trained over complex datasets, and can extend the parameter region in which edge computing is a viable and advantageous option. Additionally, we demonstrate that in many settings of practical interest we reduce the inference time with respect to specialized models such as MobileNet v2 executed at the mobile device, while improving accuracy. |
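The following PyTorch sketch illustrates the core idea summarized in the abstract: a compact student head with a narrow in-network bottleneck is trained to reproduce the intermediate output of a frozen teacher head, so that the teacher's unmodified tail can later consume the student's output at the edge server. The split point, the `StudentHead` architecture, the `train_student_head` helper, and the data loader are illustrative assumptions for this sketch, not the authors' published implementation.

```python
# Minimal sketch of head network distillation, assuming a teacher model split into
# a "head" (early layers, to be replaced by a compact student) and a "tail"
# (remaining layers, reused unchanged at the edge server).
import torch
import torch.nn as nn


class StudentHead(nn.Module):
    """Compact head with a narrow bottleneck that compresses the intermediate
    representation sent from the mobile device to the edge server (assumed design).
    Its output shape must match the teacher head's output at the chosen split point."""

    def __init__(self, out_channels: int):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 8, kernel_size=3, stride=2, padding=1),  # bottleneck: few channels
            nn.Conv2d(8, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.layers(x)


def train_student_head(teacher_head: nn.Module, student_head: nn.Module,
                       loader, epochs: int = 10, lr: float = 1e-3) -> nn.Module:
    """Train only the student head to mimic the frozen teacher head's intermediate
    output; labels are not needed because the loss is a feature-matching MSE."""
    teacher_head.eval()
    for p in teacher_head.parameters():
        p.requires_grad_(False)

    optimizer = torch.optim.Adam(student_head.parameters(), lr=lr)
    mse = nn.MSELoss()

    for _ in range(epochs):
        for images, _ in loader:
            with torch.no_grad():
                target = teacher_head(images)          # representation to imitate
            loss = mse(student_head(images), target)   # feature-matching distillation loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return student_head
```

In such a deployment, the mobile device would execute only the distilled head and transmit its compact output tensor over the wireless link, while the edge server runs the remaining teacher layers; the width of the bottleneck layer then controls how much data must be transferred per inference.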
first_indexed | 2024-12-14T19:18:52Z |
format | Article |
id | doaj.art-d9aa2f2c37dd4e95ba670ec7b19d0034 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-14T19:18:52Z |
publishDate | 2020-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-d9aa2f2c37dd4e95ba670ec7b19d0034 (2022-12-21T22:50:26Z). English. IEEE, IEEE Access, ISSN 2169-3536, 2020-01-01, Vol. 8, pp. 212177-212193, DOI 10.1109/ACCESS.2020.3039714, article 9265295. Head Network Distillation: Splitting Distilled Deep Neural Networks for Resource-Constrained Edge Computing Systems. Yoshitomo Matsubara (https://orcid.org/0000-0002-5620-0760), Davide Callegaro (https://orcid.org/0000-0003-4237-7845), Sabur Baidya, Marco Levorato (https://orcid.org/0000-0002-6920-4189), Sameer Singh (https://orcid.org/0000-0003-0621-6323). Affiliations: Department of Computer Science, University of California, Irvine, CA, USA (Matsubara, Callegaro, Levorato, Singh); Electrical and Computer Engineering Department, University of California, San Diego, CA, USA (Baidya). Keywords: deep neural networks; edge computing; head network distillation; knowledge distillation. https://ieeexplore.ieee.org/document/9265295/ |
spellingShingle | Yoshitomo Matsubara; Davide Callegaro; Sabur Baidya; Marco Levorato; Sameer Singh. Head Network Distillation: Splitting Distilled Deep Neural Networks for Resource-Constrained Edge Computing Systems. IEEE Access. Deep neural networks; edge computing; head network distillation; knowledge distillation |
title | Head Network Distillation: Splitting Distilled Deep Neural Networks for Resource-Constrained Edge Computing Systems |
title_full | Head Network Distillation: Splitting Distilled Deep Neural Networks for Resource-Constrained Edge Computing Systems |
title_fullStr | Head Network Distillation: Splitting Distilled Deep Neural Networks for Resource-Constrained Edge Computing Systems |
title_full_unstemmed | Head Network Distillation: Splitting Distilled Deep Neural Networks for Resource-Constrained Edge Computing Systems |
title_short | Head Network Distillation: Splitting Distilled Deep Neural Networks for Resource-Constrained Edge Computing Systems |
title_sort | head network distillation splitting distilled deep neural networks for resource constrained edge computing systems |
topic | Deep neural networks; edge computing; head network distillation; knowledge distillation |
url | https://ieeexplore.ieee.org/document/9265295/ |
work_keys_str_mv | AT yoshitomomatsubara headnetworkdistillationsplittingdistilleddeepneuralnetworksforresourceconstrainededgecomputingsystems AT davidecallegaro headnetworkdistillationsplittingdistilleddeepneuralnetworksforresourceconstrainededgecomputingsystems AT saburbaidya headnetworkdistillationsplittingdistilleddeepneuralnetworksforresourceconstrainededgecomputingsystems AT marcolevorato headnetworkdistillationsplittingdistilleddeepneuralnetworksforresourceconstrainededgecomputingsystems AT sameersingh headnetworkdistillationsplittingdistilleddeepneuralnetworksforresourceconstrainededgecomputingsystems |