Decoupled knowledge distillation method based on meta-learning

With the advancement of deep learning techniques, the number of model parameters has been increasing, leading to significant memory consumption and limiting the deployment of such models in real-time applications. To reduce the number of model parameters and enhance the generalization capability of neural networks, we propose Decoupled MetaDistil, a decoupled meta-distillation method. It uses meta-learning to guide the teacher model and dynamically adjusts the knowledge transfer strategy based on feedback from the student model, thereby improving generalization ability. Furthermore, we introduce a decoupled loss to explicitly transfer positive-sample knowledge and explore the potential of negative-sample knowledge. Extensive experiments demonstrate the effectiveness of our method.
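
As a concrete illustration of the decoupling idea, the following is a minimal sketch of a decoupled distillation loss that weights target-class (positive-sample) and non-target-class (negative-sample) knowledge separately, assuming a DKD-style split; the function name, the weights alpha and beta, and the temperature are illustrative assumptions, not the paper's exact formulation, and the meta-learned teacher update described in the abstract is not shown.

import torch
import torch.nn.functional as F

def decoupled_kd_loss(student_logits, teacher_logits, labels,
                      alpha=1.0, beta=2.0, temperature=4.0):
    # Hypothetical decoupled KD loss: target-class and non-target-class
    # knowledge are distilled as two separately weighted KL terms.
    T = temperature
    num_classes = student_logits.size(1)
    target_mask = F.one_hot(labels, num_classes).bool()  # True at the ground-truth class

    # Target-class ("positive sample") term: match the teacher's binary split
    # between the true class and everything else.
    s_prob = F.softmax(student_logits / T, dim=1)
    t_prob = F.softmax(teacher_logits / T, dim=1)
    s_pos, t_pos = s_prob[target_mask], t_prob[target_mask]  # shape (B,)
    s_bin = torch.stack([s_pos, 1.0 - s_pos], dim=1)
    t_bin = torch.stack([t_pos, 1.0 - t_pos], dim=1)
    tckd = F.kl_div(torch.log(s_bin + 1e-8), t_bin, reduction="batchmean")

    # Non-target-class ("negative sample") term: match the distributions over
    # the wrong classes only, with the true-class logit removed.
    s_other = student_logits[~target_mask].view(-1, num_classes - 1)
    t_other = teacher_logits[~target_mask].view(-1, num_classes - 1)
    nckd = F.kl_div(F.log_softmax(s_other / T, dim=1),
                    F.softmax(t_other / T, dim=1),
                    reduction="batchmean")

    # T**2 keeps gradient magnitudes comparable across temperatures (standard KD scaling).
    return (alpha * tckd + beta * nckd) * (T ** 2)

In a training loop this term would typically be added to the student's ordinary cross-entropy loss, with alpha and beta controlling how strongly positive-sample and negative-sample knowledge are emphasized; the meta-learning component would additionally update the teacher from student feedback, which is beyond this sketch.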

Bibliographic Details
Main Authors: Wenqing Du, Liting Geng, Jianxiong Liu, Zhigang Zhao, Chunxiao Wang, Jidong Huo
Format: Article
Language: English
Published: Elsevier 2024-03-01
Series: High-Confidence Computing
Subjects: Model compression; Knowledge distillation; Meta-learning; Decoupled loss
Online Access: http://www.sciencedirect.com/science/article/pii/S2667295223000624
author Wenqing Du
Liting Geng
Jianxiong Liu
Zhigang Zhao
Chunxiao Wang
Jidong Huo
collection DOAJ
description With the advancement of deep learning techniques, the number of model parameters has been increasing, leading to significant memory consumption and limiting the deployment of such models in real-time applications. To reduce the number of model parameters and enhance the generalization capability of neural networks, we propose Decoupled MetaDistil, a decoupled meta-distillation method. It uses meta-learning to guide the teacher model and dynamically adjusts the knowledge transfer strategy based on feedback from the student model, thereby improving generalization ability. Furthermore, we introduce a decoupled loss to explicitly transfer positive-sample knowledge and explore the potential of negative-sample knowledge. Extensive experiments demonstrate the effectiveness of our method.
format Article
id doaj.art-96e55891cf7d44baac838d7f26e9bfad
institution Directory Open Access Journal
issn 2667-2952
language English
publishDate 2024-03-01
publisher Elsevier
series High-Confidence Computing
citation High-Confidence Computing, vol. 4, no. 1, article 100164 (2024-03-01), Elsevier, ISSN 2667-2952; DOAJ record doaj.art-96e55891cf7d44baac838d7f26e9bfad
affiliations Wenqing Du, Liting Geng, Zhigang Zhao, Chunxiao Wang, Jidong Huo: Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250014, China; Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan 250014, China. Jianxiong Liu: Aerospace Science & Industry Network Information Development Co., LTD, Beijing 100854, China. Corresponding authors: Zhigang Zhao and Chunxiao Wang.
title Decoupled knowledge distillation method based on meta-learning
topic Model compression
Knowledge distillation
Meta-learning
Decoupled loss
url http://www.sciencedirect.com/science/article/pii/S2667295223000624