Decoupled knowledge distillation method based on meta-learning
With the advancement of deep learning techniques, the number of model parameters has been increasing, leading to significant memory consumption and limiting the deployment of such models in real-time applications. To reduce the number of model parameters and enhance the generalization capability of neural networks, we propose a method called Decoupled MetaDistil...
Main Authors: Wenqing Du, Liting Geng, Jianxiong Liu, Zhigang Zhao, Chunxiao Wang, Jidong Huo
Format: Article
Language: English
Published: Elsevier, 2024-03-01
Series: High-Confidence Computing
Subjects: Model compression; Knowledge distillation; Meta-learning; Decoupled loss
Online Access: http://www.sciencedirect.com/science/article/pii/S2667295223000624
_version_ | 1797237905445355520 |
author | Wenqing Du; Liting Geng; Jianxiong Liu; Zhigang Zhao; Chunxiao Wang; Jidong Huo
author_facet | Wenqing Du; Liting Geng; Jianxiong Liu; Zhigang Zhao; Chunxiao Wang; Jidong Huo
author_sort | Wenqing Du |
collection | DOAJ |
description | With the advancement of deep learning techniques, the number of model parameters has been increasing, leading to significant memory consumption and limiting the deployment of such models in real-time applications. To reduce the number of model parameters and enhance the generalization capability of neural networks, we propose a method called Decoupled MetaDistil, which involves decoupled meta-distillation. This method uses meta-learning to guide the teacher model, dynamically adjusting the knowledge transfer strategy based on feedback from the student model and thereby improving the student's generalization ability. Furthermore, we introduce a decoupled loss to explicitly transfer positive-sample knowledge and explore the potential of negative-sample knowledge. Extensive experiments demonstrate the effectiveness of our method.
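The decoupled loss described above separates target-class ("positive sample") knowledge from non-target-class ("negative sample") knowledge. This record does not reproduce the paper's exact formulation, so the following is a minimal sketch of such a split, modeled on the widely used decoupled knowledge distillation (DKD) decomposition; the function name and the alpha, beta, and T hyperparameters are illustrative assumptions, not taken from the paper.

```python
# Illustrative decoupled KD loss: a target-class term plus a
# non-target-class term (a sketch, not the paper's exact formulation).
import torch
import torch.nn.functional as F

def decoupled_kd_loss(student_logits, teacher_logits, target,
                      alpha=1.0, beta=2.0, T=4.0):
    # target: (batch,) ground-truth class indices; logits: (batch, classes)
    gt_mask = F.one_hot(target, student_logits.size(1)).bool()

    def binary_probs(logits):
        # Collapse the class distribution to [p(target), p(not target)].
        p = F.softmax(logits / T, dim=1)
        p_t = p.masked_select(gt_mask).unsqueeze(1)
        return torch.cat([p_t, 1.0 - p_t], dim=1).clamp_min(1e-7)

    # "Positive sample" knowledge: binary KL divergence on the target class.
    tckd = F.kl_div(binary_probs(student_logits).log(),
                    binary_probs(teacher_logits), reduction="batchmean")

    # "Negative sample" knowledge: KL divergence over the remaining classes,
    # with the target logit masked out before the softmax.
    mask_out = lambda z: z.masked_fill(gt_mask, -1e9)
    nckd = F.kl_div(F.log_softmax(mask_out(student_logits) / T, dim=1),
                    F.softmax(mask_out(teacher_logits) / T, dim=1),
                    reduction="batchmean")

    # T*T keeps gradient magnitudes comparable across temperatures.
    return (alpha * tckd + beta * nckd) * T * T
```

In the method the abstract describes, a meta-learning step would additionally update the teacher using feedback from the student (for example, a meta-gradient through the student's update); that outer loop is omitted from this sketch.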
first_indexed | 2024-03-09T14:25:03Z |
format | Article |
id | doaj.art-96e55891cf7d44baac838d7f26e9bfad |
institution | Directory Open Access Journal |
issn | 2667-2952 |
language | English |
last_indexed | 2024-04-24T17:27:10Z |
publishDate | 2024-03-01 |
publisher | Elsevier |
record_format | Article |
series | High-Confidence Computing |
spelling | doaj.art-96e55891cf7d44baac838d7f26e9bfad (2024-03-28T06:39:20Z); eng; Elsevier; High-Confidence Computing; 2667-2952; 2024-03-01; Vol. 4, Iss. 1, Art. 100164. Decoupled knowledge distillation method based on meta-learning. Authors: Wenqing Du, Liting Geng, Jianxiong Liu, Zhigang Zhao, Chunxiao Wang, Jidong Huo. Affiliations: Du, Geng, Zhao, Wang, and Huo: Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250014, China, and Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan 250014, China; Liu: Aerospace Science & Industry Network Information Development Co., LTD, Beijing 100854, China. Zhao and Wang are corresponding authors. Abstract: see the description field above. URL: http://www.sciencedirect.com/science/article/pii/S2667295223000624. Subjects: Model compression; Knowledge distillation; Meta-learning; Decoupled loss.
spellingShingle | Wenqing Du; Liting Geng; Jianxiong Liu; Zhigang Zhao; Chunxiao Wang; Jidong Huo; Decoupled knowledge distillation method based on meta-learning; High-Confidence Computing; Model compression; Knowledge distillation; Meta-learning; Decoupled loss
title | Decoupled knowledge distillation method based on meta-learning |
title_full | Decoupled knowledge distillation method based on meta-learning |
title_fullStr | Decoupled knowledge distillation method based on meta-learning |
title_full_unstemmed | Decoupled knowledge distillation method based on meta-learning |
title_short | Decoupled knowledge distillation method based on meta-learning |
title_sort | decoupled knowledge distillation method based on meta learning |
topic | Model compression; Knowledge distillation; Meta-learning; Decoupled loss
url | http://www.sciencedirect.com/science/article/pii/S2667295223000624 |
work_keys_str_mv | AT wenqingdu decoupledknowledgedistillationmethodbasedonmetalearning AT litinggeng decoupledknowledgedistillationmethodbasedonmetalearning AT jianxiongliu decoupledknowledgedistillationmethodbasedonmetalearning AT zhigangzhao decoupledknowledgedistillationmethodbasedonmetalearning AT chunxiaowang decoupledknowledgedistillationmethodbasedonmetalearning AT jidonghuo decoupledknowledgedistillationmethodbasedonmetalearning |