Decoupled knowledge distillation method based on meta-learning
With the advancement of deep learning techniques, the number of model parameters has been increasing, leading to significant memory consumption and limiting the deployment of such models in real-time applications. To reduce the number of model parameters and enhance the generalization capability of neural networks, we propose a method called Decoupled MetaDistil...
Main Authors: Wenqing Du, Liting Geng, Jianxiong Liu, Zhigang Zhao, Chunxiao Wang, Jidong Huo
Format: Article
Language: English
Published: Elsevier, 2024-03-01
Series: High-Confidence Computing
Subjects: Model compression; Knowledge distillation; Meta-learning; Decoupled loss
Online Access: http://www.sciencedirect.com/science/article/pii/S2667295223000624
_version_ | 1797237905445355520 |
author | Wenqing Du; Liting Geng; Jianxiong Liu; Zhigang Zhao; Chunxiao Wang; Jidong Huo
author_facet | Wenqing Du; Liting Geng; Jianxiong Liu; Zhigang Zhao; Chunxiao Wang; Jidong Huo
author_sort | Wenqing Du |
collection | DOAJ |
description | With the advancement of deep learning techniques, the number of model parameters has been increasing, leading to significant memory consumption and limiting the deployment of such models in real-time applications. To reduce the number of model parameters and enhance the generalization capability of neural networks, we propose a method called Decoupled MetaDistil, which involves decoupled meta-distillation. This method uses meta-learning to guide the teacher model, dynamically adjusting the knowledge transfer strategy based on feedback from the student model and thereby improving the student's generalization ability. Furthermore, we introduce a decoupled loss to explicitly transfer positive-sample knowledge and explore the potential of negative-sample knowledge. Extensive experiments demonstrate the effectiveness of our method.
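The decoupled loss described above separates target-class ("positive sample") knowledge from non-target-class ("negative sample") knowledge. This record does not reproduce the paper's exact formulation, so the following is a minimal sketch of such a split, modeled on the widely used decoupled knowledge distillation (DKD) decomposition; the function name and the alpha, beta, and T hyperparameters are illustrative assumptions, not taken from the paper.

```python
# Illustrative decoupled KD loss: a target-class term plus a
# non-target-class term (a sketch, not the paper's exact formulation).
import torch
import torch.nn.functional as F

def decoupled_kd_loss(student_logits, teacher_logits, target,
                      alpha=1.0, beta=2.0, T=4.0):
    # target: (batch,) ground-truth class indices; logits: (batch, classes)
    gt_mask = F.one_hot(target, student_logits.size(1)).bool()

    def binary_probs(logits):
        # Collapse the class distribution to [p(target), p(not target)].
        p = F.softmax(logits / T, dim=1)
        p_t = p.masked_select(gt_mask).unsqueeze(1)
        return torch.cat([p_t, 1.0 - p_t], dim=1).clamp_min(1e-7)

    # "Positive sample" knowledge: binary KL divergence on the target class.
    tckd = F.kl_div(binary_probs(student_logits).log(),
                    binary_probs(teacher_logits), reduction="batchmean")

    # "Negative sample" knowledge: KL divergence over the remaining classes,
    # with the target logit masked out before the softmax.
    mask_out = lambda z: z.masked_fill(gt_mask, -1e9)
    nckd = F.kl_div(F.log_softmax(mask_out(student_logits) / T, dim=1),
                    F.softmax(mask_out(teacher_logits) / T, dim=1),
                    reduction="batchmean")

    # T*T keeps gradient magnitudes comparable across temperatures.
    return (alpha * tckd + beta * nckd) * T * T
```

In the method the abstract describes, a meta-learning step would additionally update the teacher using feedback from the student (for example, a meta-gradient through the student's update); that outer loop is omitted from this sketch.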
first_indexed | 2024-03-09T14:25:03Z |
format | Article |
id | doaj.art-96e55891cf7d44baac838d7f26e9bfad |
institution | Directory Open Access Journal |
issn | 2667-2952 |
language | English |
last_indexed | 2024-04-24T17:27:10Z |
publishDate | 2024-03-01 |
publisher | Elsevier |
record_format | Article |
series | High-Confidence Computing |
spelling | doaj.art-96e55891cf7d44baac838d7f26e9bfad (2024-03-28T06:39:20Z); eng; Elsevier; High-Confidence Computing; 2667-2952; 2024-03-01; Vol. 4, Iss. 1, Art. 100164. Decoupled knowledge distillation method based on meta-learning. Authors: Wenqing Du, Liting Geng, Jianxiong Liu, Zhigang Zhao, Chunxiao Wang, Jidong Huo. Affiliations: Du, Geng, Zhao, Wang, and Huo: Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250014, China, and Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan 250014, China; Liu: Aerospace Science & Industry Network Information Development Co., LTD, Beijing 100854, China. Zhao and Wang are corresponding authors. Abstract: see the description field above. URL: http://www.sciencedirect.com/science/article/pii/S2667295223000624. Subjects: Model compression; Knowledge distillation; Meta-learning; Decoupled loss.
spellingShingle | Wenqing Du; Liting Geng; Jianxiong Liu; Zhigang Zhao; Chunxiao Wang; Jidong Huo; Decoupled knowledge distillation method based on meta-learning; High-Confidence Computing; Model compression; Knowledge distillation; Meta-learning; Decoupled loss
title | Decoupled knowledge distillation method based on meta-learning |
title_full | Decoupled knowledge distillation method based on meta-learning |
title_fullStr | Decoupled knowledge distillation method based on meta-learning |
title_full_unstemmed | Decoupled knowledge distillation method based on meta-learning |
title_short | Decoupled knowledge distillation method based on meta-learning |
title_sort | decoupled knowledge distillation method based on meta learning |
topic | Model compression; Knowledge distillation; Meta-learning; Decoupled loss
url | http://www.sciencedirect.com/science/article/pii/S2667295223000624 |
work_keys_str_mv | AT wenqingdu decoupledknowledgedistillationmethodbasedonmetalearning AT litinggeng decoupledknowledgedistillationmethodbasedonmetalearning AT jianxiongliu decoupledknowledgedistillationmethodbasedonmetalearning AT zhigangzhao decoupledknowledgedistillationmethodbasedonmetalearning AT chunxiaowang decoupledknowledgedistillationmethodbasedonmetalearning AT jidonghuo decoupledknowledgedistillationmethodbasedonmetalearning |