Bridging the Knowledge Gap via Transformer-Based Multi-Layer Correlation Learning

Bibliographic Details
Main Authors: Hun-Beom Bak, Seung-Hwan Bae
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10497103/
Description
Summary: We tackle a multi-layer knowledge distillation problem between deep models with heterogeneous architectures. The main challenges are mismatches between the feature maps of the two networks in resolution and semantic level. To resolve this, we propose a novel transformer-based multi-layer correlation knowledge distillation (TMC-KD) method that bridges the knowledge gap between a pair of networks. Our method narrows the relational knowledge gap between teacher and student models by minimizing local and global feature correlations. Through extensive comparisons with recent KD methods on classification and detection tasks, we demonstrate the effectiveness and usefulness of our TMC-KD method.
ISSN: 2169-3536
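
The record describes TMC-KD only at a high level. To make the core idea concrete, below is a minimal, hypothetical PyTorch sketch of correlation-based feature distillation: matching teacher and student correlation structure with a "global" term over the whole feature map and a "local" term over spatial halves. It is not the authors' implementation; the function names (correlation_matrix, correlation_kd_loss) are invented, the paper's transformer-based correlation learning and layer matching are omitted, and the sketch assumes the student features have already been projected to the teacher's shape.

# Hypothetical sketch of correlation-based feature distillation.
# NOT the paper's TMC-KD: the transformer-based correlation learning
# and multi-layer matching are omitted.
import torch
import torch.nn.functional as F

def correlation_matrix(feat: torch.Tensor) -> torch.Tensor:
    """Pairwise channel correlations of a (B, C, H, W) feature map."""
    b, c, h, w = feat.shape
    x = feat.reshape(b, c, h * w)
    x = F.normalize(x, dim=2)               # unit-norm each channel vector
    return torch.bmm(x, x.transpose(1, 2))  # (B, C, C) correlation matrix

def correlation_kd_loss(f_student: torch.Tensor,
                        f_teacher: torch.Tensor) -> torch.Tensor:
    """Global + local correlation gaps between student and teacher.

    Assumes f_student has been projected to f_teacher's shape; the
    spatial halves are a stand-in for whatever local regions the
    paper actually uses.
    """
    # Global term: correlations over the whole feature map.
    loss = F.mse_loss(correlation_matrix(f_student),
                      correlation_matrix(f_teacher))
    # Local term: correlations within each spatial half.
    for fs, ft in zip(f_student.chunk(2, dim=2), f_teacher.chunk(2, dim=2)):
        loss = loss + F.mse_loss(correlation_matrix(fs),
                                 correlation_matrix(ft))
    return loss

# Usage with dummy features (student assumed pre-projected to teacher's shape):
fs = torch.randn(4, 256, 14, 14)
ft = torch.randn(4, 256, 14, 14)
print(correlation_kd_loss(fs, ft))

Matching correlation matrices rather than raw activations is what makes this kind of loss usable across heterogeneous architectures: the (B, C, C) correlation target depends on relations among features, not on the exact spatial layout of either backbone.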