A non-negative feedback self-distillation method for salient object detection

Bibliographic Details
Main Authors: Lei Chen, Tieyong Cao, Yunfei Zheng, Jibin Yang, Yang Wang, Yekui Wang, Bo Zhang
Format: Article
Language: English
Published: PeerJ Inc. 2023-06-01
Series: PeerJ Computer Science
Online Access: https://peerj.com/articles/cs-1435.pdf
Description
Summary: Self-distillation methods use a Kullback-Leibler divergence (KL) loss to transfer knowledge from the network itself, which can improve model performance without increasing computational resources or complexity. However, when applied to salient object detection (SOD), it is difficult to transfer knowledge effectively with KL. To improve SOD model performance without increasing computational resources, a non-negative feedback self-distillation method is proposed. First, a virtual-teacher self-distillation method is proposed to enhance model generalization; it achieves good results in pixel-wise classification tasks but yields less improvement in SOD. Second, to understand the behavior of the self-distillation loss, the gradient directions of the KL and cross-entropy (CE) losses are analyzed, and it is found that in SOD the KL loss can produce gradients pointing in the opposite direction to those of CE. Finally, a non-negative feedback loss is proposed for SOD, which computes the distillation losses of the foreground and background in different ways so that the teacher network transfers only positive knowledge to the student. Experiments on five datasets show that the proposed self-distillation methods effectively improve the performance of SOD models, increasing the average Fβ by about 2.7% over the baseline network.
ISSN:2376-5992
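
To make the idea in the summary more concrete, the sketch below illustrates one way a "non-negative feedback" distillation term could be written for a binary saliency map. It is a minimal reading of the abstract, not the authors' published formulation: the function name, tensor shapes, and the clamping scheme are assumptions introduced here for illustration. The ground-truth mask splits pixels into foreground and background, each region gets its own distillation term, and any term that would push the student away from the ground truth is clamped to zero, so the teacher only ever contributes "positive" knowledge.

```python
import torch

def non_negative_feedback_distill(student_logit, teacher_logit, gt_mask, eps=1e-6):
    """Per-pixel distillation loss that only pulls the student toward the
    ground-truth side of the teacher's prediction (hypothetical sketch).

    student_logit, teacher_logit: (B, 1, H, W) raw saliency logits.
    gt_mask: (B, 1, H, W) binary ground-truth saliency mask.
    """
    s = torch.sigmoid(student_logit)
    t = torch.sigmoid(teacher_logit).detach()  # teacher supplies targets only

    # Foreground pixels: weight a CE-style term by how much the teacher is
    # more confident than the student; clamp at zero so the student is never
    # penalized for already exceeding the teacher (no negative feedback).
    fg_weight = torch.clamp(t - s.detach(), min=0.0)
    fg_term = fg_weight * (-torch.log(s + eps))

    # Background pixels: symmetric term on the non-salient probability.
    bg_weight = torch.clamp(s.detach() - t, min=0.0)
    bg_term = bg_weight * (-torch.log(1.0 - s + eps))

    return (gt_mask * fg_term + (1.0 - gt_mask) * bg_term).mean()

# Hypothetical usage with random tensors standing in for network outputs.
student = torch.randn(2, 1, 64, 64, requires_grad=True)
teacher = torch.randn(2, 1, 64, 64)
gt = (torch.rand(2, 1, 64, 64) > 0.5).float()
print(non_negative_feedback_distill(student, teacher, gt).item())
```

Detaching the weighting factors means the gradient flows only through the -log terms, so each surviving pixel simply pushes the student toward the ground-truth class, which is one way to avoid the opposing KL/CE gradient directions described in the abstract.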