A Lightweight, Arbitrary-oriented SAR Ship Detector via Feature Map-based Knowledge Distillation

Bibliographic Details
Main Authors: Shiqi CHEN, Wei WANG, Ronghui ZHAN, Jun ZHANG, Shengqi LIU
Format: Article
Language: English
Published: China Science Publishing & Media Ltd. (CSPM) 2023-02-01
Series: Leida xuebao
Online Access: https://radars.ac.cn/cn/article/doi/10.12000/JR21209
Description
Summary: In Synthetic Aperture Radar (SAR) ship detection, targets have large aspect ratios, are densely distributed, and are oriented in arbitrary directions. Detection methods based on oriented bounding boxes can produce accurate results, but they are constrained by high computational complexity, slow inference, and large storage requirements, which complicate their deployment on spaceborne platforms. To address these issues, a lightweight, oriented, anchor-free detection method is proposed that combines knowledge distillation on the feature maps with knowledge distillation on the prediction head. First, we propose an improved Gaussian kernel based on aspect ratio and angle information so that the generated heatmaps better describe the shape of the targets. Second, a foreground region enhancement branch is introduced so that the network focuses on foreground features while suppressing background interference. When training the lightweight student network, the similarity between pixels is treated as the transferred knowledge in heatmap distillation. To address the imbalance between positive and negative samples in feature distillation, the foreground attention region is applied as a mask to guide the feature distillation process. In addition, a global semantic module is proposed to model the contextual information around pixels, and background knowledge is incorporated to further strengthen the feature representation. Experimental results on the HRSID dataset show that our method achieves 80.71% mAP with only 9.07 M model parameters, and that its detection frame rate meets the needs of real-time applications.
ISSN: 2095-283X
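
The summary mentions a Gaussian kernel that depends on aspect ratio and angle so that heatmaps follow the shape of elongated, rotated ships. This record does not give the paper's actual kernel, so the following is only a minimal illustrative sketch of one common way to build such a heatmap: an anisotropic Gaussian rotated into the box's frame. The function name `oriented_gaussian_heatmap`, the `sigma_scale` parameter, and the choice of standard deviations proportional to the box side lengths are all assumptions made for this example, not details taken from the article.

```python
import math


def oriented_gaussian_heatmap(height, width, cx, cy, box_w, box_h,
                              angle_rad, sigma_scale=6.0):
    """Build a height x width heatmap (list of rows) peaked at (cx, cy).

    Hypothetical helper: the Gaussian is stretched along the box's long
    side and rotated by angle_rad, so its contours are ellipses aligned
    with the oriented box rather than circles.
    """
    # Assumption: each standard deviation is proportional to a box side.
    sigma_u = max(box_w / sigma_scale, 1e-6)
    sigma_v = max(box_h / sigma_scale, 1e-6)
    cos_t, sin_t = math.cos(angle_rad), math.sin(angle_rad)

    heatmap = []
    for y in range(height):
        row = []
        for x in range(width):
            dx, dy = x - cx, y - cy
            # Rotate the pixel offset into the box-aligned frame (u, v).
            u = cos_t * dx + sin_t * dy
            v = -sin_t * dx + cos_t * dy
            row.append(math.exp(-0.5 * ((u / sigma_u) ** 2
                                        + (v / sigma_v) ** 2)))
        heatmap.append(row)
    return heatmap


# Usage: a 12 x 4 box, first axis-aligned, then rotated by 90 degrees.
flat = oriented_gaussian_heatmap(21, 21, 10, 10, 12, 4, 0.0)
turned = oriented_gaussian_heatmap(21, 21, 10, 10, 12, 4, math.pi / 2)
```

With a zero angle the response decays more slowly along the image x axis (the box's long side) than along y; rotating the box by 90 degrees swaps the two. An isotropic kernel would give identical values in both directions, which is the limitation the aspect-ratio and angle terms are meant to remove.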