Summary: | Detecting fine-grained objects in remote sensing images is a recognized challenge that existing deep learning-based methods do not address well, owing to their poor adaptability to multi-scale objects, slow convergence, and the scarcity of suitable datasets. To address these issues, we propose a deep convolutional neural network (CNN)-based detection method for multi-scale fine-grained objects in complex remote sensing scenarios. Specifically, we adopt a deep CNN with a residual structure as the backbone network to extract deep-level details from the image. In addition, we introduce a multi-scale region generation network to overcome the limitations of convolution kernels with fixed receptive fields and to enable multi-scale object detection. Lastly, we replace the fully connected layers in the fully convolutional region classification network with $1\times 1$ convolutional layers to improve detection efficiency and speed. To overcome the limitation of scarce datasets, we conduct experiments on the FAIR1M dataset, which is currently the largest fine-grained object detection dataset in the remote sensing field. Experimental results show that the proposed detection method achieves the highest average precision (35.86%) among all benchmark methods and outperforms the classic Faster R-CNN-based method by 3.44%. Furthermore, our method detects objects significantly faster than the Faster R-CNN-based methods.
|
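The replacement of fully connected layers with $1\times 1$ convolutions mentioned in the summary can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the helper names (`linear`, `conv1x1`), the toy feature map, and the weights are all hypothetical. It only shows the general fact that a $1\times 1$ convolution applies the same linear map at every spatial position, so one set of weights can score a whole feature map without flattening it.

```python
# Illustrative sketch only: not the paper's implementation.
# A 1x1 convolution is a fully connected (linear) layer applied
# independently, with shared weights, at every spatial position.

def linear(vec, weights, bias):
    """Fully connected layer: out[o] = bias[o] + sum_c weights[o][c] * vec[c]."""
    return [b + sum(w * x for w, x in zip(row, vec))
            for row, b in zip(weights, bias)]

def conv1x1(fmap, weights, bias):
    """1x1 convolution over a feature map stored as fmap[c][y][x].

    Returns out[o][y][x]; the same weights are reused at every position.
    """
    channels = len(fmap)
    height, width = len(fmap[0]), len(fmap[0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in weights]
    for y in range(height):
        for x in range(width):
            # Gather the channel vector at this position and apply the linear map.
            pixel = [fmap[c][y][x] for c in range(channels)]
            for o, value in enumerate(linear(pixel, weights, bias)):
                out[o][y][x] = value
    return out

# Hypothetical toy sizes: 3 input channels, 2 output scores, 2x2 spatial map.
fmap = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.5, 0.0], [1.0, 2.0]],
    [[2.0, 1.0], [0.0, 1.0]],
]
weights = [[1.0, 0.0, -1.0], [0.5, 2.0, 0.0]]
bias = [0.0, 1.0]

scores = conv1x1(fmap, weights, bias)
```

Because the weights do not depend on the spatial size of the input, the same layer accepts feature maps of any height and width, which is one common reason such a substitution is associated with more efficient region classification.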