INSANet: INtra-INter Spectral Attention Network for Effective Feature Fusion of Multispectral Pedestrian Detection
Pedestrian detection is a critical task for safety-critical systems, but detecting pedestrians is challenging in low-light and adverse weather conditions. Thermal images can be used to improve robustness by providing complementary information to RGB images. Previous studies have shown that multi-mod...
Main Authors: | Sangin Lee, Taejoo Kim, Jeongmin Shin, Namil Kim, Yukyung Choi |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2024-02-01 |
Series: | Sensors |
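Subjects: | autonomous vehicle, computer vision, data augmentation, feature fusion, multispectral, pedestrian detection |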
Online Access: | https://www.mdpi.com/1424-8220/24/4/1168 |
_version_ | 1797297019798159360 |
---|---|
author | Sangin Lee, Taejoo Kim, Jeongmin Shin, Namil Kim, Yukyung Choi |
author_sort | Sangin Lee |
collection | DOAJ |
description | Pedestrian detection is a critical task for safety-critical systems, but detecting pedestrians is challenging in low-light and adverse weather conditions. Thermal images can improve robustness by providing information complementary to RGB images. Previous studies have shown that multi-modal feature fusion using convolution operations can be effective, but such methods rely solely on local feature correlations, which can degrade performance. To address this issue, we propose a novel attention-based fusion network, referred to as INSANet (INtra-INter Spectral Attention Network), that captures global intra- and inter-spectral information. It consists of intra- and inter-spectral attention blocks that allow the model to learn mutual spectral relationships. Additionally, we identified an imbalance in the multispectral dataset caused by several factors and designed an augmentation strategy that mitigates concentrated distributions and enables the model to learn the diverse locations of pedestrians. Extensive experiments demonstrate the effectiveness of the proposed methods, which achieve state-of-the-art performance on the KAIST and LLVIP datasets. Finally, we conduct a regional performance evaluation to demonstrate the effectiveness of the proposed network in various regions. |
first_indexed | 2024-03-07T22:15:10Z |
format | Article |
id | doaj.art-a007b7c6c73048daaf5822544209444c |
institution | Directory of Open Access Journals |
issn | 1424-8220 |
language | English |
last_indexed | 2024-03-07T22:15:10Z |
publishDate | 2024-02-01 |
publisher | MDPI AG |
record_format | Article |
series | Sensors |
spelling | doaj.art-a007b7c6c73048daaf5822544209444c; 2024-02-23T15:33:45Z; eng; MDPI AG; Sensors; ISSN 1424-8220; 2024-02-01; vol. 24, no. 4, article 1168; DOI 10.3390/s24041168; INSANet: INtra-INter Spectral Attention Network for Effective Feature Fusion of Multispectral Pedestrian Detection; Sangin Lee (Department of Software, Sejong University, Seoul 05006, Republic of Korea); Taejoo Kim (Department of Convergence Engineering for Intelligent Drone, Sejong University, Seoul 05006, Republic of Korea); Jeongmin Shin (Department of Convergence Engineering for Intelligent Drone, Sejong University, Seoul 05006, Republic of Korea); Namil Kim (NAVER LABS, Seongnam 13561, Republic of Korea); Yukyung Choi (Department of Convergence Engineering for Intelligent Drone, Sejong University, Seoul 05006, Republic of Korea); https://www.mdpi.com/1424-8220/24/4/1168; autonomous vehicle, computer vision, data augmentation, feature fusion, multispectral, pedestrian detection |
title | INSANet: INtra-INter Spectral Attention Network for Effective Feature Fusion of Multispectral Pedestrian Detection |
title_sort | insanet intra inter spectral attention network for effective feature fusion of multispectral pedestrian detection |
topic | autonomous vehicle, computer vision, data augmentation, feature fusion, multispectral, pedestrian detection |
url | https://www.mdpi.com/1424-8220/24/4/1168 |