INSANet: INtra-INter Spectral Attention Network for Effective Feature Fusion of Multispectral Pedestrian Detection

Bibliographic Details
Main Authors: Sangin Lee, Taejoo Kim, Jeongmin Shin, Namil Kim, Yukyung Choi
Format: Article
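Language: English
Published: MDPI AG 2024-02-01
Series: Sensors
Subjects: autonomous vehicle; computer vision; data augmentation; feature fusion; multispectral; pedestrian detection
Online Access: https://www.mdpi.com/1424-8220/24/4/1168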
author Sangin Lee
Taejoo Kim
Jeongmin Shin
Namil Kim
Yukyung Choi
collection DOAJ
description Pedestrian detection is a critical task for safety-critical systems, but detecting pedestrians is challenging in low-light and adverse weather conditions. Thermal images can be used to improve robustness by providing complementary information to RGB images. Previous studies have shown that multi-modal feature fusion using convolution operation can be effective, but such methods rely solely on local feature correlations, which can degrade the performance capabilities. To address this issue, we propose an attention-based novel fusion network, referred to as INSANet (INtra-INter Spectral Attention Network), that captures global intra- and inter-information. It consists of intra- and inter-spectral attention blocks that allow the model to learn mutual spectral relationships. Additionally, we identified an imbalance in the multispectral dataset caused by several factors and designed an augmentation strategy that mitigates concentrated distributions and enables the model to learn the diverse locations of pedestrians. Extensive experiments demonstrate the effectiveness of the proposed methods, which achieve state-of-the-art performance on the KAIST dataset and LLVIP dataset. Finally, we conduct a regional performance evaluation to demonstrate the effectiveness of our proposed network in various regions.
first_indexed 2024-03-07T22:15:10Z
format Article
id doaj.art-a007b7c6c73048daaf5822544209444c
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-03-07T22:15:10Z
publishDate 2024-02-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-a007b7c6c73048daaf5822544209444c
2024-02-23T15:33:45Z
eng
MDPI AG
Sensors
1424-8220
2024-02-01
24
4
1168
10.3390/s24041168
INSANet: INtra-INter Spectral Attention Network for Effective Feature Fusion of Multispectral Pedestrian Detection
Sangin Lee 0
Taejoo Kim 1
Jeongmin Shin 2
Namil Kim 3
Yukyung Choi 4
Department of Software, Sejong University, Seoul 05006, Republic of Korea
Department of Convergence Engineering for Intelligent Drone, Sejong University, Seoul 05006, Republic of Korea
Department of Convergence Engineering for Intelligent Drone, Sejong University, Seoul 05006, Republic of Korea
NAVER LABS, Seongnam 13561, Republic of Korea
Department of Convergence Engineering for Intelligent Drone, Sejong University, Seoul 05006, Republic of Korea
Pedestrian detection is a critical task for safety-critical systems, but detecting pedestrians is challenging in low-light and adverse weather conditions. Thermal images can be used to improve robustness by providing complementary information to RGB images. Previous studies have shown that multi-modal feature fusion using convolution operation can be effective, but such methods rely solely on local feature correlations, which can degrade the performance capabilities. To address this issue, we propose an attention-based novel fusion network, referred to as INSANet (INtra-INter Spectral Attention Network), that captures global intra- and inter-information. It consists of intra- and inter-spectral attention blocks that allow the model to learn mutual spectral relationships. Additionally, we identified an imbalance in the multispectral dataset caused by several factors and designed an augmentation strategy that mitigates concentrated distributions and enables the model to learn the diverse locations of pedestrians. Extensive experiments demonstrate the effectiveness of the proposed methods, which achieve state-of-the-art performance on the KAIST dataset and LLVIP dataset. Finally, we conduct a regional performance evaluation to demonstrate the effectiveness of our proposed network in various regions.
https://www.mdpi.com/1424-8220/24/4/1168
autonomous vehicle
computer vision
data augmentation
feature fusion
multispectral
pedestrian detection
title INSANet: INtra-INter Spectral Attention Network for Effective Feature Fusion of Multispectral Pedestrian Detection
topic autonomous vehicle
computer vision
data augmentation
feature fusion
multispectral
pedestrian detection
url https://www.mdpi.com/1424-8220/24/4/1168