Inception Convolution and Feature Fusion for Person Search

With the rapid advancement of deep learning theory and hardware device computing capacity, computer vision tasks, such as object detection and instance segmentation, have entered a revolutionary phase in recent years. As a result, extremely challenging integrated tasks, such as person search, might...

Bibliographic Details
Main Authors:	Huan Ouyang, Jiexian Zeng, Lu Leng
Format:	Article
Language:	English
Published:	MDPI AG 2023-02-01
Series:	Sensors
Subjects:	person search; Faster R-CNN; inception convolution; feature fusion; region proposal network (RPN); double-head
Online Access:	https://www.mdpi.com/1424-8220/23/4/1984

_version_	1827755518779719680
author	Huan Ouyang; Jiexian Zeng; Lu Leng
author_facet	Huan Ouyang; Jiexian Zeng; Lu Leng
author_sort	Huan Ouyang
collection	DOAJ
description	With the rapid advancement of deep learning theory and hardware computing capacity, computer vision tasks such as object detection and instance segmentation have entered a revolutionary phase in recent years. As a result, extremely challenging integrated tasks such as person search can now develop quickly. Most efficient network frameworks, such as Seq-Net, are based on Faster R-CNN. However, because of the parallel structure of Faster R-CNN, re-ID performance can be significantly degraded by the feature maps extracted during pedestrian detection, which are single-layer, low-resolution, and sometimes affected by missed detections. To address these issues, this paper proposes a person search method based on an inception convolution and feature fusion module (IC-FFM), using Seq-Net (Sequential End-to-end Network) as the baseline. First, the general convolution in ResNet-50 is replaced with a new inception convolution module (ICM), which allows the convolution operation to allocate channels effectively and dynamically. Then, to improve the accuracy of information extraction, a feature fusion module (FFM) combines multi-level information from different convolution levels. Finally, bounding box regression is performed with convolution and the double-head module (DHM), which considerably improves the accuracy of pedestrian retrieval by combining global and fine-grained information. Experiments on the CUHK-SYSU and PRW datasets show that the method achieves higher accuracy than Seq-Net. In addition, the method is simpler and can be easily integrated into existing two-stage frameworks.
first_indexed	2024-03-11T08:10:29Z
format	Article
id	doaj.art-0f58c3a335e740ac9ba89458205a8bd6
institution	Directory of Open Access Journals
issn	1424-8220
language	English
last_indexed	2024-03-11T08:10:29Z
publishDate	2023-02-01
publisher	MDPI AG
record_format	Article
series	Sensors
spelling	doaj.art-0f58c3a335e740ac9ba89458205a8bd6; 2023-11-16T23:08:44Z; eng; MDPI AG; Sensors; 1424-8220; 2023-02-01; vol. 23, no. 4, art. 1984; 10.3390/s23041984; Inception Convolution and Feature Fusion for Person Search; Huan Ouyang, Jiexian Zeng, Lu Leng (all: School of Software, Nanchang Hangkong University, Nanchang 330063, China); https://www.mdpi.com/1424-8220/23/4/1984; person search; Faster R-CNN; inception convolution; feature fusion; region proposal network (RPN); double-head
title	Inception Convolution and Feature Fusion for Person Search
topic	person search; Faster R-CNN; inception convolution; feature fusion; region proposal network (RPN); double-head
url	https://www.mdpi.com/1424-8220/23/4/1984
