Domain adaptive crowd counting via dynamic scale aggregation network

Abstract Crowd counting is an important research topic in computer vision. Its goal is to estimate the number of people in an image. Researchers have dramatically improved counting accuracy in recent years by regressing density maps. However, because of the inherent domain shift, the model train...

Bibliographic Details
Main Authors: Zhanqiang Huo, Yanan Wang, Yingxu Qiao, Jing Wang, Fen Luo
Format: Article
Language: English
Published: Wiley 2023-10-01
Series: IET Computer Vision
Subjects: computer vision, image processing
Online Access: https://doi.org/10.1049/cvi2.12198
_version_ 1827796496260530176
author Zhanqiang Huo
Yanan Wang
Yingxu Qiao
Jing Wang
Fen Luo
author_facet Zhanqiang Huo
Yanan Wang
Yingxu Qiao
Jing Wang
Fen Luo
author_sort Zhanqiang Huo
collection DOAJ
description Abstract Crowd counting is an important research topic in computer vision. Its goal is to estimate the number of people in an image. Researchers have dramatically improved counting accuracy in recent years by regressing density maps. However, because of the inherent domain shift, a model trained on an expensive, manually labelled dataset (source domain) does not perform well on a dataset with scarce labels (target domain). To address this issue, a novel dynamic scale aggregation network (DSANet) is proposed to reduce the gaps in style and cross‐domain head scale variations. Specifically, a practical style transfer layer is introduced to reduce the appearance discrepancy between the source and target domains. Then, the translated source and target domain samples are encoded by a generator consisting of the VGG16 network and the dynamic scale aggregation modules (DSA modules), which produces the corresponding density maps. The DSA module adaptively adjusts its parameters according to the input features and effectively fuses multi‐scale information to overcome the cross‐domain head scale variations. Next, a discriminator judges whether the input density map comes from the source or the target domain. Last, the domain distributions are aligned through adversarial training between the generator and the discriminator. The experiments show that the network outperforms the current state‐of‐the‐art methods and improves the target domain's performance while maintaining the source domain's performance without significant degradation.
first_indexed 2024-03-11T19:07:57Z
format Article
id doaj.art-20fefc2314df4618afc4ebcbe53db88c
institution Directory of Open Access Journals
issn 1751-9632
1751-9640
language English
last_indexed 2024-03-11T19:07:57Z
publishDate 2023-10-01
publisher Wiley
record_format Article
series IET Computer Vision
spelling doaj.art-20fefc2314df4618afc4ebcbe53db88c 2023-10-10T04:15:41Z eng Wiley IET Computer Vision 1751-9632 1751-9640 2023-10-01 vol. 17, no. 7, pp. 814-828 10.1049/cvi2.12198
Domain adaptive crowd counting via dynamic scale aggregation network
Zhanqiang Huo (School of Software, Henan Polytechnic University, Jiaozuo, China)
Yanan Wang (School of Software, Henan Polytechnic University, Jiaozuo, China)
Yingxu Qiao (College of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, China)
Jing Wang (School of Software, Henan Polytechnic University, Jiaozuo, China)
Fen Luo (School of Software, Henan Polytechnic University, Jiaozuo, China)
https://doi.org/10.1049/cvi2.12198
computer vision
image processing
spellingShingle Zhanqiang Huo
Yanan Wang
Yingxu Qiao
Jing Wang
Fen Luo
Domain adaptive crowd counting via dynamic scale aggregation network
IET Computer Vision
computer vision
image processing
title Domain adaptive crowd counting via dynamic scale aggregation network
title_full Domain adaptive crowd counting via dynamic scale aggregation network
title_fullStr Domain adaptive crowd counting via dynamic scale aggregation network
title_full_unstemmed Domain adaptive crowd counting via dynamic scale aggregation network
title_short Domain adaptive crowd counting via dynamic scale aggregation network
title_sort domain adaptive crowd counting via dynamic scale aggregation network
topic computer vision
image processing
url https://doi.org/10.1049/cvi2.12198
work_keys_str_mv AT zhanqianghuo domainadaptivecrowdcountingviadynamicscaleaggregationnetwork
AT yananwang domainadaptivecrowdcountingviadynamicscaleaggregationnetwork
AT yingxuqiao domainadaptivecrowdcountingviadynamicscaleaggregationnetwork
AT jingwang domainadaptivecrowdcountingviadynamicscaleaggregationnetwork
AT fenluo domainadaptivecrowdcountingviadynamicscaleaggregationnetwork