2020
DOI: 10.1109/access.2020.2998678

Cascaded Multi-Task Learning of Head Segmentation and Density Regression for RGBD Crowd Counting

Abstract: In this paper, we propose a novel regression-based RGBD crowd counting method. Compared with previous RGBD crowd counting methods, which mainly exploit depth cues to facilitate person/head detection, our approach adopts density map regression and is more robust to severe occlusion in densely crowded scenes. We develop a cascaded depth-aware counting network that jointly performs head segmentation and density map regression. Our network explicitly feeds the depth map at each stage so that depth cues are sufficient…

Citations: cited by 9 publications (2 citation statements)
References: 50 publications
“…Nevertheless, capturing the complementarities of multimodal data (i.e., RGB and thermal images) is non-trivial. Conventional methods [21,65,37,15,53,45] either feed the combination of multimodal data into deep neural networks or directly fuse their features, which could not well exploit the complementary information. In this work, to facilitate the multimodal crowd counting, we introduce a cross-modal collaborative representation learning framework, which incorporates multiple modality-specific branches, a modality-shared branch, and an Information Aggregation-Distribution Module (IADM) to fully capture the complementarities among different modalities.…”
Section: Introduction (mentioning)
confidence: 99%
“…Existing methods of population counting include detection-based and regression-based methods [15][16][17][18]. Detection-based methods can count population in sparse scenes, but they are limited in crowded scenes.…”
Section: Introduction (mentioning)
confidence: 99%