ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020
DOI: 10.1109/icassp40776.2020.9053070

Stacked Pooling for Boosting Scale Invariance of Crowd Counting

Citations: cited by 19 publications (7 citation statements)
References: 20 publications
“…In order to show that the multiscale fusion module can effectively combine the information extracted from each branch in different densities we partition the images in the dataset into 5 different crowd density groups based on the number of people in each image, as in [18]. Consequently, Group 1 comprises the first 20th percentile of density, Group 2 the second 20th percentile, etc.…”
Section: Results (mentioning)
confidence: 99%
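
The grouping described in the statement above can be illustrated with a minimal sketch. It assumes the groups are formed from per-image head counts using percentile cut points at 20, 40, 60 and 80; the function name `density_groups`, the tie handling, and the sample counts are hypothetical and not taken from the cited work.

```python
import numpy as np

def density_groups(counts, n_groups=5):
    """Assign each image to a density group (1..n_groups) by its head count.

    Assumption: groups are equal-width percentile bands of the count
    distribution, so Group 1 holds the sparsest images.
    """
    counts = np.asarray(counts, dtype=float)
    # Interior percentile cut points: 20, 40, 60, 80 when n_groups is 5.
    cuts = np.percentile(counts, np.linspace(0, 100, n_groups + 1)[1:-1])
    # side="left" places a count equal to a cut point in the lower group.
    return np.searchsorted(cuts, counts, side="left") + 1

# Hypothetical per-image head counts, for illustration only.
counts = [12, 35, 48, 90, 150, 220, 310, 480, 760, 1500]
groups = density_groups(counts)
print(groups.tolist())
```

Per-group error metrics can then be computed by masking predictions with `groups == g` for each group index `g`.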
“…Datasets. We use two popular datasets, ShanghaiTech A and ShanghaiTech B [9], which are the most commonly used datasets in the area of crowd counting [8,10,2,11,20,18]. ShanghaiTech A contains 482 images with 241,677 total annotated heads.…”
Section: Methods (mentioning)
confidence: 99%
“…Despite the high coverage of surveillance cameras, human activity detections still rampage in places with a less sufficient police force, even directly under high-resolution cameras. Now, as the field of Artificial Intelligence and Computer Vision develops, new solutions such as Biometric Identification [3,4,5], Object Detection/Tracking [6,7,8], Crowd Density Analysis [9,10,11,12,13], and Action Recognition [14,15,16,17,18,19] have come to light, which in theory could automatically detect objects or actions. However, this seemingly useful technology remains confined to the laboratories as a consequence of deficiencies: 1) The narrow scope of the detection process (e.g., action-only or object-only) cripples the model accuracy, for it requires a comprehensive consideration of multiple factors to determine the nature of a situation.…”
Section: Introduction (mentioning)
confidence: 99%
“…While multiple pooling strategies have been proposed (Graham, 2014; Gulcehre et al., 2014; He et al., 2014; Huang et al., 2018; Lee et al., 2016; Lu et al., 2015; Xie et al., 2015; Zeiler and Fergus, 2013; Zhai et al., 2017), max pooling and average pooling are popularly utilized in practical models (Boureau et al., 2008; Jarrett et al., 2009; LeCun et al., 1990, 1998). Max pooling is done by applying a max filter to subregions of the initial representation and global pooling utilizes an average filter.…”
Section: Introduction (mentioning)
confidence: 99%
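
As a rough illustration of the two pooling operations named in the statement above, the following sketch applies a max filter and an average filter to non-overlapping subregions of a 2-D feature map. The function name `pool2d`, the window size of 2, and the cropping of ragged edges are assumptions made for this example. It is not the stacked pooling method of the cited paper.

```python
import numpy as np

def pool2d(x, k=2, mode="max"):
    """Pool a 2-D array over non-overlapping k x k windows.

    Assumption: rows and columns that do not fill a whole window are cropped.
    """
    x = np.asarray(x, dtype=float)
    h, w = x.shape
    x = x[: h - h % k, : w - w % k]
    # Split each spatial axis into (number of windows, window size).
    blocks = x.reshape(x.shape[0] // k, k, x.shape[1] // k, k)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    if mode == "avg":
        return blocks.mean(axis=(1, 3))
    raise ValueError("mode must be 'max' or 'avg'")

feature_map = np.arange(16).reshape(4, 4)
max_pooled = pool2d(feature_map, mode="max")
avg_pooled = pool2d(feature_map, mode="avg")
print(max_pooled.tolist())
print(avg_pooled.tolist())
```

Max pooling keeps the strongest response in each window, while average pooling keeps the mean response, which is the practical difference between the two filters.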