2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019
DOI: 10.1109/iccv.2019.00625

Learning Spatial Awareness to Improve Crowd Counting

Abstract: The aim of crowd counting is to estimate the number of people in images by leveraging the annotation of center positions for pedestrians' heads. Promising progress has been made with the prevalence of deep Convolutional Neural Networks. Existing methods widely employ the Euclidean distance (i.e., L2 loss) to optimize the model, which, however, has two main drawbacks: (1) the loss has difficulty in learning the spatial awareness (i.e., the position of head) since it struggles to retain the high-frequency va…

Citations: cited by 138 publications (58 citation statements)
References: 47 publications (83 reference statements)
“…Update DIV_i as per Eq. (4); (9) Integrate over DIV_N to obtain the image count C; (10) return C. C_0 denotes the local count of a 64 × 64 region, the sum of C_0 should equal to C_0. The upsampling of C_0 is therefore a re-distribution operator that assigns C_0 to each sub-region.…”
Section: Single-stage Spatial Divide-and-conquer (citation type: mentioning)
confidence: 99%
“…We follow the previous works [7,26,34,38,52] to use the mean absolute error (MAE) and the root mean square error (MSE) to evaluate all the methods. Assume is the count of the ℎ image, is the corresponding groundtruth,…”
Section: Evaluation Metrics (citation type: mentioning)
confidence: 99%
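The statement above names the two evaluation metrics, but its symbols were lost in extraction. As a rough illustration only, the sketch below assumes the standard definitions of mean absolute error and root mean square error over per-image counts; the function names `mae` and `rmse` and the sample numbers are hypothetical and are not taken from the cited work.

```python
import math


def mae(predicted, ground_truth):
    """Mean absolute error between predicted and true per-image counts."""
    errors = [abs(p - g) for p, g in zip(predicted, ground_truth)]
    return sum(errors) / len(errors)


def rmse(predicted, ground_truth):
    """Root mean square error (often labelled MSE in crowd-counting papers)."""
    squared = [(p - g) ** 2 for p, g in zip(predicted, ground_truth)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical counts for two test images.
predicted_counts = [10.0, 20.0]
true_counts = [12.0, 16.0]
print(mae(predicted_counts, true_counts))   # absolute errors 2 and 4
print(rmse(predicted_counts, true_counts))  # squared errors 4 and 16
```

Because the errors are squared before averaging, the root mean square error penalizes large per-image errors more heavily than MAE does.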
“…The integral of the density map gives the crowd count in the image [12]. Researches in recent trends focused on designing more powerful DNN structures and exploiting more effective learning paradigms [2,4,14,18,27,31,34]. For instance, Guo et al [4] designed multi-rate dilated convolutions to capture rich spatial context at different scales of density maps; Liu et al [18] introduced an improved dilated multi-scale structure similarity (DMS-SSIM) loss to learn density maps with local consistency; Xu et al [37] [17] and they are not capable of providing individual locations in the crowds, which, on the other hand, are believed to be the merits of detection-based crowd counting methods, as specified below.…”
Section: Regression-based Crowd Counting (citation type: mentioning)
confidence: 99%
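The statement above notes that integrating a density map yields the crowd count. A minimal sketch of that idea follows, assuming the common construction in which each annotated head position contributes one Gaussian normalized to unit mass. The functions `density_map` and `count_from_density`, the value of `sigma`, and the head coordinates are illustrative assumptions, not the method of any paper quoted here.

```python
import math


def density_map(height, width, heads, sigma=2.0):
    """Place one unit-mass Gaussian per head annotation on an image grid."""
    dmap = [[0.0] * width for _ in range(height)]
    for head_y, head_x in heads:
        # Unnormalized Gaussian weights centred on the head position.
        weights = [
            [
                math.exp(-((y - head_y) ** 2 + (x - head_x) ** 2) / (2 * sigma ** 2))
                for x in range(width)
            ]
            for y in range(height)
        ]
        total = sum(map(sum, weights))
        # Normalize over the grid so each head adds exactly 1 to the integral.
        for y in range(height):
            for x in range(width):
                dmap[y][x] += weights[y][x] / total
    return dmap


def count_from_density(dmap):
    """Discrete integral of the density map, i.e. the sum over all pixels."""
    return sum(map(sum, dmap))


# Three hypothetical head annotations as (row, column) positions.
heads = [(3, 4), (8, 8), (12, 2)]
dmap = density_map(16, 16, heads)
print(round(count_from_density(dmap), 6))
```

Normalizing each kernel over the image grid keeps the total equal to the number of annotations even when a head lies near the border. The map encodes where mass is concentrated, but it does not return individual head locations, which is the limitation the statement attributes to regression-based methods.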