2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition 2018
DOI: 10.1109/cvpr.2018.00564

Crowd Counting with Deep Negative Correlation Learning

Abstract: Deep convolutional networks (ConvNets) have achieved unprecedented performances on many computer vision tasks. However, their adaptations to crowd counting on single images are still in their infancy and suffer from severe over-fitting. Here we propose a new learning strategy to produce generalizable features by way of deep negative correlation learning (NCL). More specifically, we deeply learn a pool of decorrelated regressors with sound generalization capabilities through managing their intrinsic diversities…
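
The abstract is truncated and does not give the loss, so the following is only a minimal sketch of the classical negative correlation learning objective, which may differ from the paper's deep variant. Each regressor k is penalized for its own squared error and rewarded for deviating from the ensemble mean. The function name `ncl_losses` and the weight `lam` are assumptions made for illustration.

```python
import numpy as np

def ncl_losses(preds, target, lam=0.5):
    """Per-regressor loss under classical negative correlation learning.

    preds  : predictions of the K regressors for one sample (hypothetical input)
    target : ground-truth value for that sample
    lam    : assumed trade-off weight between accuracy and diversity
    """
    preds = np.asarray(preds, dtype=float)
    ensemble = preds.mean()                      # ensemble output is the plain average
    accuracy = 0.5 * (preds - target) ** 2       # each regressor's own squared error
    diversity = (preds - ensemble) ** 2          # spread around the ensemble mean
    return accuracy - lam * diversity            # larger spread lowers the loss

# Two regressors that err in opposite directions around the target.
losses_plain = ncl_losses([1.0, 3.0], target=2.0, lam=0.0)
losses_ncl = ncl_losses([1.0, 3.0], target=2.0, lam=0.5)
```

With `lam=0.0` the regressors are trained independently; a positive `lam` lowers the loss of regressors whose errors offset each other, which is the decorrelation effect the abstract refers to.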

Citations: cited by 274 publications (164 citation statements)
References: 28 publications
“…In the past decade, a number of crowd counting algorithms [22,58,21,12,5,35,20,11,7,31] have been proposed in the literature. Recently, crowd counting methods using Convolutional Neural Networks (CNNs) have made remarkable progresses [53,46,36,6,57,59,38,9,4,32,16]. The best performing methods are mostly based on the density map estimation, which typically obtain the crowd count by predicting a density map for the input image and then summing over the estimated density map.…”
Section: Introduction (mentioning)
confidence: 99%
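
The density-map idea in the statement above can be sketched as follows. This is an illustrative example, not any cited method: it builds a ground-truth style density map by placing a normalized Gaussian at each annotated head position, then recovers the count by summing the map. The helper names, the kernel size, and the value of sigma are assumptions.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Square Gaussian kernel normalized to sum to 1 (size is assumed odd)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()

def density_map(shape, points, size=15, sigma=4.0):
    """Place one unit-mass Gaussian per annotated (row, col) head position.

    The map is returned with a border of size // 2 pixels on every side so
    that kernels centered near the image edge keep their full mass.
    """
    h, w = shape
    r = size // 2
    padded = np.zeros((h + 2 * r, w + 2 * r))
    kernel = gaussian_kernel(size, sigma)
    for row, col in points:
        # In padded coordinates the head sits at (row + r, col + r),
        # so the kernel window starts at (row, col).
        padded[row:row + size, col:col + size] += kernel
    return padded

# Three hypothetical head annotations in a 48 x 64 image.
heads = [(10, 12), (30, 40), (0, 63)]
density = density_map((48, 64), heads)
estimated_count = density.sum()   # each kernel has unit mass, so the sum is the count
```

A counting network trained in this setting regresses such a map from the image, and its predicted count is the sum over the predicted map.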
“…The proposed method also performs slightly better than L2R [18] in transferring models trained on ShanghaiTech Part A to UCF_CC_50. Yet, the improvement is not as significant as the comparison with [33,25] on transferring between ShanghaiTech Part A and Part B. This is probably because L2R [18] also relies on extra data which may somehow help to reduce the gap between the two datasets.…”
Section: Evaluation Of Transferability (mentioning)
confidence: 91%
“…To overcome the over-fitting, Liu et al. [18] propose a learning-to-rank framework to leverage abundantly available unlabeled crowd images and a self-learning strategy. Shi et al. [25] build a set of decorrelated regressors with reasonable generalization capabilities through managing their intrinsic diversities to avoid severe over-fitting. Though many methods have been proposed to tackle the large scale and density variation issue, this problem still remains challenging for crowd counting.…”
Section: Related Work (mentioning)
confidence: 99%
“…Due to these issues, crowd counting and density estimation is a very difficult problem, especially in highly congested scenes. Several recent convolutional neural network (CNN) based methods for counting [1,2,3,4,5,6,7,8,9,10,11,12] have attempted to address one or more of these issues by adding more robustness to scale variations by proposing different techniques such as multicolumn networks [2], intelligent selection of regressors suited for a particular crowd scenario [3] and incorporating global, local context information into the counting network [4], etc. Methods such as [3,4,8] achieve significantly lower errors compared to the earlier approaches, however, they are complex to train due to the presence of multiple learning stages.…”
Section: Introduction (mentioning)
confidence: 99%
“…Similarly, CP-CNN [4] requires that their local and global estimators to be trained separately, followed by end-to-end training of their density map estimator. Although the most recent methods such as [6,7,10] achieve better results while being efficient, there is still considerable room for further improvements. In this paper, we propose to improve the counting performance by explicitly modeling spatial pixel-wise attention and global attention into the counting network.…”
Section: Introduction (mentioning)
confidence: 99%