Abstract. In this paper, we propose a method called Convolutional Neural Network-Markov Random Field (CNN-MRF) to estimate the crowd count in a still image. We first divide the dense crowd image into overlapping patches, then use a deep convolutional neural network to extract features from each patch, followed by a fully connected neural network that regresses the local patch crowd count. Since the local patches overlap, the crowd counts of adjacent patches are highly correlated. We exploit this correlation with a Markov random field to smooth the counting results of the local patches. Experiments show that our approach significantly outperforms state-of-the-art methods on the UCF and ShanghaiTech crowd counting datasets. Code is available on GitHub: https://github.com/hankong/crowdcounting.
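The pipeline above (overlapping patches, a CNN count regressor per patch, MRF smoothing of neighbouring counts) can be illustrated with a minimal sketch. All module sizes, the patch/stride values, and the simple quadratic-neighbour smoother below are assumptions for illustration, not the authors' released implementation.

```python
# Hypothetical sketch of the patch-based count-then-smooth pipeline described above.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PatchCountRegressor(nn.Module):
    """Small CNN + fully connected head that regresses a count per patch."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 1)
        )

    def forward(self, patches):            # patches: (N, 3, H, W)
        return self.head(self.features(patches)).squeeze(1)


def extract_overlapping_patches(image, size=128, stride=64):
    """Split an image tensor (3, H, W) into a grid of overlapping patches."""
    patches = image.unfold(1, size, stride).unfold(2, size, stride)  # (3, rows, cols, size, size)
    rows, cols = patches.shape[1], patches.shape[2]
    patches = patches.permute(1, 2, 0, 3, 4).reshape(-1, 3, size, size)
    return patches, rows, cols


def mrf_smooth(counts, rows, cols, weight=0.5, iters=20):
    """MRF-style smoothing: iteratively pull each patch count toward the mean
    of its 4-connected neighbours (a quadratic pairwise potential)."""
    grid = counts.reshape(1, 1, rows, cols)
    kernel = torch.tensor([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]]).view(1, 1, 3, 3)
    for _ in range(iters):
        neigh_sum = F.conv2d(grid, kernel, padding=1)
        neigh_cnt = F.conv2d(torch.ones_like(grid), kernel, padding=1)
        grid = (1 - weight) * grid + weight * neigh_sum / neigh_cnt
    return grid.flatten()


if __name__ == "__main__":
    image = torch.rand(3, 512, 512)                       # stand-in crowd image
    patches, rows, cols = extract_overlapping_patches(image)
    model = PatchCountRegressor()
    with torch.no_grad():
        raw_counts = model(patches)                       # one count per patch
    smoothed = mrf_smooth(raw_counts, rows, cols)
    # With stride = size / 2 each interior pixel is covered by ~4 patches,
    # so a rough total is the smoothed sum divided by the overlap factor.
    print("estimated total:", smoothed.sum().item() / 4)
```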
Analysis of pedestrian motion is important to real-world applications in public scenes. Due to complex temporal and spatial factors, trajectory prediction is a challenging task. With the recent development of attention mechanisms, transformer networks have been successfully applied to natural language processing, computer vision, and audio processing. We propose an end-to-end transformer network embedded with random deviation queries for pedestrian trajectory forecasting; this self-correcting scheme enhances the robustness of the network. Moreover, we present a co-training strategy in which the whole scheme is trained collaboratively with the original regression loss and an auxiliary classification loss, which further improves prediction accuracy. Experimental results on several datasets demonstrate the validity and robustness of the network: we achieve the best performance in individual forecasting and comparable results in social forecasting. Encouragingly, our approach achieves a new state of the art on the Hotel and Zara2 datasets compared with both social-based and individual-based approaches.
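As a rough sketch of the idea of perturbing decoder queries, the toy model below encodes an observed trajectory with a transformer and decodes future positions from learned queries to which Gaussian noise is added during training. The interpretation of "random deviation queries" as noisy learned query embeddings, and all layer sizes and horizons, are assumptions for illustration only.

```python
# Hypothetical sketch of a trajectory-forecasting transformer with randomly
# perturbed decoder queries, loosely in the spirit of the abstract above.
import torch
import torch.nn as nn


class DeviationQueryTransformer(nn.Module):
    def __init__(self, d_model=64, pred_len=12, noise_std=0.1):
        super().__init__()
        self.embed = nn.Linear(2, d_model)                 # (x, y) -> d_model
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=4,
            num_encoder_layers=2, num_decoder_layers=2,
            batch_first=True,
        )
        self.queries = nn.Parameter(torch.randn(pred_len, d_model))  # learned queries
        self.noise_std = noise_std
        self.out = nn.Linear(d_model, 2)                   # d_model -> future (x, y)

    def forward(self, obs_traj):                           # obs_traj: (B, obs_len, 2)
        src = self.embed(obs_traj)
        queries = self.queries.unsqueeze(0).expand(obs_traj.size(0), -1, -1)
        if self.training:                                  # random deviation on queries
            queries = queries + self.noise_std * torch.randn_like(queries)
        decoded = self.transformer(src, queries)
        return self.out(decoded)                           # (B, pred_len, 2)


if __name__ == "__main__":
    model = DeviationQueryTransformer()
    past = torch.rand(4, 8, 2)                             # 4 pedestrians, 8 observed steps
    future = model(past)
    print(future.shape)                                    # torch.Size([4, 12, 2])
```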
Density estimation aims to predict the spatial distribution of a crowd scene, while crowd counting aims to automatically estimate the number of heads as close to the ground truth as possible. We propose a mask guided GAN (Generative Adversarial Network) architecture to solve these two problems jointly. Step one generates a segmentation mask that separates the crowd region from the background and redundant information. Step two predicts the density map with an adversarial learning process guided by the mask from step one. Moreover, we branch out from the base network into a counting regressor dedicated to producing more accurate counting results. The whole scheme is trained collaboratively by compositing a density loss (a weighted loss that balances the influence of different data) and a counting loss (an MSE loss for the counting regression branch). Guided by the mask information, the GAN is more likely to learn distinguishing features, which also yields more accurate predictions. Experimental results on datasets collected from the internet and from actual scenes in Shanghai demonstrate the validity and robustness of the method, with comparable counting numbers and high-quality density maps focused on the crowd area. Index Terms: Adversarial learning, mask guided network, density estimation, crowd counting.
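The two-step, mask-guided scheme and its composite loss can be sketched as below: a backbone producing a crowd mask, a density head conditioned on that mask, a counting regression branch, and a generator loss that composites density, adversarial, and counting (MSE) terms. Network shapes, loss weights, and branch designs are illustrative assumptions, not the paper's architecture.

```python
# Hypothetical sketch of a mask-guided adversarial density estimator with a
# counting regression branch and a composite density + counting loss.
import torch
import torch.nn as nn


def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU())


class MaskGuidedGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(conv_block(3, 32), conv_block(32, 32))
        self.mask_head = nn.Sequential(nn.Conv2d(32, 1, 1), nn.Sigmoid())    # step 1: crowd mask
        self.density_head = nn.Sequential(conv_block(33, 32), nn.Conv2d(32, 1, 1))  # step 2
        self.count_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                        nn.Linear(32, 1))                    # counting branch

    def forward(self, img):
        feat = self.backbone(img)
        mask = self.mask_head(feat)
        density = self.density_head(torch.cat([feat, mask], dim=1))          # mask-guided
        count = self.count_head(feat).squeeze(1)
        return mask, density, count


class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(conv_block(1, 16), nn.AdaptiveAvgPool2d(1),
                                 nn.Flatten(), nn.Linear(16, 1))

    def forward(self, density):
        return self.net(density)


def generator_loss(d_fake, density, gt_density, count, gt_count,
                   w_adv=0.01, w_count=0.1):
    adv = nn.functional.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    density_l2 = nn.functional.mse_loss(density, gt_density)    # density loss term
    count_l2 = nn.functional.mse_loss(count, gt_count)           # counting MSE term
    return density_l2 + w_adv * adv + w_count * count_l2


if __name__ == "__main__":
    G, D = MaskGuidedGenerator(), Discriminator()
    img = torch.rand(2, 3, 128, 128)
    gt_density = torch.rand(2, 1, 128, 128)
    gt_count = gt_density.sum(dim=(1, 2, 3))
    mask, density, count = G(img)
    loss = generator_loss(D(density), density, gt_density, count, gt_count)
    loss.backward()
    print("composite generator loss:", loss.item())
```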