A Part-Aware Multi-Scale Fully Convolutional Network for Pedestrian Detection

Yang, Peiyu; Zhang, Guofeng; Wang, Lu; Xu, Lisong; Deng, Qingxu; Yang, Ming-Hsuan

doi:10.1109/tits.2019.2963700

Cited by 52 publications

(31 citation statements)

References 43 publications

“…The aforementioned feature fusion structures play a great role in generic object detection. Some works like [67], [102], [104], [113], [114] borrow from these ideas and propose some new fusion strategies to adapt to pedestrian detection. Some typical frameworks are shown in Figure 13.…”

Section: A Leverage Multi-scale Feature Fusion (mentioning)

confidence: 99%

Occlusion Handling and Multi-Scale Pedestrian Detection Based on Deep Learning: A Review

Fang, Liu, et al. 2022

IEEE Access

Pedestrian detection is an important branch of computer vision, and it has important applications in the fields of autonomous driving, artificial intelligence and video surveillance. With the rapid development of deep learning and the proposal of large-scale datasets, pedestrian detection has reached a new stage and achieves better performance. However, the performance of state-of-the-art methods is far behind the expectation, especially when occlusion and scale variance exist. Therefore, a lot of works focused on occlusion and scale variance have been proposed in the past few years. The purpose of this article is to make a detailed review of recent progress in pedestrian detection. Firstly, brief progress of pedestrian detection in the past two decades is summarized. Secondly, recent deep learning methods focusing on occlusion and scale variance are analyzed. Moreover, the popular datasets and evaluation methods for pedestrian detection are introduced. Finally, the development trend of pedestrian detection is prospected.

“…Repulsion loss [ 16 ], for example, works by setting the loss function forecasting responsible for the distance of the frame of objects, and, together with the surroundings, is not the actual target box (the box that contains natural objects and predict boxes) of space used to improve the model performance. Yang proposed a partially sensing multi-scale fully convolutional network to solve these occlusion and large-scale problems [ 17 ]. The most responsive part is selected by voting, and partially visible pedestrian instances can obtain a high detection confidence value, making it unlikely to miss detection.…”

Section: Related Work (mentioning)

confidence: 99%

Pedestrian Detection with Multi-View Convolution Fusion Algorithm

Liu, Han, Zhang, et al. 2022

Entropy

In recent years, the pedestrian detection technology of a single 2D image has been dramatically improved. When the scene becomes very crowded, the detection performance will deteriorate seriously and cannot meet the requirements of autonomous driving perception. With the introduction of the multi-view method, the task of pedestrian detection in crowded or fuzzy scenes has been significantly improved and has become a widely used method in autonomous driving. In this paper, we construct a double-branch feature fusion structure, the first branch adopts a lightweight structure, the second branch further extracts features and gets the feature map obtained from each layer. At the same time, the receptive field is enlarged by expanding convolution. To improve the speed of the model, the keypoint is used instead of the entire object for regression without an NMS post-processing operation. Meanwhile, the whole model can be learned from end to end. Even in the presence of many people, the method can still perform better on accuracy and speed. In the standard of Wildtrack and MultiviewX dataset, the accuracy and running speed both perform better than the state-of-the-art model, which has great practical significance in the autonomous driving field.

“…However, people can suffer from occlusion as well as variations in illumination, scale, and background, which make human detection in indoor scenes a challenging task. Methods based only on RGB features [ 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11 , 12 , 13 ] can no longer meet the needs of human detection in many scenarios. With the popularization of inexpensive depth acquisition equipment, detecting human with the help of depth information has become an effective and feasible scheme.…”

Section: Introduction (mentioning)

confidence: 99%

“…To address the challenges of occlusion and scale changes in RGB images, several pedestrian detection algorithms [ 7 , 8 , 9 , 10 , 11 , 12 , 13 ] have been developed based on novel processing approaches. Andre et al [ 7 ] proposed a cascaded aggregate channel features (ACF) detector to accurately detect humans.…”

Section: Introduction (mentioning)

confidence: 99%

“…Wang et al [ 10 ] introduced a RepLoss loss function to detect humans. Yang et al [ 11 ] designed a part-aware region-of interest (RoI) pooling module to mine body parts with different responses. To handle varying levels of people occlusions, Xie et al [ 12 ] used a graph convolutional network (GCN) to explicitly capture both inter- and intrapart co-occurrence information of different human body parts.…”

Section: Introduction (mentioning)

confidence: 99%

Asymmetric Adaptive Fusion in a Two-Stream Network for RGB-D Human Detection

Zhang, Guo, Wang, et al. 2021

Sensors

In recent years, human detection in indoor scenes has been widely applied in smart buildings and smart security, but many related challenges can still be difficult to address, such as frequent occlusion, low illumination and multiple poses. This paper proposes an asymmetric adaptive fusion two-stream network (AAFTS-net) for RGB-D human detection. This network can fully extract person-specific depth features and RGB features while reducing the typical complexity of a two-stream network. A depth feature pyramid is constructed by combining contextual information, with the motivation of combining multiscale depth features to improve the adaptability for targets of different sizes. An adaptive channel weighting (ACW) module weights the RGB-D feature channels to achieve efficient feature selection and information complementation. This paper also introduces a novel RGB-D dataset for human detection called RGBD-human, on which we verify the performance of the proposed algorithm. The experimental results show that AAFTS-net outperforms existing state-of-the-art methods and can maintain stable performance under conditions of frequent occlusion, low illumination and multiple poses.
