2021
DOI: 10.3390/s21030916

Asymmetric Adaptive Fusion in a Two-Stream Network for RGB-D Human Detection

Abstract: In recent years, human detection in indoor scenes has been widely applied in smart buildings and smart security, but many related challenges can still be difficult to address, such as frequent occlusion, low illumination and multiple poses. This paper proposes an asymmetric adaptive fusion two-stream network (AAFTS-net) for RGB-D human detection. This network can fully extract person-specific depth features and RGB features while reducing the typical complexity of a two-stream network. A depth feature pyramid …

Cited by 5 publications (4 citation statements)
References: 38 publications
“…In response to the second category of questions, Zhang et al [25,26] used a visualization tool to transform the output of each network layer into a heat map image. The feature extraction capability of the network layer was then judged by visual observation of the extracted contours of the foreground targets in the output heat map of each network layer.…”
Section: Related Work (mentioning)
confidence: 99%
“…[23,24] used a scaling factor of the Batch Normalization function to calculate the contribution of the network structure to the final results. The second type of pruning method focuses on the subjective assessment of good or bad network structures through visual images [25][26][27]. Pruning the network structures that contribute less.…”
Section: Introduction (mentioning)
confidence: 99%
“…Several methods based on two-modal fusion are used [46][47][48] to demonstrate the advantages of crowd counting in terms of day and night illumination, occlusion, and scale transformation by obtaining fused features. Two-stream models [49,50] are proposed to fuse hierarchical cross-modal features to achieve fully representative shared features. In addition, there are methods [51] that explore the use of shared branches to map shared information into a common feature space.…”
Section: Related Work (mentioning)
confidence: 99%
“…For RGB-D based human detection the multi-glimpse LSTM in [5] and an asymmetric adaptive fusion two-stream network (AAFTS-net, [6]) were proposed. For real-time people detection in top-view depth images from video surveillance systems, WatchNet and its extension WatchNet++ were presented in [7] and [8].…”
Section: Introduction (mentioning)
confidence: 99%