2019
DOI: 10.1007/978-3-030-22999-3_12
Towards Real-Time Head Pose Estimation: Exploring Parameter-Reduced Residual Networks on In-the-wild Datasets

Abstract: Head poses are a key component of human bodily communication and thus a decisive element of human-computer interaction. Real-time head pose estimation is crucial in the context of human-robot interaction or driver assistance systems. The most promising approaches for head pose estimation are based on Convolutional Neural Networks (CNNs). However, CNN models are often too complex to achieve real-time performance. To address this challenge, we explore a popular subgroup of CNNs, the Residual Networks (ResNets), and m…

Cited by 8 publications (6 citation statements)
References 27 publications
“…For example, Hui et al. [34] proposed a very compact LiteFlowNet that is 30 times smaller in model size and 1.36 times faster in running speed than state-of-the-art CNNs for optical flow estimation. In [35], Rieger et al. explored parameter-reduced residual networks on in-the-wild datasets, targeting real-time head pose estimation. They experimented with various ResNet architectures with a varying number of layers to handle different image sizes (including low-resolution images).…”
Section: Methods
confidence: 99%
“…Our backbone is a parameter-reduced 18-layer ResNet (ResNet18) [20] based on He et al. [6]. Small changes were made to the output layer, which uses a sigmoid function, and to the activation functions, which are ReLU units.…”
Section: B. Training
confidence: 99%
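The citing statement above describes the backbone only at a high level. The following is a minimal PyTorch sketch of what such a parameter-reduced ResNet18 with ReLU activations and a sigmoid output layer could look like; the width factor of 0.5, the three-angle output (yaw, pitch, roll), and all class and variable names are illustrative assumptions, not the authors' exact architecture.

```python
# Sketch of a parameter-reduced ResNet18-style backbone for head pose
# regression. Channel widths, output count, and names are assumptions.
import torch
import torch.nn as nn


class BasicBlock(nn.Module):
    """Standard two-convolution residual block with ReLU activations."""

    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)
        # 1x1 projection on the skip path when the shape changes.
        self.down = None
        if stride != 1 or in_ch != out_ch:
            self.down = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )

    def forward(self, x):
        identity = x if self.down is None else self.down(x)
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + identity)


class SmallResNet18(nn.Module):
    """ResNet18 layout (2-2-2-2 blocks) with channel widths scaled down."""

    def __init__(self, width=0.5, num_outputs=3):
        super().__init__()
        widths = [int(c * width) for c in (64, 128, 256, 512)]
        self.stem = nn.Sequential(
            nn.Conv2d(3, widths[0], 7, stride=2, padding=3, bias=False),
            nn.BatchNorm2d(widths[0]),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(3, stride=2, padding=1),
        )
        self.layer1 = self._stage(widths[0], widths[0], stride=1)
        self.layer2 = self._stage(widths[0], widths[1], stride=2)
        self.layer3 = self._stage(widths[1], widths[2], stride=2)
        self.layer4 = self._stage(widths[2], widths[3], stride=2)
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Output layer with a sigmoid, as in the citing statement; angles
        # would be normalized to [0, 1] during training.
        self.head = nn.Sequential(nn.Linear(widths[3], num_outputs), nn.Sigmoid())

    @staticmethod
    def _stage(in_ch, out_ch, stride):
        return nn.Sequential(BasicBlock(in_ch, out_ch, stride), BasicBlock(out_ch, out_ch))

    def forward(self, x):
        x = self.stem(x)
        x = self.layer4(self.layer3(self.layer2(self.layer1(x))))
        x = torch.flatten(self.pool(x), 1)
        return self.head(x)
```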
“…Our backbone is a parameter-reduced 18-layer ResNet with input images of size 112x112 pixels (ResNet18-112) [25] (Fig. 1b).…”
Section: Training
confidence: 99%
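For reference, a hypothetical forward pass at the 112x112 input size quoted above, using the sketch defined earlier (batch size and rescaling are assumptions):

```python
# A batch of 8 RGB face crops at 112x112; the sigmoid head yields three
# values in [0, 1] that would be rescaled to yaw/pitch/roll ranges.
model = SmallResNet18(width=0.5, num_outputs=3)
angles = model(torch.randn(8, 3, 112, 112))
print(angles.shape)  # torch.Size([8, 3])
```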