This paper proposes a novel entropy-weighted Gabor-phase congruency (EWGP) feature descriptor for head-pose estimation based on feature fusion. Gabor features are robust and invariant to differences in orientation and illumination but are insufficient to express the amplitude characteristics of images. By contrast, phase congruency (PC) functions work well for amplitude expression. Both illumination and amplitude vary over distinct regions. Here, we employ entropy information to evaluate orientation and amplitude and to carry out feature fusion. More specifically, entropy represents the randomness and information content of a region. For the first time, we utilize entropy as a weight to fuse the Gabor and phase-congruency matrices in every region. The proposed EWGP feature matrix was verified on the Pointing'04 and FacePix datasets. The experimental results demonstrate that our method is superior to the state of the art in terms of MSE, MAE, and time cost.

Keywords: EWGP, Head-pose estimation, Entropy weighted, Gabor, Phase congruency, Feature fusion

1 Review
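The per-region entropy-weighted fusion described in the abstract can be illustrated with a minimal sketch. This is an assumption-laden reconstruction, not the authors' implementation: it assumes Shannon entropy of an intensity histogram per region, a fixed rectangular region grid, and a convex combination of the two response maps weighted by each region's relative entropy; the function names and the `grid`/`bins` parameters are hypothetical.

```python
import numpy as np

def region_entropy(patch, bins=32):
    # Shannon entropy (bits) of the patch's intensity histogram;
    # patch values are assumed to lie in [0, 1].
    hist, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def entropy_weighted_fusion(gabor, pc, grid=(4, 4)):
    # Fuse Gabor and phase-congruency response maps region by region,
    # weighting each region by the relative entropy of the two responses.
    assert gabor.shape == pc.shape
    h, w = gabor.shape
    rh, rw = h // grid[0], w // grid[1]
    fused = np.zeros_like(gabor)
    for i in range(grid[0]):
        for j in range(grid[1]):
            sl = (slice(i * rh, (i + 1) * rh), slice(j * rw, (j + 1) * rw))
            eg = region_entropy(gabor[sl])
            ep = region_entropy(pc[sl])
            wg = eg / (eg + ep + 1e-12)  # higher-entropy map dominates
            fused[sl] = wg * gabor[sl] + (1.0 - wg) * pc[sl]
    return fused
```

Because each region is a convex combination of the two maps, every fused value stays between the corresponding Gabor and PC responses; regions where one descriptor carries more information (higher entropy) lean toward that descriptor.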
Introduction

Visual focus of attention (VFoA) estimates at what or at whom a person is looking and is highly correlated with head-pose estimation [1]. To study head-pose estimation, three-dimensional orientation parameters are inferred from images of the human head. Head poses convey an abundance of information in natural interpersonal communication (NIC) and human-computer interaction (HCI) [2]; therefore, an increasing number of researchers are seeking more effective and robust methodologies to estimate head pose. Head poses also play a critical role in artificial intelligence (AI) applications and reveal considerable latent information about personal intent. For example, people nod their heads to signal understanding in conversation and shake their heads to show dissent, confusion, or consideration. Head orientation combined with a specific finger-pointing direction generally indicates the place a person wants to go. The combination of head pose and hand gestures is used to assess the target of an individual's interest [3]. Mutual orientation indicates that people are engaged in discussion. If a person turns the head toward a specific direction, it is highly likely that an object of interest lies in that direction. Therefore, the study of VFoA as an indicator of the conversation target in human-computer interaction and facial-expression recognition is of increasing interest.

Analyzing head poses is a natural capability of humans but is difficult for AI. Head-pose estimation has nevertheless been researched for years, and the state of the art can contribute greatly to bridging the gap between humans and AI [4,5]. Head-pose estimation is generally interpreted as the capability to infer head orientation relative to the observation camera. For example, head pose is exploited to determine the focus point on a screen based on gaze direction [6]. The factors influencing the estimation of head pose and their relationships have been introdu...