2022
DOI: 10.1109/tim.2022.3160534
We Know Where They Are Looking at From the RGB-D Camera: Gaze Following in 3D

Cited by 12 publications (6 citation statements) | References 55 publications
“…Solution for conflicted demonstrations: To resolve the conflicting labels from different experts, a conflict-resolution function is executed after each rollout j, before D_j is incorporated into D. The conflict resolution takes D_j, D, and σ_t as inputs. We use cosine similarity to identify and select similar observations because of its wide application in similarity detection for sensors commonly used in robotics, including LiDAR scans [33,34] and RGB-D cameras [35,36]. To efficiently leverage parallel processing, the cosine similarities Θ between all observations O_j in D_j and all observations O in D are calculated as the dot product of O and O_j divided by the element-wise multiplication of the Euclidean norms of O and O_j:…”
Section: Methods (mentioning)
confidence: 99%
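The batched cosine-similarity computation described in the quoted passage can be sketched as follows. This is a minimal illustration assuming the observations are stacked into NumPy arrays; the array names, shapes, and the σ_t threshold value are hypothetical and not taken from the cited paper.

import numpy as np

def pairwise_cosine_similarity(O_j, O):
    # O_j: (m, d) array of observations from rollout j
    # O:   (n, d) array of previously aggregated observations
    # Returns Theta, an (m, n) matrix of cosine similarities.
    dots = O_j @ O.T  # all pairwise dot products in one matrix product
    norms = (np.linalg.norm(O_j, axis=1, keepdims=True)
             @ np.linalg.norm(O, axis=1, keepdims=True).T)  # outer product of Euclidean norms
    return dots / norms

# Hypothetical usage: flag observation pairs whose similarity exceeds sigma_t.
rng = np.random.default_rng(0)
O_j = rng.normal(size=(4, 16))
O = rng.normal(size=(10, 16))
Theta = pairwise_cosine_similarity(O_j, O)
sigma_t = 0.9  # illustrative threshold, not the value used by the cited authors
conflict_candidates = np.argwhere(Theta > sigma_t)

Computing all pairwise similarities as one matrix product is what allows the conflict check to exploit parallel (vectorized) processing, as the quoted passage notes.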
“…Although Deep Neural Networks (DNNs) have been implemented in visual trackers such as Caelles et al. (2017), Wang et al. (2019), Lukezic et al. (2020), and Koide et al. (2020) that produce high frames-per-second (FPS) ROI trackers, these systems may not adequately track the target during substantial pose variations. There have been efforts to create a more robust representation of a person's position in 3D space using plane and height estimation techniques (Chou and Nakajima, 2017; Jiang et al., 2018; Zou and Lan, 2019; Hu et al., 2022). Still, they predominantly yield an approximation of the person's position.…”
Section: Related Work (mentioning)
confidence: 99%
“…To augment the precision and responsiveness of these robotic applications, the present research introduces a novel end-to-end Deep Neural Network (DNN) for person-following. This approach enables pixel-level tracking of the target individual and integrates a real-time future motion estimation function, facilitating the robot's capacity to anticipate and swiftly react to the individual's movements (Lin et al., 2012; Cosgun et al., 2013; Cheng et al., 2019; Koide et al., 2020; Hu et al., 2022).…”
Section: Introduction (mentioning)
confidence: 99%
“…Fang et al. [10] used depth to potentially disambiguate attention targets by inferring whether a person is looking toward their foreground or background, obtaining the best results reported so far on the GazeFollow and VideoAttentionTarget datasets. Recently, Hu et al. [15] used depth information to perform gaze target prediction in 3D. As far as we are aware, ours is the first work to use both pose and depth information, and to study the privacy-preserving situation.…”
Section: Gaze Target Prediction (mentioning)
confidence: 99%