2021
DOI: 10.3390/s21124143
Automatic Visual Attention Detection for Mobile Eye Tracking Using Pre-Trained Computer Vision Models and Human Gaze

Abstract: Processing visual stimuli in a scene is essential for the human brain to make situation-aware decisions. These stimuli, which are prevalent subjects of diagnostic eye tracking studies, are commonly encoded as rectangular areas of interest (AOIs) per frame. Because it is a tedious manual annotation task, the automatic detection and annotation of visual attention to AOIs can accelerate and objectify eye tracking research, in particular for mobile eye tracking with egocentric video feeds. In this work, we impleme…


Cited by 24 publications (26 citation statements)
References 79 publications
“…The evaluation between different grids is shown in Tab. 4. "Pixel-level" refers to the evaluation of the saliency map using the D_KL and CC metrics.…”
Section: Quantitative Results
confidence: 99%
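The D_KL (Kullback-Leibler divergence) and CC (Pearson correlation coefficient) metrics quoted above are standard measures for comparing a predicted saliency map against a ground-truth map. A minimal NumPy sketch, assuming both maps are non-negative 2-D arrays (normalisation conventions vary between benchmarks):

```python
import numpy as np

def kl_divergence(pred, gt, eps=1e-8):
    """KL divergence D_KL(gt || pred) between two saliency maps.

    Both maps are first normalised to probability distributions.
    Lower values indicate closer agreement.
    """
    p = pred / (pred.sum() + eps)
    g = gt / (gt.sum() + eps)
    return float(np.sum(g * np.log(g / (p + eps) + eps)))

def correlation_coefficient(pred, gt):
    """Pearson correlation coefficient (CC) between two saliency maps.

    Values near 1 indicate strong positive agreement; 0 means no
    linear relation; -1 indicates inverted maps.
    """
    p = (pred - pred.mean()) / (pred.std() + 1e-8)
    g = (gt - gt.mean()) / (gt.std() + 1e-8)
    return float((p * g).mean())
```

An identical pair of maps yields D_KL near 0 and CC near 1, which is a quick sanity check for either implementation.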
See 3 more Smart Citations
“…The evaluation between different grids is shown in Tab. 4. "Pixel-level" refers to the evaluation of the saliency map using š· š¾šæ and š¶š¶ metrics.…”
Section: Quantitative Resultsmentioning
confidence: 99%
“…Previous works [20,53] set out to reduce tedious labelling by using gaze-object mapping, which annotates objects at the fixation level, i.e., the object being looked at. One popular algorithm checks whether a fixation lies in the object bounding box predicted by a deep neural network-based object detector [4,21,29] such as YOLOv4 [5]. Wolf et al [53] suggest using object segmentation with Mask-RCNN [12] for object area detection.…”
Section: :3
confidence: 99%
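The bounding-box check described in the quote — mapping a fixation to the detected object whose box contains it — can be sketched as follows. The `Box` type and the smallest-box tie-break for overlapping detections are illustrative assumptions here, not details prescribed by the cited papers:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Box:
    """A detector output: class label plus corner coordinates in pixels."""
    label: str
    x1: float
    y1: float
    x2: float
    y2: float

def map_fixation_to_object(fx: float, fy: float,
                           boxes: List[Box]) -> Optional[str]:
    """Return the label of the detected box containing the fixation (fx, fy).

    When several boxes overlap the fixation, the smallest box wins —
    a hypothetical tie-break favouring the more specific object.
    Returns None if the fixation falls outside every box.
    """
    hits = [b for b in boxes
            if b.x1 <= fx <= b.x2 and b.y1 <= fy <= b.y2]
    if not hits:
        return None
    hits.sort(key=lambda b: (b.x2 - b.x1) * (b.y2 - b.y1))
    return hits[0].label
```

In a pipeline, the boxes would come from a per-frame detector pass (e.g. YOLOv4, as the quote notes) and each fixation would be looked up in the frame it landed on.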
“…Graphical information includes discrete points of focus, whereas textual information includes continuous points of focus. Based on prior studies [8], [12], [27], [28], [45], we considered a fixation duration of 200 ms as the threshold to define a fixation event. Intuitively, the fixation counts also increase proportionally with fixation duration.…”
Section: Fixation Count
confidence: 99%