2012
DOI: 10.1007/s12193-012-0111-y
RAVEL: an annotated corpus for training robots with audiovisual abilities

Abstract: We introduce Ravel (Robots with Audiovisual Abilities), a publicly available data set which covers examples of Human Robot Interaction (HRI) scenarios. These scenarios are recorded using the audiovisual robot head POPEYE, equipped with two cameras and four microphones, two of which are plugged into the ears of a dummy head. All the recordings were performed in a standard room with no special equipment, thus providing a challenging indoor scenario. This data set provides a basis to test and benchmark methods …

Cited by 33 publications (5 citation statements)
References 36 publications
“…To evaluate the performance of the proposed binocular method and compare it with a state-of-the-art monocular method [3], we use the Ravel 2 dataset [24]. The Ravel dataset consists of 7 actions (talk phone, drink, scratch head, turn around, check watch, clap, cross arms) performed by 12 actors in 6 trials each.…”
Section: Methods
Citation type: mentioning (confidence: 99%)
“…For this, we selected the "Robot Gesture" scenario of the RAVEL dataset [10]. We evaluated the methods splitting actor-wise the dataset into a training subset and a testing subset several times, following a standard cross-validation strategy.…”
Section: Methods
Citation type: mentioning (confidence: 99%)
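The actor-wise cross-validation described in the statement above (and the 7-action, 12-actor, 6-trial structure reported for the Ravel dataset) can be sketched as follows. This is an illustrative sketch only: the sample tuples, action names, and the `actor_wise_split` helper are assumptions for the example, not part of the RAVEL distribution or the cited papers' code.

```python
# Hypothetical index of RAVEL action-recognition samples:
# 7 actions x 12 actors x 6 trials, per the citation statement above.
ACTIONS = ["talk_phone", "drink", "scratch_head", "turn_around",
           "check_watch", "clap", "cross_arms"]

samples = [(actor, action, trial)
           for actor in range(12)
           for action in ACTIONS
           for trial in range(6)]

def actor_wise_split(samples, test_actors):
    """Split so that no actor appears in both train and test subsets,
    which is what an actor-wise cross-validation fold requires."""
    test = [s for s in samples if s[0] in test_actors]
    train = [s for s in samples if s[0] not in test_actors]
    return train, test

# One fold of a 4-fold actor-wise cross-validation: hold out 3 of 12 actors.
train, test = actor_wise_split(samples, test_actors={0, 1, 2})
```

Holding out whole actors (rather than random trials) prevents the classifier from exploiting person-specific appearance, which is the point of the actor-wise protocol.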
“…For dynamic scenarios, the AV16.3 dataset [29] and [30] involve multiple moving human talkers. The RAVEL and CAMIL datasets [31], [32] provide camera and microphone recordings from a rotating robot head. However, annotation of the ground-truth source positions is typically performed in a semi-automatic manner, where humans label bounding boxes on small video segments.…”
Section: Locata Challenge Tasks
Citation type: mentioning (confidence: 99%)