Proceedings of the 14th ACM International Conference on Multimodal Interaction 2012
DOI: 10.1145/2388676.2388760

Audio-visual robot command recognition

Abstract: This paper addresses the problem of audio-visual command recognition in the framework of the D-META Grand Challenge. Temporal and non-temporal learning models are trained on visual and auditory descriptors. In order to set a proper baseline, the methods are tested on the "Robot Gestures" scenario of the publicly available RAVEL data set, following the leave-one-out cross-validation strategy. The classification-level audio-visual fusion strategy allows for compensating the errors of the unimodal (audio or vi…

Cited by 1 publication (2 citation statements)
References 17 publications

“…The method is tested on a dataset of 900 videos and 12 classes. Similarly, a convex function is used in [6] to combine the unimodal classification. In this case the dataset consists of more than 200 sequences from 9 different classes.…”
Section: Introduction (mentioning)
confidence: 99%
“…This quantity is prohibitive for user-adaptive methods whose discriminative power should be high when trained on tiny datasets (10-15 instances per class). Both [6,7] deal with such datasets. Albeit, the work in [7] uses sequential forward feature selection, an iterative algorithms that slows down the training process.…”
Section: Introduction (mentioning)
confidence: 99%