2021
DOI: 10.48550/arxiv.2101.11898
Preprint

HEMVIP: Human Evaluation of Multiple Videos in Parallel

Patrik Jonell,
Youngwoo Yoon,
Pieter Wolfert
et al.

Abstract: In many research areas, for example motion and gesture generation, objective measures alone do not provide an accurate impression of key stimulus traits such as perceived quality or appropriateness. The gold standard is instead to evaluate these aspects through user studies, especially subjective evaluations of video stimuli. Common evaluation paradigms either present individual stimuli to be scored on Likert-type scales, or ask users to compare and rate videos in a pairwise fashion. However, the time and reso…


Cited by 2 publications (5 citation statements)
References 12 publications (29 reference statements)

“…Although there are similarities, the two orderings are meaningfully different. This, together with the results in [25], reinforces a conclusion that the two studies managed to disentangle aspects of perceived motion quality (human-likeness) from the perceived link between gesture and speech (appropriateness). Figure 5, meanwhile, visualises confidence regions for the median rating as boxes whose horizontal and vertical extents are […]”
[Figure caption fused into the excerpt: “Box plots visualising the ratings distribution in the two studies.”]
Section: Analysis and Results of Subjective Evaluation (supporting)
confidence: 80%

“…These speech segments, which were not revealed to participants, were selected across the test inputs to be full and/or coherent phrases. The motion from the corresponding intervals in the BVH files submitted by participating teams was extracted and converted to a motion video clip using the visualisation server provided to participants (see Section 5.1), albeit at a higher resolution of 960×540 this time.…”
[Figure caption fragment fused into the excerpt: “…originates from [25], and was changed for each of the two evaluations in this paper.”]
Section: Stimuli (mentioning)
confidence: 99%

“…Each evaluation page presented the videos of four conditions for the same speech, and a participant rated each video on a scale of 0-100. This evaluation method [17] was inspired by MUSHRA [42], which is the standardized evaluation method for comparing audio qualities. We evaluated two different aspects of gestures as the GENEA Challenge did.…”
Section: Subjective Gesture Quality Evaluation (mentioning)
confidence: 99%
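
The excerpts above describe HEMVIP-style parallel rating: several conditions rated side by side on a 0-100 scale, then summarised per condition with medians and confidence regions. As a rough illustration only (not the authors' analysis code; the condition names and scores below are invented), such ratings could be aggregated into a per-condition median with a bootstrap percentile confidence interval along these lines:

```python
# Hypothetical sketch: aggregating 0-100 parallel ratings (HEMVIP/MUSHRA-style)
# into per-condition medians with bootstrap confidence intervals.
# Condition names and rating values are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)

# ratings[condition] -> list of 0-100 scores, one per participant/page
ratings = {
    "natural_mocap": [78, 85, 90, 72, 88, 81, 69, 93],
    "baseline":      [45, 52, 38, 60, 41, 55, 49, 47],
    "proposed":      [63, 70, 58, 75, 66, 61, 72, 68],
}

def median_ci(scores, n_boot=10_000, alpha=0.05):
    """Bootstrap percentile confidence interval for the median rating."""
    scores = np.asarray(scores, dtype=float)
    boot = rng.choice(scores, size=(n_boot, scores.size), replace=True)
    boot_medians = np.median(boot, axis=1)
    lo, hi = np.quantile(boot_medians, [alpha / 2, 1 - alpha / 2])
    return np.median(scores), lo, hi

for condition, scores in ratings.items():
    med, lo, hi = median_ci(scores)
    print(f"{condition:>14}: median {med:5.1f}  (95% CI {lo:5.1f}-{hi:5.1f})")
```

Percentile bootstrap intervals are one simple way to obtain the kind of confidence region for the median rating mentioned in the first excerpt; the cited studies may well use a different interval construction.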