Companion Proceedings of the 2019 World Wide Web Conference
DOI: 10.1145/3308560.3317597

Empirical Analysis of Bias in Voice-based Personal Assistants

Cited by 36 publications (25 citation statements)
References 5 publications
“…Voice variation also plays a role: ASR error distribution differs by speaker background variables such as accent (Zheng et al., 2005), in turn affecting the downstream systems (Harwell, 2018; Lima et al., 2019; Palanica et al., 2019). To emulate speaker variation in the synthetic setting, we use Google English Text-to-Speech to pronounce the XQuAD questions in eight different voices, varying the provided accent and gender settings.…”
Section: Results and Analysis (mentioning)
confidence: 99%
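The excerpt above describes crossing TTS accent and gender settings to obtain eight synthetic voices. A minimal sketch of that setup using the Google Cloud Text-to-Speech Python client follows; the specific accent list and the synthesize_variants helper are illustrative assumptions, not details taken from the citing paper.

```python
# Hypothetical sketch: synthesize one question in eight voices by crossing
# four English accents with two gender settings (the accent set is assumed).
from google.cloud import texttospeech

client = texttospeech.TextToSpeechClient()

ACCENTS = ["en-US", "en-GB", "en-AU", "en-IN"]  # assumed, not from the paper
GENDERS = [
    texttospeech.SsmlVoiceGender.FEMALE,
    texttospeech.SsmlVoiceGender.MALE,
]

def synthesize_variants(question: str) -> dict[str, bytes]:
    """Return LINEAR16 (WAV) audio of the question for each accent/gender pair."""
    audio = {}
    for accent in ACCENTS:
        for gender in GENDERS:
            response = client.synthesize_speech(
                input=texttospeech.SynthesisInput(text=question),
                voice=texttospeech.VoiceSelectionParams(
                    language_code=accent, ssml_gender=gender
                ),
                audio_config=texttospeech.AudioConfig(
                    audio_encoding=texttospeech.AudioEncoding.LINEAR16
                ),
            )
            audio[f"{accent}-{gender.name}"] = response.audio_content
    return audio
```

Each of the eight synthesized recordings can then be fed through an ASR system to measure how recognition errors vary with accent and gender, which is the downstream effect the excerpt points to.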
“…By using video instead of actual interaction with the device, we maintain the highest degree of control over the similarity of interaction between participants and devices, thereby increasing the comparability between prototypes. Using natural language to interact with PAs often leads to voice recognition errors that would not be consistent among participants, leading to variability in user experience and, therefore, in evaluation [24]. We developed six videos for each FiPA version, each showcasing the same three successful and three failed interactions.…”
Section: Methods (mentioning)
confidence: 99%