Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems 2021
DOI: 10.1145/3411764.3445400

Collecting and Characterizing Natural Language Utterances for Specifying Data Visualizations

Abstract: Natural language interfaces (NLIs) for data visualization are becoming increasingly popular both in academic research and in commercial software. Yet, there is a lack of empirical understanding of how people specify visualizations through natural language. We conducted an online study (N = 102), showing participants a series of visualizations and asking them to provide utterances they would pose to generate the displayed charts. From the responses, we curated a dataset of 893 utterances and characterized the ut…

Citations: cited by 40 publications (26 citation statements)
References: 30 publications
“…Given that the application domain for these chatbot interactions uses a set of known analytical intents along with attributes and values from the underlying data, the space of linguistic variations is relatively small and the outputs can be specified using templates [55]. We define the templates by referring to utterances from Study 1, along with utterances commonly supported across existing NLIs [40,52,62,64,76] and sample utterances collected through studies investigating the use of NL to create or interact with data visualizations [68,72]. The grammar rules from the parser modules are used to aid in the NLG process, which involves ordering constituents of the NLG output and The Slack chatbot uses the Slack API [11] for listening to Slack events.…”
Section: Viz Module: Generates Images Of Data Visualization Results; citation type: mentioning
confidence: 99%
“…For SQL, we use the Spider dataset (Yu et al, 2018). For Vega-Lite, we use the NLV Corpus (Srinivasan et al, 2021). For SMCalFlow, we use the dataset that introduced the language (Andreas et al, 2020).…”
Section: Methods; citation type: mentioning
confidence: 99%
“…Having a deterministic set of generated NL output also allowed us to control the variability in the recommended NL utterances for testing purposes. We defined the templates by referring to utterances commonly supported across existing NLIs [35,46,54,57,79] through studies investigating the use of NL to create or interact with data visualizations [62,67]. Note, however, that the current template-based approach can be extended to a task-oriented dialogue approach by using the set of templates along with a language model for generating a larger variety of sentences with linguistic variability.…”
Section: Parameterization; citation type: mentioning
confidence: 99%