Task-Oriented Dialogue as Dataflow Synthesis

Andreas, Jacob; Bufe, John; Burkett, David; Chen, Charles; Clausman, Josh; Crawford, Jean; Crim, Kate; DeLoach, Jordan; Dorner, Leah; Eisner, Jason; Fang, Hao; Guo, Alan J. X.; Hall, David R.; Hayes, Kristin; Hill, Kellie; Ho, Diana; Iwaszuk, Wendy; Jha, Smriti; Klein, Dan; Krishnamurthy, Jayant; Lanman, Theo; Liang, Percy; Lin, Christopher H.; Lintsbakh, Ilya; McGovern, Andy; Nisnevich, Aleksandr; Pauls, Adam; Petters, Dmitrij; Read, Brent; Roth, Dan; Roy, Subhro; Rusak, Jesse; Short, Beth; Slomin, Div; Snyder, Ben; Striplin, Stephon; Su, Yu; Tellman, Zachary; Thomson, Sam; Vorobev, A. G.; Witoszko, Izabela; Wolfe, J. P.; Wray, Abby; Zhang, Yuchen; Zotov, Alexander

doi:10.1162/tacl_a_00333

Cited by 63 publications

(75 citation statements)

References 27 publications


“…As mentioned above, we show that our single model-based approach can accurately generate both the appropriate response as well as predict the correct API call at the right time. Earlier work by Andreas et al (2020) and Hosseini-Asl et al (2020) employs a similar modeling approach to predict dialog state in task-based dialogs, which can be seen as a precursor to our API call prediction strategy. The TicketTalk movie ticketing dataset was created using the self-dialog collection method (Krause et al, 2017; Moghe et al, 2018; Byrne et al, 2019) in which a paid crowd-sourced worker writes both sides of the dialog (i.e.…”

Section: Modular Vs End-to-end Architectures (mentioning)

confidence: 99%

TicketTalk: Toward human-level performance with end-to-end, transaction-based dialog systems

Byrne, Krishnamoorthi, Ganesh, et al. 2021

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Confer


We present a data-driven, end-to-end approach to transaction-based dialog systems that performs at near-human levels in terms of verbal response quality and factual grounding accuracy. We show that two essential components of the system produce these results: a sufficiently large and diverse, in-domain labeled dataset, and a neural network-based, pretrained model that generates both verbal responses and API call predictions. In terms of data, we introduce TicketTalk, a movie ticketing dialog dataset with 23,789 annotated conversations. The movie ticketing conversations range from completely open-ended and unrestricted to more structured, both in terms of their knowledge base, discourse features, and number of turns. In qualitative human evaluations, model-generated responses trained on just 10,000 TicketTalk dialogs were rated to "make sense" 86.5% of the time, almost the same as human responses in the same contexts. Our simple, API-focused annotation schema results in a much easier labeling task making it faster and more cost effective. It is also the key component for being able to predict API calls accurately. We handle factual grounding by incorporating API calls in the training data, allowing our model to learn which actions to take and when. Trained on the same 10,000-dialog set, the model's API call predictions were rated to be correct 93.9% of the time in our evaluations, surpassing the ratings for the corresponding human labels. We show how API prediction and response generation scores improve as the dataset size incrementally increases from 5000 to 21,000 dialogs. Our analysis also clearly illustrates the benefits of pre-training. To facilitate future work on transaction-based dialog systems, we have published the TicketTalk dataset at https://git.io/JL8an.


“…Our method might be extended to task-oriented dialog in a number of domains, for example, SMCalFlow (Andreas et al, 2020), where data sparsity often poses a problem.…”

Section: Comparison To Lexical Substitution (mentioning)

confidence: 99%

Iterative Paraphrastic Augmentation with Discriminative Span Alignment

Culkin, Stengel-Eskin, et al. 2021

Transactions of the Association for Computational Linguistics


We introduce a novel paraphrastic augmentation strategy based on sentence-level lexically constrained paraphrasing and discriminative span alignment. Our approach allows for the large-scale expansion of existing datasets or the rapid creation of new datasets using a small, manually produced seed corpus. We demonstrate our approach with experiments on the Berkeley FrameNet Project, a large-scale language understanding effort spanning more than two decades of human labor. With four days of training data collection for a span alignment model and one day of parallel compute, we automatically generate and release to the community 495,300 unique (Frame,Trigger) pairs in diverse sentential contexts, a roughly 50-fold expansion atop FrameNet v1.7. The resulting dataset is intrinsically and extrinsically evaluated in detail, showing positive results on a downstream task.


“…Furthermore, we assume that program prediction is local in that it does not require program fragments to be copied from the dialogue history (but may still depend on history in other ways). Several formalisms, including the typed references of Zettlemoyer and Collins (2009) and the meta-computation operators of Semantic Machines et al (2020), make it possible to produce local program annotations even for dialogues like the one depicted in Figure 2, which reuse past computations. We transformed the datasets in our experiments to use such metacomputation operators (see Appendix C).…”

Section: Preliminaries (mentioning)

confidence: 99%

“…To compare with prior work for SMCALFLOW (Semantic Machines et al, 2020) and TREEDST (Cheng et al, 2020), we replicated their setups. For SMCALFLOW, we predict plans always conditioning on the gold dialogue history for each utterance, but we consider any predicted plan wrong if the refer_are_correct flag is set to false.…”

Section: Evaluation Details (mentioning)

confidence: 99%

Value-Agnostic Conversational Semantic Parsing

Platanios, Pauls, Roy, et al. 2021

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Confer

Self Cite


Conversational semantic parsers map user utterances to executable programs given dialogue histories composed of previous utterances, programs, and system responses. Existing parsers typically condition on rich representations of history that include the complete set of values and computations previously discussed. We propose a model that abstracts over values to focus prediction on type- and function-level context. This approach provides a compact encoding of dialogue histories and predicted programs, improving generalization and computational efficiency. Our model incorporates several other components, including an atomic span copy operation and structural enforcement of well-formedness constraints on predicted programs, that are particularly advantageous in the low-data regime. Trained on the SMCALFLOW and TREEDST datasets, our model outperforms prior work by 7.3% and 10.6% respectively in terms of absolute accuracy. Trained on only a thousand examples from each dataset, it outperforms strong baselines by 12.4% and 6.4%. These results indicate that simple representations are key to effective generalization in conversational semantic parsing.

[Figure text; program listings not recovered: an example program representing the expression 1 + 2 + 3 + 4 + 5. While generating an invocation, the decoder conditions only on a program prefix in which argument values are masked out.]

