Comparative Error Analysis of Dialog State Tracking

Smith, Ronnie W.

doi:10.3115/v1/w14-4341

Cited by 12 publications

(9 citation statements)

References 7 publications

“…To our knowledge only one example of a comparative error analysis of dialogue state trackers was given by Smith [12]. In that analysis the author reports the error distributions of tracking models over three error types of possible deviations from the true joint goal for every turn in the dialogue.…”

Section: Error Analysis (mentioning)

F-Measure Optimisation and Label Regularisation for Energy-Based Neural Dialogue State Tracking Models

Trinh, Ross, Kelleher, 2020

Artificial Neural Networks and Machine Learning – ICANN 2020

In recent years many multi-label classification methods have exploited label dependencies to improve performance of classification tasks in various domains, hence casting the tasks to structured prediction problems. We argue that multi-label predictions do not always satisfy domain constraint restrictions. For example when the dialogue state tracking task in task-oriented dialogue domains is solved with multi-label classification approaches, slot-value constraint rules should be enforced following real conversation scenarios. To address these issues we propose an energy-based neural model to solve the dialogue state tracking task as a structured prediction problem. Furthermore we propose two improvements over previous methods with respect to dialogue slot-value constraint rules: (i) redefining the estimation conditions for the energy network; (ii) regularising label predictions following the dialogue slot-value constraint rules. In our results we find that our extended energy-based neural dialogue state tracker yields better overall performance in terms of prediction accuracy, and also behaves more naturally with respect to the conversational rules.

“…As pointed out by Smith (2014), it is important to examine the types of errors made by a tracker in order to make improvements. To do this, at each turn, we compare the top dialog state output by each tracker with the true dialog state, and examine each slot.…”

Section: What Types Of Errors Do the Trackers Make? (mentioning)
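
The per-turn, per-slot comparison described in the excerpt above can be sketched roughly as follows. This is only an illustration: the function names, the dictionary representation of a dialog state, and the three category labels (`missing`, `extra`, `wrong_value`) are assumptions made for the example and are not taken from the cited papers.

```python
from collections import Counter

def slot_errors(predicted, truth):
    """Classify per-slot disagreements between a tracker's top dialog state
    and the true dialog state for a single turn.

    Both arguments are dicts mapping slot name -> value (hypothetical format).
    The category names are illustrative assumptions.
    """
    errors = Counter()
    for slot in set(predicted) | set(truth):
        p, t = predicted.get(slot), truth.get(slot)
        if p == t:
            continue  # slot is tracked correctly
        if p is None:
            errors["missing"] += 1      # true state has a value, tracker has none
        elif t is None:
            errors["extra"] += 1        # tracker has a value, true state has none
        else:
            errors["wrong_value"] += 1  # both have a value, but they differ
    return errors

def error_distribution(turns):
    """Sum per-slot error counts over an iterable of (predicted, truth) turns."""
    total = Counter()
    for predicted, truth in turns:
        total.update(slot_errors(predicted, truth))
    return total

# Toy data, invented for the example.
turns = [
    ({"food": "asian oriental"}, {"food": "asian oriental", "area": "north"}),
    ({"food": "thai", "pricerange": "cheap"}, {"food": "asian oriental"}),
]
dist = error_distribution(turns)
print(dict(dist))
```

Running the same function over the output of several trackers on the same turns would give error distributions that can be compared side by side.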

“…Description based on survey collected from participants.
… (Kim and Banchs, 2014): Linear CRF
team3 entry0 (Smith, 2014): Discourse rules + dialog act bigrams
team4 entry2 (Henderson et al, 2014d): Recurrent neural network
team6 entry2…”

Section: Entry (mentioning)

“…This has had unforeseen benefits: first, the DSTC data now forms a sort of benchmark for the field, with groups continuing to report results on it after the challenge proper (Lee, 2013;Ma and Fosler-Lussier, 2014b;Zilka and Jurčíček, 2015;Fix and Frezza-Buet, 2015). In addition, the DSTC1-3 corpora have been used to examine which state tracking evaluation metrics correlate with dialog success (Lee, 2014), perform detailed error analyses of state trackers (Smith, 2014), and for dialog act classification and SLU experimentation (Ma and Fosler-Lussier, 2014a;Ferreira et al, 2015). We encourage future challenges to continue this tradition.…”

Section: Features (mentioning)

The Dialog State Tracking Challenge Series: A Review

Williams, Raux, Henderson, 2016

Dialogue & Discourse

In a spoken dialog system, dialog state tracking refers to the task of correctly inferring the state of the conversation -- such as the user's goal -- given all of the dialog history up to that turn. Dialog state tracking is crucial to the success of a dialog system, yet until recently there were no common resources, hampering progress. The Dialog State Tracking Challenge series of 3 tasks introduced the first shared testbed and evaluation metrics for dialog state tracking, and has underpinned three key advances in dialog state tracking: the move from generative to discriminative models; the adoption of discriminative sequential techniques; and the incorporation of the speech recognition results directly into the dialog state tracker. This paper reviews this research area, covering both the challenge tasks themselves and summarizing the work they have enabled.

“…One such example can be "asian food" which appears 16 times in the training data as a part of the best ASR hypothesis while 13 times it really informs about "asian oriental" ontology value. Measurements on dstc2 dev have shown

Table 1: Joint slot tracking accuracy and L2 (denotes the squared L2 norm between the estimated belief distribution and correct distribution) for various systems reported in the literature.

System                                          Accuracy   L2
Williams (2014)                                 .739       .721
Henderson et al (2014b)                         .737       .406
Knowledge-based tracker (Kadlec et al, 2014)    .737       .429
√ et al (2014)                                  .735       .433
Smith (2014)                                    .729       .452
Lee et al (2014)                                .726       .427
YARBUS (Fix and Frezza-buet, 2015)              .…”

Section: Lessons Learned (mentioning)
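
The two metrics named in the table caption above can be illustrated with a minimal sketch. It assumes that the tracker's belief for one turn is a dict mapping joint-goal labels to probabilities and that the correct distribution puts all its mass on the true label (one-hot); the function names and the toy data are hypothetical and do not come from the cited paper or from any DSTC scoring tool.

```python
def squared_l2(belief, correct):
    """Squared L2 norm between an estimated belief distribution and the
    correct distribution, assumed here to be one-hot on `correct`."""
    labels = set(belief) | {correct}
    return sum(
        (belief.get(label, 0.0) - (1.0 if label == correct else 0.0)) ** 2
        for label in labels
    )

def joint_accuracy(beliefs, correct_labels):
    """Fraction of turns whose highest-probability joint goal is the true one."""
    hits = sum(
        1
        for belief, correct in zip(beliefs, correct_labels)
        if max(belief, key=belief.get) == correct
    )
    return hits / len(correct_labels)

# Toy beliefs over hypothetical joint-goal labels "a" and "b".
beliefs = [{"a": 0.7, "b": 0.3}, {"a": 0.4, "b": 0.6}]
correct_labels = ["a", "a"]

acc = joint_accuracy(beliefs, correct_labels)                    # top hypothesis is right on 1 of 2 turns
l2_first_turn = squared_l2(beliefs[0], correct_labels[0])        # (0.7 - 1)^2 + (0.3 - 0)^2
print(acc, l2_first_turn)
```

Accuracy rewards only the top hypothesis, while the L2 term also penalises probability mass placed on wrong hypotheses, so higher is better for the first column and lower is better for the second.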

Hybrid Dialog State Tracker with ASR Features

Vodolán, Kadlec, Kleindienst, 2017

Proceedings of the 15th Conference of the European Chapter of The Association for Computational Linguistics: Volume 2

This paper presents a hybrid dialog state tracker enhanced by trainable Spoken Language Understanding (SLU) for slot-filling dialog systems. Our architecture is inspired by previously proposed neural-network-based belief-tracking systems. In addition we extended some parts of our modular architecture with differentiable rules to allow end-to-end training. We hypothesize that these rules allow our tracker to generalize better than pure machine-learning-based systems. For evaluation we used the Dialog State Tracking Challenge (DSTC) 2 dataset, a popular belief tracking testbed with dialogs from a restaurant information system. To our knowledge, our hybrid tracker sets a new state-of-the-art result in three out of four categories within the DSTC2.
