Using Paraphrasing and Memory-Augmented Models to Combat Data Sparsity in Question Interpretation with a Virtual Patient Dialogue System

Jin, Lifeng; King, David L.; Hussein, Amad; White, Michael; Danforth, Douglas R.

doi:10.18653/v1/w18-0502

Cited by 11 publications

(12 citation statements)

References 16 publications

“…Increasing numbers of students now spend class time working one-on-one with adaptive learning platforms (Baker, 2016), and in these contexts, multiple students may have questions at the same time, and teachers may not be able to answer all questions at the same time (Schofield, 1995). This challenge has led to the idea of automated question answering systems in education (Louwerse et al., 2002; Corbett et al., 2005; Milik et al., 2006; Jin et al., 2018), where students can ask questions in natural language. Different than simply a search engine, educational question answering systems attempt to provide answers focused on current content, set at an appropriate level for the student's current stage of learning.…”

Section: Introduction (mentioning)

confidence: 99%

Curio SmartChat : A system for Natural Language Question Answering for Self-Paced K-12 Learning

Raamadhurai, Baker, Poduval

2019

Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications

During learning, students often have questions which they would benefit from responses to in real time. In class, a student can ask a question to a teacher. During homework, or even in class if the student is shy, it can be more difficult to receive a rapid response. In this work, we introduce Curio SmartChat, an automated question answering system for middle school Science topics. Our system has now been used by around 20,000 students who have so far asked over 100,000 questions. We present data on the challenge created by students' grammatical errors and spelling mistakes, and discuss our system's approach and degree of effectiveness at disambiguating questions that the system is initially unsure about. We also discuss the prevalence of student "small talk" not related to science topics, the pluses and minuses of this behavior, and how a system should respond to these conversational acts. We conclude with discussions and point to directions for potential future work.

“…At test time, an utterance $x_{\text{test}}$ is encoded to obtain $r_{\text{test}} = f(x_{\text{test}}; \theta)$. For each class, we perform a 1-nearest-neighbor search on the training set using $r_{\text{test}}$ and set the corresponding elements of the class score $c^{\text{nn}}_{\text{test}} \in \mathbb{R}^{C}$ to be the inverse distance to $r_{\text{test}}$. We also compute the classifier class scores on the unmixed test utterance, $c^{\text{class}}_{\text{test}} = g(r_{\text{test}}; \phi)$.…”

Section: Testing (mentioning)

confidence: 99%
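
The nearest-neighbor scoring described in the citation statement above can be illustrated with a minimal sketch. It assumes Euclidean distance (the excerpt does not name a metric), a small `eps` term to avoid division by zero, and made-up 2-D vectors standing in for the encoder output; the names `euclidean`, `nn_class_scores`, and `eps` are hypothetical and not taken from the cited paper.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance; the metric is an assumption, the excerpt names none."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nn_class_scores(
    r_test: Sequence[float],
    train_reprs: Sequence[Sequence[float]],
    train_labels: Sequence[int],
    num_classes: int,
    eps: float = 1e-8,  # hypothetical guard against division by zero
) -> List[float]:
    """Per-class 1-nearest-neighbor scores: the inverse distance from r_test
    to the closest training representation of each class."""
    nearest = [math.inf] * num_classes
    for r_train, label in zip(train_reprs, train_labels):
        d = euclidean(r_test, r_train)
        if d < nearest[label]:
            nearest[label] = d
    # A class with no training example keeps distance inf and scores 0.0.
    return [1.0 / (d + eps) for d in nearest]


# Toy encoded utterances (made-up vectors standing in for f(x; theta)).
train_reprs = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
train_labels = [0, 0, 1]
r_test = [1.0, 1.0]

c_nn_test = nn_class_scores(r_test, train_reprs, train_labels, num_classes=3)
print(c_nn_test)
```

The classifier scores $c^{\text{class}}_{\text{test}} = g(r_{\text{test}}; \phi)$ would come separately from the trained classifier. The excerpt does not say how the two score vectors are combined, so the sketch stops at the nearest-neighbor scores.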

“…Various methods have been proposed for handling rare classes in this low-resource dataset, including memory and paraphrasing [4], text-to-phonetic data augmentation [5] and an ensemble of rule-based and deep-learning-based models [6]. Recently, self-attention has been shown to work particularly well for rare class classification [7].…”

Section: Introduction (mentioning)

confidence: 99%

Handling Class Imbalance in Low-Resource Dialogue Systems by Combining Few-Shot Classification and Interpolation

Sunder, Fosler‐Lussier

2021

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Utterance classification performance in low-resource dialogue systems is constrained by an inevitably high degree of data imbalance in class labels. We present a new end-to-end pairwise learning framework that is designed specifically to tackle this phenomenon by inducing a few-shot classification capability in the utterance representations and augmenting data through an interpolation of utterance representations. Our approach is a general purpose training methodology, agnostic to the neural architecture used for encoding utterances. We show significant improvements in macro-F1 score over standard cross-entropy training for three different neural architectures, demonstrating improvements on a Virtual Patient dialogue dataset as well as a low-resourced emulation of the Switchboard dialogue act classification dataset.

“…Educational applications tend to target a specific subject, in other words, a specific domain, such as the medical domain in the case of (Jin et al, 2018). Thus, building these applications with underlying NLP algorithms, would typically require a large domain-specific corpus.…”

Section: Introduction (mentioning)

confidence: 99%

Equipping Educational Applications with Domain Knowledge

Sakakini, Gong, Lee, et al.

2019

Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications

One of the challenges of building natural language processing (NLP) applications for education is finding a large domain-specific corpus for the subject of interest (e.g., history or science). To address this challenge, we propose a tool, Dexter, that extracts a subject-specific corpus from a heterogeneous corpus, such as Wikipedia, by relying on a small seed corpus and distributed document representations. We empirically show the impact of the generated corpus on language modeling, estimating word embeddings, and consequently, distractor generation, resulting in a better performance than while using a general domain corpus, a heuristically constructed domain-specific corpus, and a corpus generated by a popular system: BootCaT.
