Proceedings of the 2nd Workshop on Natural Language Processing for Conversational AI 2020
DOI: 10.18653/v1/2020.nlp4convai-1.7

DLGNet: A Transformer-based Model for Dialogue Response Generation

Abstract: Neural dialogue models, despite their successes, still suffer from lack of relevance, diversity, and in many cases coherence in their generated responses. On the other hand, transformer-based models such as GPT-2 have demonstrated an excellent ability to capture long-range structures in language modeling tasks. In this paper, we present DLGNet, a transformer-based model for dialogue modeling. We specifically examine the use of DLGNet for multi-turn dialogue response generation. In our experiments, we evaluate…
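As a rough illustration of the multi-turn setup described in the abstract, the sketch below conditions an autoregressive transformer LM on a concatenated dialogue history and samples a response. It uses the public gpt2 checkpoint as a stand-in and treats the EOS token as a turn separator; both are assumptions made for illustration, not DLGNet's actual weights or preprocessing.

```python
# Minimal sketch: multi-turn dialogue response generation with an
# autoregressive transformer LM. The "gpt2" checkpoint is a stand-in and
# the EOS-as-turn-separator convention is an assumption, not DLGNet's
# actual preprocessing.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def generate_response(dialogue_turns, max_new_tokens=40):
    # Concatenate the dialogue history into one conditioning sequence,
    # separating turns with the EOS token.
    history = tokenizer.eos_token.join(dialogue_turns) + tokenizer.eos_token
    input_ids = tokenizer.encode(history, return_tensors="pt")
    output_ids = model.generate(
        input_ids,
        max_new_tokens=max_new_tokens,
        do_sample=True,          # sampling tends to improve diversity
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Keep only the newly generated tokens (the candidate response).
    response_ids = output_ids[0, input_ids.shape[-1]:]
    return tokenizer.decode(response_ids, skip_special_tokens=True)

print(generate_response(["Hi, how are you?", "I'm good, thanks. You?"]))
```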

Cited by 24 publications (11 citation statements)
References 13 publications
“…Nevertheless, generating such responses in dialogue is still an open problem using current state-of-the-art language models, such as LLMs. Though LLMs generate realistic text outputs, allowing free-form natural language responses in dialogue is unpredictable: models often forget context and can produce incorrect or, in the worst case, offensive responses [97,19,65,98]. In settings where unpredictable and potentially harmful responses are not acceptable, it may not be appropriate to use responses dynamically generated by LLMs.…”
Section: Responding Appropriately In Dialogue
confidence: 99%
“…Possible reasons for getting more negative or less positive outputs from the models could lie in two aspects: the transformer-based model and the size of the fine-tuning dataset. The OpenAI GPT-2 is a transformer-based model [102]. The transformer is a model architecture that directly models the dependency between every pair of words in a sequence, allowing the model to learn language representations and generate natural-language outputs [46].…”
Section: Discussion
confidence: 99%
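The pairwise-dependency property mentioned above is what scaled dot-product self-attention computes: every position scores every other position and mixes their values accordingly. The following is a minimal NumPy sketch for illustration only (single head, no masking, no residual connections or layer norm), not the cited model's implementation.

```python
# Minimal sketch of scaled dot-product self-attention: every token attends
# to every other token, so pairwise dependencies are modeled directly.
# Illustrative only; real transformer layers add multiple heads, masking,
# per-layer learned projections, residual connections, etc.
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    # X: (seq_len, d_model) token representations
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)           # (seq_len, seq_len) pairwise scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the sequence
    return weights @ V                         # each output mixes all positions

rng = np.random.default_rng(0)
seq_len, d_model = 5, 8
X = rng.normal(size=(seq_len, d_model))
Wq = rng.normal(size=(d_model, d_model))
Wk = rng.normal(size=(d_model, d_model))
Wv = rng.normal(size=(d_model, d_model))
print(self_attention(X, Wq, Wk, Wv).shape)  # (5, 8)
```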
“…However, using length-normalized log-likelihoods (Brown et al., 2020) has become standard due to its superior performance, and is commonly used for generation tasks (Mao et al., 2019; Oluwatobi and Mueller, 2020). For causal language models, e.g., GPT-2 and GPT-3, Equation 1 can be decomposed as:…”
Section: Standard Methods
confidence: 99%
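The quoted passage cuts off before the equation, so it is left as-is. The sketch below shows the standard length-normalized log-likelihood rule for a causal LM, score(y | x) = (1/|y|) Σ_t log p(y_t | x, y_&lt;t), which is the technique the statement describes; the gpt2 checkpoint and the function name are illustrative assumptions, not necessarily the cited paper's Equation 1.

```python
# Sketch of length-normalized log-likelihood scoring with a causal LM.
# Implements score(y|x) = (1/|y|) * sum_t log p(y_t | x, y_<t) as an
# illustration of the technique, not necessarily the cited paper's Equation 1.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def length_normalized_loglik(prompt, completion):
    prompt_ids = tokenizer.encode(prompt, return_tensors="pt")
    completion_ids = tokenizer.encode(completion, return_tensors="pt")
    input_ids = torch.cat([prompt_ids, completion_ids], dim=-1)
    with torch.no_grad():
        logits = model(input_ids).logits            # (1, seq_len, vocab)
    # Log-probability of each token given everything before it.
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = input_ids[:, 1:]
    token_ll = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    # Keep only the completion positions, then average over completion length.
    completion_ll = token_ll[:, prompt_ids.shape[-1] - 1:]
    return completion_ll.mean().item()

print(length_normalized_loglik("The capital of France is", " Paris"))
```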