Popular dialog data sets such as MultiWOZ (Budzianowski et al., 2018) are created by providing crowd workers an instruction, expressed in natural language, that describes the task to be accomplished. Crowd workers play the roles of a user and an agent to generate dialogs that accomplish tasks such as booking restaurant tables, calling a taxi, etc. In this paper, we present a data creation strategy that uses the pre-trained language model GPT2 (Radford et al., 2018) to simulate the interaction between crowd workers by creating a user bot and an agent bot. We train the simulators using a small percentage of the actual crowd-generated conversations and their corresponding instructions. We demonstrate that by using the simulated data, we achieve significant improvements in low-resource settings on two publicly available datasets: the MultiWOZ dataset (Budzianowski et al., 2018) and the Persona chat dataset (Zhang et al., 2018a).
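As a rough illustration of how such simulators could be trained, the sketch below shows one plausible way to serialize a goal instruction and a dialog history into a single training string for GPT2. The tags, field names, and example dialog are assumptions made for illustration, not the paper's actual format.

```python
# Hypothetical sketch of how training examples for the user/agent simulators
# might be serialized before fine-tuning GPT-2; the <goal>/<user>/<agent> tags
# and the sample dialog are illustrative assumptions, not the paper's format.

def build_training_example(goal_instruction, history, next_turn, role):
    """Concatenate the goal and dialog history, targeting the next turn for `role`."""
    context = "<goal> " + goal_instruction + " "
    for speaker, utterance in history:
        context += f"<{speaker}> {utterance} "
    # The simulator for `role` is trained to continue this context with its turn.
    return context + f"<{role}> {next_turn} <eos>"

example = build_training_example(
    goal_instruction="Book a table for 2 at an Italian restaurant in the centre.",
    history=[("user", "I need a restaurant in the centre serving Italian food.")],
    next_turn="Sure, how many people should I book the table for?",
    role="agent",
)
print(example)
```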
Knowledge Graph (KG) embeddings provide a low-dimensional representation of the entities and relations of a Knowledge Graph and are used successfully for various applications such as question answering and search, reasoning, inference, and missing link prediction. However, most existing KG embeddings only consider the network structure of the graph and ignore the semantics and characteristics of the underlying ontology, which provides crucial information about relationships between entities in the KG. Recent efforts in this direction involve learning embeddings for the Description Logic (the logical underpinning of ontologies) EL++. However, such methods treat all the relations defined in the ontology as one-to-one, which severely limits their performance and applications. We provide a simple and effective solution to overcome this shortcoming that allows such methods to consider many-to-many relationships while learning embedding representations. Experiments conducted using three different EL++ ontologies show substantial performance improvement over five baselines. Our proposed solution also paves the way for learning embedding representations for even more expressive description logics such as SROIQ.
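To see why a strictly one-to-one treatment of relations is limiting, consider the standard argument for translational embeddings; the block below is a generic illustration using a TransE-style score as a stand-in, not the exact EL++ geometric formulation used by these methods.

```latex
% Generic illustration (TransE-style stand-in, not the EL++ formulation):
% if a relation r is modeled as a single translation vector, then two facts
% (h, r, t_1) and (h, r, t_2) are fit by requiring
\[
  \mathbf{h} + \mathbf{r} \approx \mathbf{t}_1
  \quad\text{and}\quad
  \mathbf{h} + \mathbf{r} \approx \mathbf{t}_2
  \;\;\Longrightarrow\;\;
  \mathbf{t}_1 \approx \mathbf{t}_2 ,
\]
% so all objects linked to the same subject through r collapse to nearly the
% same point, which is the failure mode that a many-to-many-aware
% formulation is meant to avoid.
```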
Bundling is a technique that e-commerce companies have adopted from traditional retail stores to increase the average order size. It has been observed that bargaining increases customer satisfaction while increasing the average order revenue for retailers. We propose a mathematical framework for incorporating bargaining capabilities into the product bundles offered by e-commerce websites. Our method creates a virtual agent that uses the modular Bidding-Opponent-Acceptance model for its bargaining strategy and the Thomas-Kilmann conflict mode instrument to model buyer behavior. We integrate bargaining over bundles into an e-commerce system through a negotiation agent that applies business logic to improve its strategy. The agent relies on real-time data generated during a negotiation session, since buyer behavior during the negotiation is crucial. It requires no data from the buyer's past negotiation sessions, which removes bias and allows it to respond to rapid changes in buyer behavior. The agent's behavior can be adjusted through various hyperparameters. Our model provides utility metrics to measure buyer and agent satisfaction. Our results show that the agent successfully negotiates with humans from diverse backgrounds.
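A minimal sketch of how a BOA-style (Bidding strategy, Opponent model, Acceptance condition) agent might run a price-only bargain over a bundle is given below. The utility function, concession curve, parameter values, and the hook for a Thomas-Kilmann-style opponent model are all illustrative assumptions, not the strategy described in the paper.

```python
# Minimal sketch of a BOA-style (Bidding / Opponent model / Acceptance) loop for a
# price-only bundle negotiation. The utility function, concession curve, and all
# parameter values are illustrative assumptions, not the paper's actual strategy.

def agent_utility(price, list_price, cost):
    """Agent utility: normalized margin over cost (assumed metric)."""
    return (price - cost) / (list_price - cost)

def next_bid(t, list_price, cost, beta=0.4):
    """Bidding strategy: time-dependent concession from list price toward cost.
    beta < 1 gives a Boulware-style curve that concedes slowly at first."""
    return list_price - (t ** (1.0 / beta)) * (list_price - cost)

def accept(buyer_offer, counter_bid):
    """Acceptance condition: accept if the buyer already meets our next bid."""
    return buyer_offer >= counter_bid

def negotiate(buyer_offers, list_price=100.0, cost=60.0, max_rounds=10):
    for round_no, buyer_offer in enumerate(buyer_offers[:max_rounds]):
        t = round_no / max_rounds                    # normalized negotiation time
        counter = next_bid(t, list_price, cost)
        if accept(buyer_offer, counter):
            return "deal", buyer_offer, agent_utility(buyer_offer, list_price, cost)
        # A fuller opponent model (e.g., inferring a Thomas-Kilmann profile from
        # the buyer's concession pattern) could adapt beta or the reserve here.
    return "no deal", None, 0.0

print(negotiate([70.0, 75.0, 82.0, 90.0]))
```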
Popular task-oriented dialog data sets such as MultiWOZ (Budzianowski et al. 2018) are created by providing crowdsourced workers a goal instruction, expressed in natural language, that describes the task to be accomplished. Crowdsourced workers play the roles of a user and an agent to generate dialogs that accomplish tasks such as booking restaurant tables, making train reservations, calling a taxi, etc. However, creating large crowdsourced datasets can be time-consuming and expensive. To reduce the cost associated with generating such dialog datasets, recent work has explored methods to automatically create larger datasets from small samples. In this paper, we present a data creation strategy that uses the pre-trained language model GPT2 (Radford et al. 2018) to simulate the interaction between crowdsourced workers by creating a user bot and an agent bot. We train the simulators using a small percentage of the actual crowd-generated conversations and their corresponding goal instructions. We demonstrate that by using the simulated data, we achieve significant improvements both in the low-resource setting and in overall task performance. To the best of our knowledge, we are the first to present a model for generating entire conversations by simulating the crowdsourced data collection process.
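The sketch below outlines how the two simulators could be run against each other to generate a full conversation from a goal instruction, assuming Hugging Face Transformers and two hypothetical fine-tuned GPT2 checkpoints. The prompt tags and decoding settings are likewise assumptions, not the paper's configuration.

```python
# Hypothetical sketch of the self-play loop: a user bot and an agent bot, each a
# fine-tuned GPT-2, alternately extend a dialog conditioned on the goal
# instruction. Checkpoint paths, prompt tags, and decoding settings are
# assumptions for illustration, not the paper's actual configuration.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
user_bot = GPT2LMHeadModel.from_pretrained("path/to/user-simulator")    # hypothetical checkpoint
agent_bot = GPT2LMHeadModel.from_pretrained("path/to/agent-simulator")  # hypothetical checkpoint

def next_turn(model, context, max_new_tokens=40):
    inputs = tokenizer(context, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Keep only the newly generated tokens (the simulated turn).
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()

def simulate_dialog(goal_instruction, num_turns=6):
    context = "<goal> " + goal_instruction
    dialog = []
    for i in range(num_turns):
        bot, tag = (user_bot, "<user>") if i % 2 == 0 else (agent_bot, "<agent>")
        context += " " + tag
        turn = next_turn(bot, context)
        context += " " + turn
        dialog.append((tag, turn))
    return dialog

for speaker, turn in simulate_dialog("Book a table for two at a cheap Italian restaurant."):
    print(speaker, turn)
```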