Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.635

KdConv: A Chinese Multi-domain Dialogue Dataset Towards Multi-turn Knowledge-driven Conversation

Abstract: The research of knowledge-driven conversational systems is largely limited due to the lack of dialog data which consists of multi-turn conversations on multiple topics and with knowledge annotations. In this paper, we propose a Chinese multi-domain knowledge-driven conversation dataset, KdConv, which grounds the topics in multi-turn conversations to knowledge graphs. Our corpus contains 4.5K conversations from three domains (film, music, and travel), and 86K utterances with an average turn number of 19.0. Thes…

Citations: cited by 81 publications (57 citation statements)
References: 33 publications
“…Wu et al (2019) provides a Chinese dialog dataset-DuConv, where one participant can proactively lead the conversation with an explicit goal. KdConv (Zhou et al, 2020) is a Chinese dialog dataset, where each dialog contains in-depth discussions on multiple topics. In comparison with them, our dataset contains multiple dialog types, clear goals to achieve during each conversation, and user profiles for personalized conversation.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…We used two publicly available multi-turn dialogue datasets, one is DailyDialog [25], an English dialogue dataset between people in daily life, and the other is KdConv [26], Chinese multi-domain knowledge-driven dialogue datasets. DailyDialog contains 11,318 hand-written dialogues which cover a variety of topics in our daily life.…”
Section: Datasets (citation type: mentioning)
confidence: 99%
“…Recently, a variety of neural models have been proposed to facilitate knowledge-grounded conversation generation (Zhu et al, 2017; Young et al, 2018; Zhou et al, 2018a; Liu et al, 2018). The research topic is also greatly advanced by many corpora (Zhou et al, 2018b; Moghe et al, 2018; Dinan et al, 2019; Gopalakrishnan et al, 2019; Moon et al, 2019; Tuan et al, 2019; Zhou et al, 2020). As surveyed in , existing studies have been mainly devoted to addressing two research problems: (1) knowledge selection: selecting appropriate knowledge given the dialog context and previously selected knowledge Meng et al, 2020; Kim et al, 2020); and…”
Section: Knowledge-grounded Dialog Generation (citation type: mentioning)
confidence: 99%
“…In addition, recently there emerge a number of works that propose RL-based models to select a path in structured knowledge graph (KG) (Xu et al, 2020a,b), which also select knowledge in a sequential way. While our method is designed to ground the conversation to unstructured knowledge text, we will leave as future work the application of our method to such KG-grounded dialog generation tasks Moon et al, 2019; Zhou et al, 2020).…”
Section: Sequential Knowledge Selection (citation type: mentioning)
confidence: 99%