Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Hu 2018
DOI: 10.18653/v1/n18-1012

Dear Sir or Madam, May I Introduce the GYAFC Dataset: Corpus, Benchmarks and Metrics for Formality Style Transfer

Abstract: Style transfer is the task of automatically transforming a piece of text in one particular style into another. A major barrier to progress in this field has been a lack of training and evaluation datasets, as well as benchmarks and automatic metrics. In this work, we create the largest corpus for a particular stylistic transfer (formality) and show that techniques from the machine translation community can serve as strong baselines for future work. We also discuss challenges of using automatic metrics.

Cited by 296 publications (481 citation statements)
References 28 publications
“…A related task is to modify or paraphrase text data to obfuscate gender as in (Reddy and Knight 2016). Another closely related work is to change the style of the text to different levels of formality as in (Rao and Tetreault 2018).…”
Section: Approaches Using Non-parallel Data (mentioning)
confidence: 99%
“…On the other hand, the Formality dataset was part of the Yahoo Answers corpus L6, which was labeled in (Rao and Tetreault 2018). Table 1 shows that the Gender dataset consists of 2.889M sentences per class.…”
Section: Dataset Description (mentioning)
confidence: 99%