Dear Sir or Madam, May I introduce the GYAFC Dataset: Corpus, Benchmarks and Metrics for Formality Style Transfer

Rao, Sudha; Tetreault, Joel

doi:10.48550/arxiv.1803.06535

Cited by 23 publications

(37 citation statements)

References 19 publications


“…We first overview text style transfer, which aims to transfer a text (typically a single sentence or a short paragraph -for simplicity we refer to simply "sentences" below) from one domain to another while preserving underlying content. For example, formality transfer (Rao & Tetreault, 2018) is the task of transforming the tone of text from informal to formal without changing its content. Other examples include sentiment transfer (Shen et al, 2017), word decipherment (Knight et al, 2006), and author imitation (Xu et al, 2012).…”

Section: Unsupervised Text Style Transfer (mentioning)

confidence: 99%

“…Next, we consider a harder task of modifying the formality of a sequence. We use the GYAFC dataset (Rao & Tetreault, 2018), which contains formal and informal sentences from two different domains. In this paper, we use the Entertainment and Music domain, which has about 52K training sentences, 5K development sentences, and 2.5K test sentences.…”

Section: Datasets and Experiments Setup (mentioning)

confidence: 99%

“…style rather than content. 2 We focus on a standard suite of style transfer tasks, including formality transfer (Rao & Tetreault, 2018), author imitation (Xu et al, 2012), word decipherment (Shen et al, 2017), sentiment transfer (Shen et al, 2017), and related language translation (Pourdamghani & Knight, 2017). General unsupervised translation has not typically been considered style transfer, but for the purpose of comparison we also conduct evaluation on this task (Lample et al, 2017).…”

Section: Introduction (mentioning)

confidence: 99%


A Probabilistic Formulation of Unsupervised Text Style Transfer

He, Wang, Neubig et al. 2020

Preprint


We present a deep generative model for unsupervised text style transfer that unifies previously proposed non-generative techniques. Our probabilistic approach models non-parallel data from two domains as a partially observed parallel corpus. By hypothesizing a parallel latent sequence that generates each observed sequence, our model learns to transform sequences from one domain to another in a completely unsupervised fashion. In contrast with traditional generative sequence models (e.g. the HMM), our model makes few assumptions about the data it generates: it uses a recurrent language model as a prior and an encoder-decoder as a transduction distribution. While computation of marginal data likelihood is intractable in this model class, we show that amortized variational inference admits a practical surrogate. Further, by drawing connections between our variational objective and other recent unsupervised style transfer and machine translation techniques, we show how our probabilistic view can unify some known non-generative objectives such as backtranslation and adversarial loss. Finally, we demonstrate the effectiveness of our method on a wide range of unsupervised style transfer tasks, including sentiment transfer, formality transfer, word decipherment, author imitation, and related language translation. Across all style transfer tasks, our approach yields substantial gains over state-of-the-art non-generative baselines, including the state-of-the-art unsupervised machine translation techniques that our approach generalizes. Further, we conduct experiments on a standard unsupervised machine translation task and find that our unified approach matches the current state-of-the-art.

Footnote 1: Code and data are available at https://github.com/cindyxinyiwang/deep-latent-sequence-model.

Footnote 2: Notably, some tasks we evaluate on do change content to some degree, such as sentiment transfer, but for conciseness we use the term "style transfer" nonetheless.


“…(i) Writing assistance and dialogue (Heidorn, 2000; Ritter et al, 2011). For example, it is helpful to have programs that transfer a formal sentence to an informal sentence (Rao and Tetreault, 2018). It is helpful to have programs that make emails more polite (Sennrich et al, 2016).…”

Section: Problem 1: Style Transfer Tasks (mentioning)

confidence: 99%

The Daunting Task of Real-World Textual Style Transfer Auto-Evaluation

Pang 2019

Preprint


The difficulty of textual style transfer lies in the lack of parallel corpora. Numerous advances have been proposed for the unsupervised generation. However, significant problems remain with the auto-evaluation of style transfer tasks. Based on the summary of Pang and Gimpel (2018) and Mir et al. (2019), style transfer evaluations rely on three criteria: style accuracy of transferred sentences, content similarity between original and transferred sentences, and fluency of transferred sentences. We elucidate the problematic current state of style transfer research. Given that current tasks do not represent real use cases of style transfer, current auto-evaluation approach is flawed. This discussion aims to bring researchers to think about the future of style transfer and style transfer evaluation research.


“…This is the approach we take in this paper by trying to de-bias the data or suggesting the possibility of de-biasing the data to a human-in-the-loop. A related task is to modify or paraphrase text data to obfuscate gender as in (Reddy and Knight, 2016). Another closely related work is to change the style of the text to different levels of formality as in (Rao and Tetreault, 2018).…”

Section: Past Work (mentioning)

confidence: 99%

Judging a Book by its Description : Analyzing Gender Stereotypes in the Man Bookers Prize Winning Fiction

Madaan, Mehta, Mittal et al. 2018

Preprint


The presence of gender stereotypes in many aspects of society is a well-known phenomenon. In this paper, we focus on studying and quantifying such stereotypes and bias in the Man Bookers Prize winning fiction. We consider 275 books shortlisted for Man Bookers Prize between 1969 and 2017. The gender bias is analyzed by semantic modeling of book descriptions on Goodreads. This reveals the pervasiveness of gender bias and stereotype in the books on different features like occupation, introductions and actions associated to the characters in the book.

