2022

DOI: 10.48550/arxiv.2211.15089

Preprint

Continuous diffusion for categorical data

Sander Dieleman, Laurent Sartran, Arman Roshannai, et al.

Citation Types: Supporting 0, Mentioning 13, Contrasting 0

Cited by 10 publications

(13 citation statements)

References 0 publications

“…(3) Previous diffusion-based sequence generative models, including the vanilla design that simply extends the original DiffusionLM with an additional condition encoder, and the other recently proposed improved methods CDCD (continuous diffusion for categorical data, Dieleman et al, 2022), DiffuSeq (Gong et al, 2022), SeqDiffuSeq (Yuan et al, 2022) and Difformer (Gao et al, 2022). For text simplification and paraphrasing, we compare our method with DiffuSeq (Gong et al, 2022).…”

Section: Methods (mentioning)

confidence: 99%

“…Metrics. We primarily report SacreBLEU (Post, 2018) for machine translation, following CDCD (Dieleman et al, 2022). We also report tokenized BLEU (Papineni et al, 2002) in Appendix B for reference.…”

Section: Methods (mentioning)

confidence: 99%

“…The embedding dimension for the diffusion model is 16 on IWSLT14 and 64 on the others. In the implementation of our method, we follow recent advances and apply self-conditioning techniques (Dieleman et al, 2022; Chen et al, 2022; Strudel et al, 2022). Besides, following previous practice in non-autoregressive machine translation, we train our model both with and without knowledge distillation (KD, Kim & Rush, 2016; Zhou et al, 2020).…”

Section: Methods (mentioning)

confidence: 99%

“…For text simplification and paraphrasing, we report results with various length beams as length prediction on these tasks is more challenging and less studied. For all the diffusion-based methods, we follow previous work (Gong et al, 2022; Dieleman et al, 2022) and apply Minimum Bayes-Risk (MBR) decoding (Kumar & Byrne, 2004). For both DiffusionLM and our model, we perform sampling with 20 steps.…”

Section: Methods (mentioning)

confidence: 99%

“…The best NAR results without KD are in bold and the second best ones are underlined. The results of CDCD are quoted from Dieleman et al (2022), and the results of Difformer, DiffuSeq, and SeqDiffuSeq are from Gao et al (2022). †: how CMLM originally selects candidates with different lengths differs from the MBR decoding we used for diffusion models, and we thus include its results with MBR decoding for fair comparisons.…”

Section: Methods (mentioning)

confidence: 99%

DINOISER: Diffused Conditional Sequence Learning by Manipulating Noises

Ye, Zhou, Yu, et al. 2023

Preprint

While diffusion models have achieved great success in generating continuous signals such as images and audio, it remains elusive for diffusion models in learning discrete sequence data like natural languages. Although recent advances circumvent this challenge of discreteness by embedding discrete tokens as continuous surrogates, they still fall short of satisfactory generation quality. To understand this, we first dive deep into the denoised training protocol of diffusion-based sequence generative models and determine their three severe problems, i.e., 1) failing to learn, 2) lack of scalability, and 3) neglecting source conditions. We argue that these problems can be boiled down to the pitfall of the not completely eliminated discreteness in the embedding space, and the scale of noises is decisive herein. In this paper, we introduce DINOISER to facilitate diffusion models for sequence generation by manipulating noises. We propose to adaptively determine the range of sampled noise scales for counterdiscreteness training; and encourage the proposed diffused sequence learner to leverage source conditions with amplified noise scales during inference. Experiments show that DINOISER enables consistent improvement over the baselines of previous diffusion-based sequence generative models on several conditional sequence modeling benchmarks thanks to both effective training and inference strategies. Analyses further verify that DINOISER can make better use of source conditions to govern its generative process.

Multistate and functional protein design using RoseTTAFold sequence space diffusion

Lisanza, Gershon, Tipps, et al. 2024

Nat Biotechnol

Andrew of Wyntoun’s Macbeth Episode: A Translation

2023

Macbeth Before Shakespeare

Figure 1. Latent Interpolations. SODA learns to encode images into compact latent representations. By traversing its latent space, we can interpolate between images, morphing from one image category to another and smoothly transitioning between semantic attributes.
