Bag-of-Words as Target for Neural Machine Translation

Ma, Shuming; Sun, Xu; Wang, Yizhong; Lin, Junyang

doi:10.18653/v1/p18-2053

Cited by 69 publications

(46 citation statements)

References 19 publications


“…Under reasoning-required setting, long answers are available in training but not inference phase. We use them as an additional signal for training: similar to Ma et al (2018) regularizing neural machine translation models with binary bag-of-word (BoW) statistics, we fine-tune BioBERT with an auxiliary task of predicting the binary BoW statistics of the long answers, also using the special [CLS] embedding. We minimize binary crossentropy loss of this auxiliary task:…”

Section: Long Answer As Additional Supervision (mentioning)

confidence: 99%

PubMedQA: A Dataset for Biomedical Research Question Answering

Jin, Dhingra, Liu et al. 2019

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conferen


We introduce PubMedQA, a novel biomedical question answering (QA) dataset collected from PubMed abstracts. The task of PubMedQA is to answer research questions with yes/no/maybe (e.g.: Do preoperative statins reduce atrial fibrillation after coronary artery bypass grafting?) using the corresponding abstracts. PubMedQA has 1k expert-annotated, 61.2k unlabeled and 211.3k artificially generated QA instances. Each PubMedQA instance is composed of (1) a question which is either an existing research article title or derived from one, (2) a context which is the corresponding abstract without its conclusion, (3) a long answer, which is the conclusion of the abstract and, presumably, answers the research question, and (4) a yes/no/maybe answer which summarizes the conclusion. PubMedQA is the first QA dataset where reasoning over biomedical research texts, especially their quantitative contents, is required to answer the questions. Our best performing model, multi-phase fine-tuning of BioBERT with long answer bag-of-word statistics as additional supervision, achieves 68.1% accuracy, compared to single human performance of 78.0% accuracy and majority-baseline of 55.2% accuracy, leaving much room for improvement. PubMedQA is publicly available at https://pubmedqa.github.io.
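The auxiliary objective described in the citation statement above (predicting binary bag-of-words statistics of the long answer and minimizing a binary cross-entropy loss) can be illustrated with a minimal sketch. This is not the cited authors' implementation: the toy vocabulary, the example tokens, the predicted probabilities, and the choice to average the loss over the vocabulary are all assumptions made for illustration. In the cited setup the probabilities would come from a classifier on the [CLS] embedding rather than being hard-coded.

```python
import math


def binary_bow(tokens, vocab):
    """Return a 0/1 vector: 1.0 where the vocabulary word occurs in tokens."""
    present = set(tokens)
    return [1.0 if word in present else 0.0 for word in vocab]


def binary_cross_entropy(probs, targets, eps=1e-12):
    """Mean binary cross-entropy between probabilities and 0/1 targets.

    Averaging over the vocabulary is an assumption; a sum would also work.
    """
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(targets)


# Hypothetical toy vocabulary and long answer (not taken from PubMedQA).
vocab = ["statins", "reduce", "fibrillation", "increase"]
long_answer = "statins reduce atrial fibrillation".split()

# Binary BoW target: word presence only, counts and order are discarded.
target = binary_bow(long_answer, vocab)

# Stand-in for per-word probabilities produced by a model (assumed values).
predicted = [0.9, 0.8, 0.7, 0.1]

loss = binary_cross_entropy(predicted, target)
print(target, round(loss, 4))
```

Because the target records only whether each word is present, the auxiliary loss rewards predicting the right content words regardless of their order, which is the sense in which bag-of-words statistics act as an additional training signal.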


“…Text generation is an important task in Natural Language Processing (NLP) as it lays the foundation for many applications, such as dialogue generation, machine translation (Ma et al, 2018b), text summarization (Ma et al, 2018a), and table summarization (Liu et al, 2017). In these tasks, most of the systems are built upon the sequence-to-sequence paradigm (Sutskever et al, 2014), which is an end-to-end model that encodes a source sentence to a dense vector and then decodes the vector to a target sentence.…”

Section: Introduction (mentioning)

confidence: 99%

Diversity-Promoting GAN: A Cross-Entropy Based Generative Adversarial Network for Diversified Text Generation

Xu, Ren, Lin et al. 2018

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Self Cite


Existing text generation methods tend to produce repeated and "boring" expressions. To tackle this problem, we propose a new text generation model, called Diversity-Promoting Generative Adversarial Network (DP-GAN). The proposed model assigns low reward for repeatedly generated text and high reward for "novel" and fluent text, encouraging the generator to produce diverse and informative text. Moreover, we propose a novel language-model based discriminator, which can better distinguish novel text from repeated text without the saturation problem compared with existing classifier-based discriminators. The experimental results on review generation and dialogue generation tasks demonstrate that our model can generate substantially more diverse and informative text than existing baselines.


“…Bags of words have been used for instance in fraud detection [27]. More recently bag of words have been used successfully for translation by neural nets as a target for the translation as a sentence can be translated in many different ways [28]. In [29], multi-modal bag of words have been used for cross domains sentiment analysis.…”

Section: Multisets (mentioning)

confidence: 99%

Exchange-Based Diffusion in Hb-Graphs: Highlighting Complex Relationships

Ouvrard, Goff, Marchand-Maillet 2018

2018 International Conference on Content-Based Multimedia Indexing (CBMI)


Most networks tend to show complex and multiple relationships between entities. Networks are usually modeled by graphs or hypergraphs; nonetheless a given entity can occur many times in a relationship: this brings the need to deal with multisets instead of sets or simple edges. Diffusion processes are useful to highlight interesting parts of a network: they usually start with a stroke at one vertex and diffuse throughout the network to reach a uniform distribution. Several iterations of the process are required prior to reaching a stable solution. We propose an alternative solution to highlighting the main components of a network using a diffusion process based on exchanges: it is an iterative two-phase step exchange process. This process allows to evaluate the importance not only of the vertices but also of the regrouping level. To model the diffusion process, we extend the concept of hypergraphs that are families of sets to families of multisets, that we call hb-graphs.

Keywords: exchange · diffusion · multiset · hyperbag-graph · information retrieval · ranking

This article is an extended version of [1] (pre-printed in arXiv:1809.00190v1): the text of the extended version is in blue, the text in black is the one of [1]. All the figures except Figure 2 have been either modified or added in this extended version to take into account the new developments. The contributions of this extended version are: the proofs of conservation and convergence of the extracted sequences of the diffusion process, as well as the illustration of the speed of convergence and comparison to classical and modified random walks; the algorithms of the exchange-based diffusion and the modified random walk; the application to a use case based on Arxiv publications.
