This paper presents a new sequence-to-sequence pre-training model called ProphetNet, which introduces a novel self-supervised objective named future n-gram prediction and the proposed n-stream self-attention mechanism. Instead of optimizing one-step-ahead prediction as in the traditional sequence-to-sequence model, ProphetNet is optimized by n-step-ahead prediction, which predicts the next n tokens simultaneously based on previous context tokens at each time step. The future n-gram prediction objective explicitly encourages the model to plan for future tokens and prevents overfitting on strong local correlations. We pre-train ProphetNet using a base-scale dataset (16GB) and a large-scale dataset (160GB), respectively. We then conduct experiments on the CNN/DailyMail, Gigaword, and SQuAD 1.1 benchmarks for abstractive summarization and question generation tasks. Experimental results show that ProphetNet achieves new state-of-the-art results on all these datasets compared to models using the same scale of pre-training corpus.
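For illustration, here is a minimal PyTorch sketch of a future n-gram prediction objective. It is an assumption-based simplification, not the released ProphetNet code: it uses one linear head per predicted offset, whereas ProphetNet derives per-offset states through its n-stream self-attention; all names are hypothetical.

    import torch
    import torch.nn as nn

    class FutureNGramHead(nn.Module):
        """Sketch: at each position t, n heads predict tokens t+1 ... t+n
        from the shared decoder state (simplified stand-in for n-stream
        self-attention)."""
        def __init__(self, hidden_size: int, vocab_size: int, ngram: int = 2):
            super().__init__()
            self.ngram = ngram
            self.heads = nn.ModuleList(
                nn.Linear(hidden_size, vocab_size) for _ in range(ngram)
            )

        def forward(self, hidden, targets):
            # hidden:  (batch, seq_len, hidden_size) decoder states
            # targets: (batch, seq_len) gold token ids
            loss = 0.0
            for k, head in enumerate(self.heads, start=1):
                logits = head(hidden[:, :-k])   # predict the token at t + k
                gold = targets[:, k:]           # gold tokens shifted by k
                loss = loss + nn.functional.cross_entropy(
                    logits.reshape(-1, logits.size(-1)), gold.reshape(-1)
                )
            return loss / self.ngram

    # toy usage
    vocab, hid = 1000, 64
    head = FutureNGramHead(hid, vocab, ngram=2)
    h = torch.randn(2, 10, hid)
    y = torch.randint(0, vocab, (2, 10))
    print(head(h, y))  # scalar training loss

At inference time only the next-token stream is used, so decoding cost matches an ordinary sequence-to-sequence model; the extra streams act purely as a training-time regularizer.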
Reading long documents to answer open-domain questions remains challenging in natural language understanding. In this paper, we introduce a new model, called RikiNet, which reads Wikipedia pages for natural question answering. RikiNet contains a dynamic paragraph dual-attention reader and a multi-level cascaded answer predictor. The reader dynamically represents the document and question by utilizing a set of complementary attention mechanisms. The representations are then fed into the predictor to obtain the span of the short answer, the paragraph of the long answer, and the answer type in a cascaded manner. On the Natural Questions (NQ) dataset, a single RikiNet achieves 74.3 F1 and 57.9 F1 on the long-answer and short-answer tasks, respectively. To the best of our knowledge, it is the first single model to outperform single human performance. Furthermore, an ensemble RikiNet obtains 76.1 F1 and 61.3 F1 on the long-answer and short-answer tasks, achieving the best performance on the official NQ leaderboard.
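As a rough illustration of the cascaded prediction idea, the PyTorch sketch below wires short-answer span, long-answer paragraph, and answer-type heads in sequence, each conditioned on the previous level. The module names and exact conditioning are assumptions for illustration, not RikiNet's published architecture.

    import torch
    import torch.nn as nn

    class CascadedPredictor(nn.Module):
        """Sketch of a multi-level cascaded answer predictor:
        short-answer span -> long-answer paragraph -> answer type."""
        def __init__(self, hidden: int, num_answer_types: int = 5):
            super().__init__()
            self.span_start = nn.Linear(hidden, 1)
            self.span_end = nn.Linear(hidden, 1)
            self.para_scorer = nn.Linear(2 * hidden, 1)
            self.type_head = nn.Linear(2 * hidden, num_answer_types)

        def forward(self, token_reps, para_reps):
            # token_reps: (batch, seq_len, hidden) from the reader
            # para_reps:  (batch, n_paras, hidden) pooled paragraph vectors
            start_logits = self.span_start(token_reps).squeeze(-1)
            end_logits = self.span_end(token_reps).squeeze(-1)
            # soft summary of the predicted span region
            span_summary = torch.einsum(
                "bs,bsh->bh", start_logits.softmax(-1), token_reps
            )
            # score long-answer paragraphs with the span summary as context
            ctx = span_summary.unsqueeze(1).expand_as(para_reps)
            para_logits = self.para_scorer(
                torch.cat([para_reps, ctx], dim=-1)
            ).squeeze(-1)
            # answer type conditioned on the span and the best paragraph
            best_para = para_reps[
                torch.arange(para_reps.size(0)), para_logits.argmax(-1)
            ]
            type_logits = self.type_head(
                torch.cat([span_summary, best_para], dim=-1)
            )
            return start_logits, end_logits, para_logits, type_logits

    # toy usage
    pred = CascadedPredictor(hidden=64)
    s, e, p, t = pred(torch.randn(2, 50, 64), torch.randn(2, 8, 64))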
Incorporating prior knowledge such as lexical constraints into a model's output to generate meaningful and coherent sentences has many applications in dialogue systems, machine translation, image captioning, etc. However, existing RNN-based models generate sentences incrementally from left to right via beam search, which makes it difficult to directly introduce lexical constraints into the generated sentences. In this paper, we propose a new algorithmic framework, dubbed BFGAN, to address this challenge. Specifically, we employ a backward generator and a forward generator to generate lexically constrained sentences together, and use a discriminator to guide the joint training of the two generators by assigning them reward signals. Because BFGAN is difficult to train, we propose several training techniques to make the training process more stable and efficient. Extensive experiments on two large-scale datasets with human evaluation demonstrate that BFGAN significantly improves over previous methods.

Sample generations under four lexical constraints (constraint word bracketed; English translations of the original Chinese examples):

[消息] (news)
  BFGAN:  Hearing the earthquake [news], he staggered out and almost rolled down the stairs.
  BF-MLE: You haven't heard your good [news]; no one could understand him.
  GRID:   Because [news] has begun, our lives have arrived.
  REAL:   The [news] of the tiger coming down the mountain stirred up the whole village.

[坚强] (strong)
  BFGAN:  Tolerance is a kind of [strength], and pain is also a kind of wisdom.
  BF-MLE: One's life is like an ocean, with only a bay of [strong] will.
  GRID:   The invention is [strong], with strong abrasion resistance and strong alkali resistance.
  REAL:   [Strength], in the eyes of ordinary people, seems hard to understand.

[天才] (genius)
  BFGAN:  The real [genius] simply puts other people's coffee time into work.
  BF-MLE: He is a good [genius], eating neither fish nor fowl, able to get nothing done.
  GRID:   I like autumn best; I prefer the river-and-sea campus full of autumn scenery. (the constraint characters straddle two words in the Chinese)
  REAL:   Before asking for [genius], we should first ask for the people who can make genius grow.

[抱头鼠窜] (scurry away in panic)
  BFGAN:  The enemy was beaten by our army and [scurried away in panic], throwing away shields and armor.
  BF-MLE: The enemy wants to give the [scurrying] enemy a chance to catch its breath.
  GRID:   Only the [scurrying] bandits are able to do nothing.
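The backward-forward decoding scheme can be illustrated with a minimal Python sketch. All names here are hypothetical, and the adversarial training with the discriminator's reward signal is omitted: the backward generator produces the left context of the constraint word in reverse, and the forward generator then completes the sentence to the right.

    from typing import List, Sequence

    class StubGenerator:
        """Stand-in for a trained RNN language model; a real implementation
        would decode with beam search or sampling, continuing from `seed`."""
        def __init__(self, continuation: Sequence[str]):
            self._continuation = list(continuation)

        def sample(self, seed: List[str]) -> List[str]:
            # a real model conditions on `seed`; this stub ignores it
            return list(self._continuation)

    def generate_constrained(backward_gen, forward_gen,
                             constraint: List[str]) -> List[str]:
        # backward generator grows the sentence to the LEFT of the
        # constraint, emitting tokens in reverse order, seeded with the
        # reversed constraint tokens
        left = list(reversed(backward_gen.sample(list(reversed(constraint)))))
        # forward generator completes the sentence to the RIGHT,
        # conditioned on the left context plus the constraint
        right = forward_gen.sample(left + constraint)
        # the constraint word appears verbatim by construction
        return left + constraint + right

    bwd = StubGenerator(["the", "Hearing"])  # reversed left context
    fwd = StubGenerator([",", "he", "staggered", "out", "."])
    print(" ".join(generate_constrained(bwd, fwd, ["news"])))
    # -> Hearing the news , he staggered out .

The key design point is that the constraint word is never searched for during decoding; it is the seed from which both halves of the sentence grow, so satisfaction of the constraint is guaranteed rather than encouraged.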
As an important form of multimedia, music fills almost everyone's life. Automatically analyzing music is a significant step toward satisfying people's needs for music retrieval and music recommendation in an effortless way. Among its subtasks, downbeat tracking has been a fundamental and long-standing problem in the Music Information Retrieval (MIR) area. Despite significant research effort, downbeat tracking remains a challenge. Previous research either focuses on feature engineering (extracting certain features by signal processing, a semi-automatic solution) or suffers from limitations: it can only model music audio recordings within limited time signatures and tempo ranges. Recently, deep learning has surpassed traditional machine learning methods and become the primary algorithm for feature learning; combinations of traditional and deep learning methods have also achieved better performance. In this paper, we begin with a background introduction to the downbeat tracking problem. Then, we give detailed discussions of the following topics: system architecture, feature extraction, deep neural network algorithms, datasets, and evaluation strategy. In addition, we look at the results from the annual benchmark evaluation, the Music Information Retrieval Evaluation eXchange (MIREX), as well as developments in software implementations. Although much has been achieved in automatic downbeat tracking, some problems still remain. We point out these problems and conclude with possible directions and challenges for future research.
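To make the system architecture concrete, here is a hedged Python sketch of the final decoding stage of a typical deep-learning downbeat tracker. The parameters and the simple thresholded peak picking are illustrative assumptions; published systems usually replace this step with a dynamic Bayesian network that jointly models tempo and bar position.

    import numpy as np

    def track_downbeats(activation, fps=100, min_gap_s=1.0, threshold=0.3):
        """A neural network has already mapped the audio spectrogram to a
        per-frame downbeat activation in [0, 1]; decode it by picking local
        maxima above a threshold, at least `min_gap_s` seconds apart."""
        min_gap = int(min_gap_s * fps)
        downbeats, last = [], -min_gap
        for t in range(1, len(activation) - 1):
            is_peak = (activation[t] >= activation[t - 1]
                       and activation[t] > activation[t + 1])
            if is_peak and activation[t] >= threshold and t - last >= min_gap:
                downbeats.append(t / fps)  # frame index -> seconds
                last = t
        return downbeats

    # toy activation with peaks every 2 s at 100 frames per second
    act = np.zeros(1000)
    act[[100, 300, 500, 700, 900]] = 0.9
    print(track_downbeats(act))  # -> [1.0, 3.0, 5.0, 7.0, 9.0]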