DART: Open-Domain Structured Data Record to Text Generation

Nan, Linyong; Radev, Dragomir; Zhang, Rui; Rau, Amrit; Sivaprasad, Abhinand; Hsieh, Chia‐Chun; Tang, Xiangru; Vyas, Aadit; Verma, Neha; Krishna, Pranav; Liu, Yangxiaokang; Irwanto, Nadia; Pan, Jessica; Rahman, Faiaz; Zaidi, Ahmad Mujahid Ahmad; Mutuma, Mutethia; Tarabar, Yasin; Gupta, Ankit; Chen, Yu; Tan, Yi Chern; Lin, Xi Victoria; Xiong, Caiming; Socher, Richard; Rajani, Nazneen Fatema

doi:10.48550/arxiv.2007.02871

Cited by 12 publications

(18 citation statements)

References 0 publications

“…DART is an open-domain data-to-text dataset described in [27]. DART inputs are structured as sequences of ENTITY | RELATION | ENTITY triples.…”

Section: B Dataset Details (mentioning)

confidence: 99%
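The triple format mentioned in the statement above can be illustrated with a minimal sketch. The helper name `linearize_triples`, the ` ; ` separator between triples, and the example triples are all hypothetical and chosen for illustration only; the statement confirms only the ENTITY | RELATION | ENTITY shape of each triple, not how DART or any citing paper joins several triples.

```python
from typing import List, Tuple

# One record is a (subject entity, relation, object entity) triple.
Triple = Tuple[str, str, str]


def linearize_triples(triples: List[Triple],
                      part_sep: str = " | ",
                      triple_sep: str = " ; ") -> str:
    """Flatten triples into one string, e.g. 'E1 | R | E2 ; E3 | R | E4'.

    `triple_sep` is an assumption; only the ' | ' pattern inside a triple
    comes from the quoted description.
    """
    return triple_sep.join(
        part_sep.join(part.strip() for part in triple) for triple in triples
    )


# Invented example triples, not taken from the dataset.
example = [
    ("Example City", "COUNTRY", "Exampleland"),
    ("Example City", "POPULATION", "12000"),
]
linearized = linearize_triples(example)
print(linearized)
```

A sequence-to-sequence model would consume the resulting string as its input and be trained to emit the reference text.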

LoRA: Low-Rank Adaptation of Large Language Models

Hu, Shen, Wallis et al. 2021

Preprint

The dominant paradigm of natural language processing consists of large-scale pre-training on general domain data and adaptation to particular tasks or domains. As we pre-train larger models, conventional fine-tuning, which retrains all model parameters, becomes less feasible. Using GPT-3 175B as an example, deploying many independent instances of fine-tuned models, each with 175B parameters, is extremely expensive. We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks. For GPT-3, LoRA can reduce the number of trainable parameters by 10,000 times and the computation hardware requirement by 3 times compared to full fine-tuning. LoRA performs on-par or better than fine-tuning in model quality on both GPT-3 and GPT-2, despite having fewer trainable parameters, a higher training throughput, and no additional inference latency. We also provide an empirical investigation into rank-deficiency in language model adaptations, which sheds light on the efficacy of LoRA. We release our implementation in GPT-2 at https://github.com/microsoft/LoRA.
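The idea in the abstract above (frozen pre-trained weights plus a trainable low-rank update) can be sketched in a few lines. This is a toy illustration in plain Python, not the authors' implementation: the matrix sizes, the helper names, the `scale` factor, and the zero initialisation of `B` are assumptions made for the example, and a random matrix stands in for a trained `B`.

```python
import random

random.seed(0)


def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(M):
    return [list(col) for col in zip(*M)]


def add_scaled(X, Y, scale):
    """Return X + scale * Y, elementwise."""
    return [[a + scale * b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


d, r = 8, 2   # layer width and adapter rank (toy values, assumed)
scale = 1.0   # scaling of the low-rank update (assumed)

W0 = rand_matrix(d, d)              # pre-trained weight, kept frozen
A = rand_matrix(r, d)               # trainable factor, shape r x d
B = [[0.0] * r for _ in range(d)]   # trainable factor, shape d x r, zero at start


def lora_forward(x, W0, A, B):
    """Compute x W0^T + scale * (x A^T) B^T without forming the d x d update."""
    base = matmul(x, transpose(W0))
    update = matmul(matmul(x, transpose(A)), transpose(B))
    return add_scaled(base, update, scale)


def merge(W0, A, B):
    """Fold the update into a single matrix W0 + scale * B A."""
    return add_scaled(W0, matmul(B, A), scale)


x = rand_matrix(3, d)               # a batch of 3 input rows
B_trained = rand_matrix(d, r)       # stand-in for B after training
y_adapter = lora_forward(x, W0, A, B_trained)
y_merged = matmul(x, transpose(merge(W0, A, B_trained)))
max_gap = max(abs(p - q) for rp, rq in zip(y_adapter, y_merged) for p, q in zip(rp, rq))

trainable_params = r * d + d * r    # parameters in A and B
full_params = d * d                 # parameters in W0
print(trainable_params, full_params, max_gap < 1e-9)
```

Because the merged matrix has the same shape as `W0`, the adapted layer can be served as a single matrix product, which is one way to read the abstract's claim of no additional inference latency.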

“…We also repeat our experiment on DART [27] and WebNLG [10] following the setup of [21]. The result is shown in Table 10.…”

Section: E Additional Task-based Experiments E1 Additional Experiment... (mentioning)

confidence: 99%

“…We use two open RDF-to-text generation datasets WebNLG 2 [22] and DART 3 [41] to evaluate our proposed model. Each example in the dataset is a (triples, text) pair and one triple collection can correspond to multiple ground truth texts.…”

Section: Datasets (mentioning)

confidence: 99%

“…BLEU, METEOR and TER on DART test dataset. The results of End-to-End Transformer and Seq2Seq-Att are reported in [41].…”

mentioning

confidence: 99%

RDF-to-Text Generation with Graph-augmented Structural Neural Encoders

Gao et al. 2020

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence

The task of RDF-to-text generation is to generate a corresponding descriptive text given a set of RDF triples. Most of the previous approaches either cast this task as a sequence-to-sequence problem or employ a graph-based encoder for modeling RDF triples and decode a text sequence. However, none of these methods can explicitly model both local and global structure information between and within the triples. To address these issues, we propose to jointly learn local and global structure information by combining two new graph-augmented structural neural encoders (i.e., a bidirectional graph encoder and a bidirectional graph-based meta-paths encoder) for the input triples. Experimental results on two different WebNLG datasets show that our proposed model outperforms the state-of-the-art baselines. Furthermore, we perform a human evaluation that demonstrates the effectiveness of the proposed method by evaluating generated text quality using various subjective metrics.

“…Data-to-text aims to generate natural language descriptions from the input structured data such as sport commentaries (Wiseman, Shieber, and Rush 2017). The structured data is usually represented as tables (Wiseman, Shieber, and Rush 2017; Thomson, Reiter, and Sripada 2020; Chen et al 2020), sets of table cells (Parikh et al 2020; Bao et al 2018), semantic representations (Novikova, Dušek, and Rieser 2017), or sets of relation triples (Gardent et al 2017; Nan et al 2020b). The task requires the model to select the salient information from the data, organize it in a logical order, and generate an accurate and fluent natural language description (Wiseman, Shieber, and Rush 2017).…”

Section: Introduction (mentioning)

confidence: 99%

Text-to-Table: A New Way of Information Extraction

Wu, Zhang, Li 2021

Preprint

We study a new problem setting of information extraction (IE), referred to as text-to-table, which can be viewed as an inverse problem of the well-studied table-to-text. In text-to-table, given a text, one creates a table or several tables expressing the main content of the text, while the model is learned from text-table pair data. The problem setting differs from those of the existing methods for IE. First, the extraction can be carried out from long texts to large tables with complex structures. Second, the extraction is entirely data-driven, and there is no need to explicitly define the schemas. As far as we know, there has been no previous work that studies the problem. In this work, we formalize text-to-table as a sequence-to-sequence (seq2seq) problem. We first employ a seq2seq model finetuned from a pre-trained language model to perform the task. We also develop a new method within the seq2seq approach, exploiting two additional techniques in table generation: table constraint and table relation embeddings. We make use of four existing table-to-text datasets in our experiments on text-to-table. Experimental results show that the vanilla seq2seq model can outperform the baseline methods of using relation extraction and named entity extraction. The results also show that our method can further boost the performance of the vanilla seq2seq model. We further discuss the main challenges of the proposed task. The code and data will be made publicly available.