2022
DOI: 10.1609/aaai.v36i10.21286

Enhanced Story Comprehension for Large Language Models through Dynamic Document-Based Knowledge Graphs

Abstract: Large transformer-based language models have achieved incredible success at various tasks that require narrative comprehension, including story completion, answering questions about stories, and generating stories ex nihilo. However, due to the limitations of finite context windows, these language models struggle to produce or understand stories longer than several thousand tokens. In order to mitigate the document length limitations that come with finite context windows, we introduce a novel architecture tha…
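The abstract is truncated here, but the citing papers quoted below describe the core mechanism: facts extracted from the story are stored in a document-based knowledge graph and verbalized back into the model's prompt. The following is a minimal sketch of that pattern under stated assumptions, not the paper's released code; the StoryKnowledgeGraph class, the triple format, and build_prompt are illustrative names.

```python
# Hypothetical sketch of a dynamic document-based knowledge graph: triples
# extracted from earlier story chunks are stored in a graph and verbalized
# back into the prompt, so the model can "remember" facts that no longer
# fit in its finite context window. Triple extraction (an OpenIE-style
# step) is assumed to happen upstream.

from collections import defaultdict

class StoryKnowledgeGraph:
    def __init__(self):
        # adjacency map: subject -> list of (relation, object) pairs
        self.edges = defaultdict(list)

    def update(self, triples):
        """Add (subject, relation, object) triples from a new story chunk."""
        for subj, rel, obj in triples:
            self.edges[subj].append((rel, obj))

    def verbalize(self, entities):
        """Render stored facts about the given entities as sentences."""
        facts = []
        for ent in entities:
            for rel, obj in self.edges.get(ent, []):
                facts.append(f"{ent} {rel} {obj}.")
        return " ".join(facts)

def build_prompt(kg, question, entities, recent_chunk):
    # Verbalized facts stand in for story text that has already
    # scrolled out of the model's context window.
    background = kg.verbalize(entities)
    return f"{background}\n{recent_chunk}\nQuestion: {question}"

kg = StoryKnowledgeGraph()
kg.update([("Ava", "owns", "a lighthouse"), ("Ava", "distrusts", "Bren")])
print(build_prompt(kg, "Why does Ava avoid Bren?", ["Ava"], "Bren knocks at the door."))
```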

Cited by 11 publications (9 citation statements) · References 21 publications (33 reference statements)

“…Benchmarking LLMs has been an important research theme in natural language processing, focusing on evaluating their performance across a variety of tasks and datasets, with comprehensive benchmarks providing critical insights into the strengths and limitations of different LLM architectures [28], [29]. Comparative analyses of model performance on established benchmarks such as GLUE (General Language Understanding Evaluation) and SQuAD (Stanford Question Answering Dataset) demonstrated significant advances in language understanding capabilities [30]-[34]. Performance metrics were used to quantify improvements in tasks such as text classification, sentiment analysis, and question answering, highlighting the progressive enhancements in model architectures and training methodologies [35], [36].…”
Section: B. Benchmarking Large Language Models
Classification: mentioning (confidence: 99%)
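The benchmarks named above are typically scored with exact-match and token-level F1 (the standard SQuAD metrics). The sketch below shows simplified versions of both, assuming whitespace tokenization and lowercasing only; the official SQuAD script additionally strips punctuation and articles.

```python
# Simplified SQuAD-style scoring: exact match plus token-overlap F1
# between a model's answer and a reference answer.

from collections import Counter

def exact_match(prediction, reference):
    return int(prediction.strip().lower() == reference.strip().lower())

def token_f1(prediction, reference):
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    overlap = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(overlap.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("the lighthouse", "The lighthouse"))              # 1
print(round(token_f1("the old lighthouse", "the lighthouse"), 3))   # 0.8
```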
“…Many studies [1,2,8,10,13,28,29,32,34,38] have proposed incorporating external knowledge to better understand the text or to generate the expected output. For example, for language understanding tasks, [13] injects expanded knowledge into the language model by adding entities and relations from the knowledge graph as additional words.…”
Section: Knowledge Informed Language Understanding and Generation
Classification: mentioning (confidence: 99%)
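As a rough illustration of the injection pattern attributed to [13], where knowledge is added "as additional words", the sketch below appends linearized triples to the input text before encoding. The retrieve_triples matcher and the [KG] separator are hypothetical stand-ins, not the cited paper's actual interface.

```python
# Hedged sketch of knowledge injection as extra input words: triples
# retrieved from a knowledge graph are linearized and spliced after the
# input, so the language model sees them as ordinary tokens.

def retrieve_triples(text, kg):
    """Return KG triples whose subject appears in the input text (toy matcher)."""
    return [(s, r, o) for (s, r, o) in kg if s.lower() in text.lower()]

def inject_knowledge(text, kg, sep=" [KG] "):
    # Linearize each matched triple into words and append it to the input.
    triples = retrieve_triples(text, kg)
    knowledge = " ".join(f"{s} {r} {o}" for s, r, o in triples)
    return text + sep + knowledge if knowledge else text

kg = [("aspirin", "treats", "headache"), ("Paris", "capital_of", "France")]
print(inject_knowledge("She took aspirin before the trip to Paris.", kg))
# She took aspirin before the trip to Paris. [KG] aspirin treats headache Paris capital_of France
```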
“…Different from the masking strategy of BERT [5], [29] proposes an entity-level masking strategy to incorporate informative entities into the language model. [2] verbalizes extracted facts that align with the input questions as natural language and incorporates them as prompts to the language model to improve story comprehension. For open-domain question answering, [8] combines the informative entities extracted from the input question and passage with the output of the language model T5 to jointly optimize the knowledge representations using their proposed relation-aware GNN.…”
Section: Knowledge Informed Language Understanding and Generation
Classification: mentioning (confidence: 99%)
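A toy version of the entity-level masking strategy attributed to [29]: whole entity spans are masked together rather than individual tokens at random, as in BERT, forcing the model to recover complete entities. Span detection is assumed to happen upstream (here, hard-coded spans); a real system would use an entity linker.

```python
# Entity-level masking sketch: every token of a recognized entity span is
# masked as a unit, instead of masking independent random tokens.

import random

def entity_level_mask(tokens, entity_spans, mask_token="[MASK]", p=0.5):
    """Mask whole entity spans with probability p; spans are (start, end) pairs."""
    masked = list(tokens)
    for start, end in entity_spans:
        if random.random() < p:
            for i in range(start, end):
                masked[i] = mask_token
    return masked

tokens = ["Ava", "sailed", "to", "New", "York", "Harbor", "."]
spans = [(0, 1), (3, 6)]  # "Ava" and "New York Harbor"
print(entity_level_mask(tokens, spans, p=1.0))
# ['[MASK]', 'sailed', 'to', '[MASK]', '[MASK]', '[MASK]', '.']
```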
“…With the continual advancement of graph neural networks, graph-based neural network models have garnered significant attention in domains such as natural language processing and computer vision [20,21]. In particular, graph models can use correlations between data points to construct the edges of a graph, extracting structural information from the data [22,23].…”
Section: Introduction
Classification: mentioning (confidence: 99%)
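To make the correlation-to-edges idea concrete, here is a schematic sketch: samples whose feature correlation exceeds a threshold are connected, and one mean-aggregation message-passing step propagates features along those edges. The threshold and the GCN-style self-loop normalization are illustrative choices, not any specific cited model.

```python
# Build a graph from pairwise correlations, then run one message-passing step.

import numpy as np

def correlation_graph(features, threshold=0.95):
    """Connect samples whose absolute Pearson correlation exceeds the threshold."""
    corr = np.corrcoef(features)              # (n, n) sample-sample correlations
    adj = (np.abs(corr) > threshold).astype(float)
    np.fill_diagonal(adj, 0.0)                # no self-edges from the correlation step
    return adj

def message_pass(features, adj):
    """One GCN-style step: average each node with its neighbors."""
    adj_hat = adj + np.eye(adj.shape[0])      # self-loops so nodes keep their own signal
    degree = adj_hat.sum(axis=1, keepdims=True)
    return adj_hat @ features / degree

x = np.array([[1.0, 2.0, 3.0],
              [1.1, 2.1, 3.2],
              [9.0, 1.0, 0.5]])
adj = correlation_graph(x)                    # rows 0 and 1 are near-duplicates
print(message_pass(x, adj))
```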