2022
DOI: 10.15514/ispras-2022-34(2)-7

Text sampling strategies for predicting missing bibliographic links

Abstract: The paper proposes several strategies for sampling text data in automatic sentence classification aimed at detecting missing bibliographic links. We construct samples from sentences, treated as semantic units of the text, and add their immediate context, which consists of several neighbouring sentences. We examine a number of sampling strategies that differ in context size and position. The experiment is carried out on a collection of STEM scientific papers. Including the context of sentences…
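
The sampling idea in the abstract (a target sentence plus a window of neighbouring sentences, varied in size and position) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `build_sample`, the `left`/`right` parameters, and the strategy names are hypothetical, and the actual context sizes and positions used in the paper are not given in this record.

```python
from typing import Dict, List, Tuple


def build_sample(sentences: List[str], index: int, left: int = 0, right: int = 0) -> str:
    """Join the target sentence with up to `left` preceding and
    `right` following sentences (hypothetical helper, clipped at text edges)."""
    if not 0 <= index < len(sentences):
        raise IndexError("target sentence index out of range")
    start = max(0, index - left)
    end = min(len(sentences), index + right + 1)
    return " ".join(sentences[start:end])


# Illustrative strategies only: (sentences before, sentences after).
strategies: Dict[str, Tuple[int, int]] = {
    "sentence_only": (0, 0),
    "left_context": (1, 0),
    "right_context": (0, 1),
    "symmetric": (1, 1),
}

sentences = ["A.", "B.", "C.", "D."]
samples = {
    name: build_sample(sentences, 2, left, right)
    for name, (left, right) in strategies.items()
}
```

Each resulting string would then be passed to a sentence classifier in place of the bare target sentence; which window works best is an empirical question that the paper investigates.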

Cited by 1 publication (1 citation statement)
References 24 publications

“…Citation Generation Although LMs, particularly those intended to produce scientific text, such as Meta's Galactica (Taylor et al, 2022), already produce text that looks as if it is a citation, frequently there is no document corresponding to the apparent citation or the cited document does not support the statement associated with it. Many existing approaches to citation recommendation offer productive avenues to explore for factuality testing, post-hoc generation of support, hybrid architectures, or creation of training data (Ali et al, 2022;Krasnova et al, 2023). There has also been work on citation generation, where the task is either: 1) given two documents, generate an explanation for the relation between them (Luu et al, 2020), or 2) generate a citation for an already existing text (Gu and Hahnloser, 2022;Xing et al, 2020;Wu et al, 2021;Fetahu et al, 2016).…”
Section: Introduction (mentioning)
confidence: 99%