Literary runaway: Increasingly more references cited per academic research article from 1980 to 2019

Dai, Chaoyang; Quan, Chen; Wan, Tao; Liu, Fan; Gong, Ya‐Jun; Wang, Qingfeng

doi:10.1371/journal.pone.0255849

Cited by 7 publications

(2 citation statements)

References 20 publications

“…We have to read more now, for example, or at least pretend to. So reference lists have grown longer, with a study of 30 disciplines in the Web of Science averaging 29 per paper in 2003 and 45 in 2019, with no sign of levelling off (Dai et al, 2021). More seriously, the tools which allow us to create, access and distribute our work also enable others to assess it, producing a relentless and unforgiving numerically-driven assessment culture.…”

Section: Some Final Thoughts (mentioning)

confidence: 99%

Open Science: What’s not to like?

Hyland

2023

Ibérica

“…Citation integrity errors are difficult to detect, because they require the reader, editor, or peer reviewer to be very familiar with the cited article and be able to judge whether the cited information is consistent with the original text. Given that an article on average cites about 45 articles (Dai et al 2021) and the judgment requires expertise, this is a challenging task. In this article, we pose the question of whether natural language processing (NLP) techniques can help address the problem of inaccurate citations in a scalable manner.…”

Section: Introduction (mentioning)

confidence: 99%

Assessing citation integrity in biomedical publications: corpus annotation and NLP models

Sarol, Ming, Radhakrishna et al.

2024

Bioinformatics

Motivation: Citations have a fundamental role in scholarly communication and assessment. Citation accuracy and transparency is crucial for the integrity of scientific evidence. In this work, we focus on quotation errors, errors in citation content that can distort the scientific evidence and that are hard to detect for humans. We construct a corpus and propose natural language processing (NLP) methods to identify such errors in biomedical publications.

Results: We manually annotated 100 highly-cited biomedical publications (reference articles) and citations to them. The annotation involved labeling citation context in the citing article, relevant evidence sentences in the reference article, and the accuracy of the citation. A total of 3063 citation instances were annotated (39.18% with accuracy errors). For NLP, we combined a sentence retriever with a fine-tuned claim verification model to label citations as ACCURATE, NOT_ACCURATE, or IRRELEVANT. We also explored few-shot in-context learning with generative large language models. The best performing model (which uses citation sentences as citation context, the BM25 model with MonoT5 reranker for retrieving top-20 sentences, and a fine-tuned MultiVerS model for accuracy label classification) yielded 0.59 micro-F1 and 0.52 macro-F1 score. GPT-4 in-context learning performed better in identifying accurate citations, but it lagged for erroneous citations (0.65 micro-F1, 0.45 macro-F1). Citation quotation errors are often subtle, and it is currently challenging for NLP models to identify erroneous citations. With further improvements, the models could serve to improve citation quality and accuracy.

Availability and implementation: We make the corpus and the best-performing NLP model publicly available at https://github.com/ScienceNLP-Lab/Citation-Integrity/.

Assessment of subject-normalized comprehensiveness of research-intensive universities

Mendes

2024

Scientometrics
