2022
DOI: 10.1093/database/baac047

Chemical identification and indexing in PubMed full-text articles using deep learning and heuristics

Abstract: The identification of chemicals in articles has attracted considerable interest in the biomedical scientific community, given its importance in drug development research. Most previous research has focused on PubMed abstracts, and further investigation using full-text documents is required because these contain additional valuable information that must be explored. The manual expert task of indexing Medical Subject Headings (MeSH) terms to these articles later helps researchers find the most relevant publicatio…

Cited by 8 publications (2 citation statements)
References 77 publications
“…For diseases and chemicals, we include in the RBES category two systems which are only “partly” rule-based (stretching our definition), as they better represent the state of the art of disease/chemical-specific models. We use “TaggerOne” (Leaman and Lu 2016), a semi-Markov model, for diseases, and opt for the system that won the BioCreative VII NLM-Chem track (Almeida et al 2022) for chemicals (“BC7T2W”), which uses both string matching and neural embeddings. To the best of our knowledge there exists no linking approach specific for cell lines.…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…Various statistical model-based NER algorithms have also been proposed, often as a sequence labeling problem where the tokens in a sentence are assigned most likely tags based on token features. A popular strategy is the use of conditional random fields [11] in combination with expert-selected features [12] or contextualized word embeddings from neural networks (recurrent networks [13][14][15], or transformers [16][17][18][19]).…”
Section: Introduction
Citation type: mentioning
confidence: 99%