2018
DOI: 10.1186/s13326-018-0181-1

Deep learning meets ontologies: experiments to anchor the cardiovascular disease ontology in the biomedical literature

Abstract: Background: Automatic identification of term variants or acceptable alternative free-text terms for gene and protein names from the millions of biomedical publications is a challenging task. Ontologies, such as the Cardiovascular Disease Ontology (CVDO), capture domain knowledge in a computational form and can provide context for gene/protein names as written in the literature. This study investigates: 1) if word embeddings from Deep Learning algorithms can provide a list of term variants for a given gene/protei…

Cited by 34 publications (22 citation statements). References 64 publications.
“…This study used a subset of PubMed systematic reviews [16] of 301,201 PubMed/MEDLINE publications (titles and available abstracts), called the PubMed systematic reviews subset (PMSB dataset). The preprocessing of the input text for the PMSB dataset and the hyperparameter configuration for Skip-gram and CBOW are identical to those in our previous study [41] and detailed in the study by Arguello Casteleiro et al. [42].…”
Section: Methods (mentioning)
confidence: 99%
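A minimal sketch of this kind of CBOW/Skip-gram training setup with gensim, assuming a one-document-per-line text file; the file name and hyperparameter values here are illustrative placeholders, not the configuration detailed in Arguello Casteleiro et al. [42].

from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence

# Preprocessed PMSB titles/abstracts, one document per line (hypothetical path).
corpus = LineSentence("pmsb_titles_abstracts.txt")

# sg=0 trains CBOW, sg=1 trains Skip-gram; hyperparameter values are assumed.
cbow = Word2Vec(corpus, sg=0, vector_size=200, window=5, min_count=5, workers=4)
skipgram = Word2Vec(corpus, sg=1, vector_size=200, window=5, min_count=5, workers=4)

# Keep only the word vectors for later querying.
cbow.wv.save("pmsb_cbow.kv")
skipgram.wv.save("pmsb_skipgram.kv")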
“…We kept only the 12 top-ranked candidate n-grams y for the 3CosAdd formula, that is, the 12 candidates y with CBOW and Skip-gram embeddings yielding the highest 3CosAdd values. We limited the list of candidates to 12, similar to Arguello Casteleiro et al. [42], and following cognitive theories such as those of Novak and Cañas [44].…”
Section: Methods (mentioning)
confidence: 99%
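The 3CosAdd ranking described above can be sketched with gensim, whose most_similar call realises the additive analogy objective argmax_y [cos(y, a*) − cos(y, a) + cos(y, b)]; the model path and query terms below are hypothetical placeholders, with topn=12 mirroring the 12-candidate cut-off.

from gensim.models import KeyedVectors

# Load previously trained vectors (hypothetical path).
wv = KeyedVectors.load("pmsb_skipgram.kv")

# Analogy a : a_star :: b : y, ranked by 3CosAdd; the terms are placeholders
# standing in for ontology labels and gene/protein names from the study.
candidates = wv.most_similar(positive=["a_star", "b"], negative=["a"], topn=12)
for ngram, score in candidates:
    print(ngram, round(score, 3))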
“…We apply Gibbs sampling [32] to estimate the parameters and perform inference. We used the gensim Python library and the R implementations of LDA and LDAvis [23] for our purposes.…”
Section: Topic Modeling Implementation (mentioning)
confidence: 99%
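A minimal sketch of the gensim side of such an LDA pipeline, on toy documents; note that gensim's LdaModel estimates parameters with online variational Bayes, so the Gibbs-sampling estimation mentioned in the quote would come from the R LDA implementation. The topic count and corpus contents are assumptions.

from gensim import corpora
from gensim.models import LdaModel

# Toy tokenised documents standing in for the real corpus.
docs = [["gene", "expression", "cardiovascular", "disease"],
        ["protein", "interaction", "disease", "ontology"],
        ["topic", "model", "biomedical", "literature"]]

dictionary = corpora.Dictionary(docs)
bow = [dictionary.doc2bow(d) for d in docs]

# Fit a small LDA model and print the discovered topics.
lda = LdaModel(corpus=bow, id2word=dictionary, num_topics=2, passes=10, random_state=0)
for topic_id, terms in lda.print_topics():
    print(topic_id, terms)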
“…Natural language processing techniques are useful for the extraction of information for further processing, such as ranking of terms in order to discover new concepts like phenotypic disease characterization [3]. Recent works use deep learning techniques to anchor a specific semantic ontology in the relevant literature [4]. A very promising application of medical literature is the discovery of new relations between concepts that may lead to breakthrough treatments [5].…”
Section: Introduction (mentioning)
confidence: 99%
“…The first impactful work on developing word embedding representations for large-scale biomedical corpora was performed by Pyysalo et al., who trained so-called Word2Vec models (2) on a collection of large-scale biomedical technical corpora made up of the entire PubMed database, the open access collection from PubMed Central (PMC) and Wikipedia (3). This paper has had a disproportionately large impact (in terms of citations), largely because the authors shared the representations online as open access data, enabling other researchers to build effective neural network systems for a range of tasks [see (4–6) for examples]. A variety of different computational methods for generating word representations have subsequently been developed.…”
Section: Introduction (mentioning)
confidence: 99%
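A minimal sketch of reusing openly shared biomedical Word2Vec vectors such as those distributed by Pyysalo et al. in a downstream system; the local file name is an assumption, and the vectors are assumed to be in the standard word2vec binary format.

from gensim.models import KeyedVectors

# Pretrained PubMed/PMC/Wikipedia vectors (assumed local file name).
wv = KeyedVectors.load_word2vec_format("wikipedia-pubmed-and-PMC-w2v.bin", binary=True)

# Nearest neighbours of a biomedical term, e.g. as input features for a neural model.
print(wv.most_similar("hypertension", topn=5))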