2018
DOI: 10.7287/peerj.preprints.27028
Preprint

Comparison of natural language processing tools for automatic gene ontology annotation of scientific literature

Abstract: Manual curation of scientific literature for ontology-based knowledge representation has proven infeasible and unscalable given the large and growing volume of scientific literature. Automated annotation solutions that leverage text mining and Natural Language Processing (NLP) have been developed to ameliorate the problem of literature curation. These NLP approaches use parsing and syntactic and lexical analysis of text to recognize and annotate pieces of text with ontology concepts. Here, we conduct a co…

Cited by 5 publications (10 citation statements)
References 21 publications
“…CNNs were also used for biomedical named entity recognition combined with n-gram character embeddings, resulting in enhanced performance in a comparison with other deep learning models [25]. A comprehensive review of deep learning methods for named entity recognition can be found in [13] and a comparison of existing text mining tools in [3].…”
Section: Related Work
confidence: 99%
“…To the best of our knowledge, there is no “gold standard” for the annotation rate. Various studies conducting semantic annotation in various domains, such as Bada, Vasilevsky, Haendel, and Hunter (2016) and Beasley and Manda (2018) in bioinformatics, Lévy, Tomeh, and Ma (2014) and Fiorelli, Pazienza, and Stellato () in annotation tool development, Jadidinejad, Mahmoudi, and Meybodi () in document classification, and Ali et al. () in knowledge engineering, try to maximize the number of words annotated but do not report any number for annotation rate. We report on our findings regarding the annotation rate in Section 5.4.…”
Section: A Methods For Text Coherence Measurement
confidence: 99%
“…The large majority of text mining approaches for recognizing ontology concepts from text rely on lexical and syntactic analysis of text in addition to machine learning (Cui et al., ; Jonquet et al., ; Manda, Beasley, & Mohanty, ; Mungall et al., ). Beasley and Manda () recently conducted a comparison of a number of text mining tools for annotating biological literature with GO terms.…”
Section: Text Mining
confidence: 99%