2015
DOI: 10.7287/peerj.preprints.807
Preprint

Toward synthesizing our knowledge of morphology: using ontologies and machine reasoning to extract presence/absence evolutionary phenotypes across studies

Abstract: The reality of larger and larger molecular databases and the need to integrate data scalably have presented a major challenge for the use of phenotypic data. Morphology is currently primarily described in discrete publications, entrenched in non-computer readable text, and requires enormous investments of time and resources to integrate across large numbers of taxa and studies. Here we present a new methodology, using ontology-based reasoning systems working with the Phenoscape Knowledgebase (KB), to automatic…

Cited by 4 publications
(4 citation statements)
“…Processing and analyzing new collections of taxonomic descriptions from across the tree of life with the NLP pipeline leads us to discover new expressions and grammatical constructions that help to optimize the NLP pipeline and its components, like the Plant Glossary. In the future, the ETC pipeline will allow users to import and create ontologies, hierarchical organizations of terms that establish relationships among structures, entities, and qualities and enable the computer to have reasoning capabilities (Dececchi et al, 2015). Using ontologies will likely increase the number of usable phenotypic characters obtained by the NLP pipeline.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…One such example is the Phenoscaping project (http://kb.phenoscape.org; Deans et al 2015), and related efforts in the Vertebrate Taxonomy Ontogeny (Midford et al 2013) and Hymenoptera Anatomy Ontology (Yoder et al 2010), which require large amounts of researcher effort to collate. Other approaches include using machine learning (Dececchi et al 2015), machine vision (Corney et al 2012a, b) or natural language processing (Cui 2012) to identify or infer phenotypes. These statistical techniques function ideally with either a large training data set (e.g., a predefined ontogeny data base) or a complex model (Brill 2003;Halevy, Norvig & Pereira 2009;Hastie, Tibshirani & Friedman 2009), both of which also require intensive researcher effort to build and validate.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…One such example is the Phenoscaping project (http://kb.phenoscape.org; Deans et al 2015), and related efforts in the Vertebrate Taxonomy Ontogeny (Midford et al 2013) and Hymenoptera Anatomy Ontology (Yoder et al 2010), which require large amounts of researcher effort to collate. Other approaches include using machine learning (Dececchi et al 2015), machine vision (Corney et al 2012a; b), or natural language processing (Cui 2012) to identify or infer phenotypes. These statistical techniques function ideally with either a large training dataset (e.g., a predefined ontogeny database) or a complex model (Brill 2003; Halevy et al 2009; Hastie et al 2009), both of which also require intensive researcher effort to build and validate.…”
Section: Introduction
Citation type: mentioning
confidence: 99%