2015
DOI: 10.1093/database/bav089

SORTA: a system for ontology-based re-coding and technical annotation of biomedical phenotype data

Abstract: There is an urgent need to standardize the semantics of biomedical data values, such as phenotypes, to enable comparative and integrative analyses. However, it is unlikely that all studies will use the same data collection protocols. As a result, retrospective standardization is often required, which involves matching of original (unstructured or locally coded) data to widely used coding or ontology systems such as SNOMED CT (clinical terms), ICD-10 (International Classification of Disease) and HPO (Human Phen…

Cited by 34 publications (31 citation statements)
References 18 publications
“…Furthermore, the processing and implementation of algorithms through Opal, support the transformation of study specific variables into common formats. Another framework, namely the System for Ontology-based Re-coding and Technical Annotation (SORTA) [49], has been developed for the annotation of biomedical and phenotype data. SORTA overcomes the problem of semantic heterogeneity and matches original data values to a target scheme.…”
Section: Definition Of the Target Variables And Data Processing (mentioning)
confidence: 99%
“…HPO terms were directly matched to UKBB phenotypes when phenotypes in both systems had similar terminology. The direct phenotype matching was conducted using a semi-automatic mapping system which combines semantic and lexical similarity between word [32] followed by manual curation. When the HPO terms were not present, we performed an indirect matching by hand to find in the UKBB, the phenotype that best reflects the target HPO terms.…”
Section: Phenotypes Of Target Syndromes (mentioning)
confidence: 99%
“…For example, the Biobanking and Rare disease communities will be given end-user tools that utilize/generate such FAIR infrastructures to: guide discovery by researchers; help both biobankers and researchers to re-code their data to standard ontologies building on the SORTA system (Pang et al, 2015); assist to extend the MOLGENIS/BiobankConnect system (Pang et al, 2016); add FAIR interfaces to the BBMRI (Biobanking and BioMolecular resources Research Infrastructure) and RD-connect national and European biobank data and sample catalogues. There are also a core group of FAIR infrastructure authors who are creating large-scale indexing and discovery systems that will facilitate the automated identification and retrieval of relevant information, from any repository, in response to end-user queries, portending a day when currently unused-"lost"-data deposits once again provide return-on-investment through their discovery and reuse.…”
Section: Incentives and Barriers To Implementation (mentioning)
confidence: 99%