The effects of shared information on semantic calculations in the gene ontology

Bible, Paul W.; Sun, Hongwei; Morasso, María I.; Loganantharaj, Rasiah; Wei, Lai

doi:10.1016/j.csbj.2017.01.009

Cited by 4 publications

(6 citation statements)

References 47 publications


“…Retrieving, reproducing and reusing SS scores for any ontology in any application is still challenging [39,44,54]. This mainly due to the lack of a tool that exhaustively implements existing SS models and related assumptions to produce consistent scores on demand and in real-time for use in related applications and for testing hypotheses.…”

Section: Assessing SS Score Integrity (mentioning)

confidence: 99%

“…Symbols of different measures used in some existing tools and PySML are shown in Supplementary File 1 (see Table 1) and we refer readers to Supplementary File 2 (see Appendix 2) where complete descriptions and algebraic forms of all these measures are provided. Best performance often results from trading between accuracy and computational speed, PySML implements all these measures, except those which are known to be computationally unattractive with high disproportion between computational complexity and performance improvement, e.g., measures built on graph-based similarity measure (GRASM) [12,54]. However, PySML offers a platform that may be used to easily develop, test and assess any measures, so that users who are interested in these measures, for example, can process them within the platform.…”

Section: Different Existing Semantic Similarity Measures (mentioning)

confidence: 99%

“…The high scores produced by SML is mainly caused by the contribution of the root of the ontology in SS computation as proteins sharing only the ontology root have a score greater than 0. This suggests that SML overestimates SS scores by considering the root as an informative ontology concept, which ultimately biases these scores [38], thus negatively impacting the performance of these SS models [38,54].…”

Section: Assessing SS Score Integrity (mentioning)

confidence: 99%
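The root-contribution effect described in the statement above can be illustrated with a minimal sketch. It assumes a toy two-term ontology and a simple ancestor-overlap score; the names `ancestors`, `overlap_score`, `TOY_PARENTS`, and the `ignore_root` parameter are hypothetical and are not taken from SML or PySML.

```python
def ancestors(term, parents):
    """Return the term together with all of its ancestors in a toy DAG."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def overlap_score(a, b, parents, ignore_root=None):
    """Shared ancestors divided by all ancestors (a Jaccard-style score).

    If ignore_root is given, that term is treated as uninformative
    and dropped from both ancestor sets before scoring.
    """
    anc_a, anc_b = ancestors(a, parents), ancestors(b, parents)
    if ignore_root is not None:
        anc_a = anc_a - {ignore_root}
        anc_b = anc_b - {ignore_root}
    union = anc_a | anc_b
    return len(anc_a & anc_b) / len(union) if union else 0.0


# Hypothetical ontology: A and B share nothing except the root.
TOY_PARENTS = {"A": ["root"], "B": ["root"]}

with_root = overlap_score("A", "B", TOY_PARENTS)
without_root = overlap_score("A", "B", TOY_PARENTS, ignore_root="root")
```

In this toy case the score is positive when the root counts as shared information and drops to zero once the root is excluded, which is the kind of bias the statement attributes to treating the root as an informative concept.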


An Integrated Platform Supporting Semantic Similarity Score Calculation and Reproducibility

Mazandu, Opap, Makinde, et al. 2021

Preprint


During the last decade, we witnessed an exponential rise of datasets from heterogeneous sources. Ontologies are playing an essential role in consistently describing domain concepts, data harmonization and integration to support large-scale integrative analysis and semantic interoperability in knowledge sharing. Several semantic similarity (SS) measures have been suggested to enable the integration of rich ontology structures into automated reasoning and inference. However, there is no tool that exhaustively implements these measures and existing tools are generally Gene Ontology specific, do not implement several models suggested in the WordNet context and are not equipped to properly deal with frequent ontology updates. We introduce a Python SS measure library (PySML), which tackles issues related to current SS tools, providing a portable and expandable tool to a broad computational audience. This empowers users to manipulate SS scores from several applications for any ontology version and file format. PySML is a flexible tool enabling the implementation of all existing semantic similarity models, resolving issues related to computation, reproducibility and re-usability of SS scores.


“…The effects of the shared information for the semantic similarity calculation were discussed in [41]. The shared information of a term pair is the common inheritance relations extracted from the structure of the GO graph.…”

Section: Introduction (mentioning)

confidence: 99%
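As a rough illustration of shared information as common inheritance in the graph, the sketch below collects the common ancestors of two terms in a toy ontology and scores them by their most informative common ancestor, one common choice associated with Resnik similarity. The term names, the annotation frequencies in `FREQ`, and the helper functions are all hypothetical; the excerpt does not specify which shared information methods were compared.

```python
import math

# Hypothetical toy ontology: each term maps to its direct parents.
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "rna_binding": ["binding"],
}

# Hypothetical annotation frequencies; the root covers every annotation.
FREQ = {
    "root": 1.0,
    "binding": 0.5,
    "catalysis": 0.5,
    "dna_binding": 0.2,
    "rna_binding": 0.25,
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def common_ancestors(a, b):
    """Shared information of a term pair, taken here as the common ancestors."""
    return ancestors(a) & ancestors(b)


def information_content(term):
    """IC = log(1 / p(term)); the root has p = 1 and therefore IC = 0."""
    return math.log(1.0 / FREQ[term])


def resnik(a, b):
    """Score a pair by its most informative common ancestor."""
    return max(information_content(t) for t in common_ancestors(a, b))


siblings = resnik("dna_binding", "rna_binding")   # shared parent: "binding"
root_only = resnik("dna_binding", "catalysis")    # only the root is shared
```

Because the root carries zero information content under this definition, a pair that shares only the root scores zero, while the two sibling terms score the information content of their shared parent.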

“…Experiments of three different methods calculating the term similarity, each with five shared information methods, were done on three ontologies across six benchmarks. Among the choice of shared information, term similarity algorithm, and ontology type, the choice of ontology type most strongly influenced the performance, and shared information type had the least influence [41].…”

Section: Introduction (mentioning)

confidence: 99%

An improved approach to infer protein-protein interaction based on a hierarchical vector space model

Zhang, Jia, et al. 2018

BMC Bioinformatics


Background: Comparing and classifying functions of gene products are important in today’s biomedical research. The semantic similarity derived from the Gene Ontology (GO) annotation has been regarded as one of the most widely used indicators for protein interaction. Among the various approaches proposed, those based on the vector space model are relatively simple, but their effectiveness is far from satisfying.

Results: We propose a Hierarchical Vector Space Model (HVSM) for computing semantic similarity between different genes or their products, which enhances the basic vector space model by introducing the relation between GO terms. Besides the directly annotated terms, HVSM also takes their ancestors and descendants related by “is_a” and “part_of” relations into account. Moreover, HVSM introduces the concept of a Certainty Factor to calibrate the semantic similarity based on the number of terms annotated to genes. To assess the performance of our method, we applied HVSM to Homo sapiens and Saccharomyces cerevisiae protein-protein interaction datasets. Compared with TCSS, Resnik, and other classic similarity measures, HVSM achieved significant improvement for distinguishing positive from negative protein interactions. We also tested its correlation with sequence, EC, and Pfam similarity using online tool CESSM.

Conclusions: HVSM showed an improvement of up to 4% compared to TCSS, 8% compared to IntelliGO, 12% compared to basic VSM, 6% compared to Resnik, 8% compared to Lin, 11% compared to Jiang, 8% compared to Schlicker, and 11% compared to SimGIC using AUC scores. CESSM test showed HVSM was comparable to SimGIC, and superior to all other similarity measures in CESSM as well as TCSS. Supplementary information and the software are available at https://github.com/kejia1215/HVSM.


Investigating changes of proteome in the bovine milk serum after retort processing using proteomics techniques

Wei, Kang, Liao, et al. 2021

Food Science & Nutrition


The objective of this study was to investigate the changes of the proteins in bovine milk serum after retort processing by label‐free quantification proteomics techniques. A total of 96 and 106 proteins were quantified in the control group (CG) and the retort group (RG), respectively. Hierarchical clustering analysis of the identified milk serum proteins showed a decrease in the abundance of most proteins, such as serum albumin, lactoperoxidase, lactotransferrin, and complement C3, and an increase in the abundance of other proteins such as κ‐casein, lipocalin 2, and perilipin. Student's t‐test showed that 21 proteins differed significantly in abundance between CG and RG (p < .05), of which the intensity‐based absolute quantification (iBAQ) of five proteins decreased and the iBAQ of 16 proteins increased. Bioinformatics analysis demonstrated that retort processing increased the digestibility of proteins, but this improvement was offset by a decrease in the digestibility of proteins caused by protein modification. Our results provide insight into the proteome of retort sterilized milk for the first time. Given the extremely high security of retort sterilized milk, the changes in the bovine milk serum proteome after retort sterilization revealed in this study will contribute to the formula design of retort sterilized milk products.
