Here, we present the new UCL Bioinformatics Group’s PSIPRED Protein Analysis Workbench. The Workbench unites all of our previously available analysis methods into a single web-based framework. The new web portal provides a greatly streamlined user interface with a number of new features to allow users to better explore their results. We offer a number of additional services to enable computationally scalable execution of our prediction methods; these include SOAP and XML-RPC web server access and new HADOOP packages. All software and services are available via the UCL Bioinformatics Group website at http://bioinf.cs.ucl.ac.uk/.
Background: A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging. Results: We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regard to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2. Conclusions: The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology specific, that different performance metrics can be used to probe the nature of accurate predictions, and the relative diversity of predictions in the biological process and human phenotype ontologies. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent. Electronic supplementary material: The online version of this article (doi:10.1186/s13059-016-1037-6) contains supplementary material, which is available to authorized users.
Whole-genome sequencing (WGS) permits comprehensive cancer genome analyses, revealing mutational signatures, imprints of DNA damage, and repair processes that have arisen in each patient’s cancer. We performed mutational signature analyses on 12,222 whole-genome–sequenced tumor-normal matched pairs from patients recruited via the UK National Health Service (NHS). We contrasted our results with two independent cancer WGS datasets—from the International Cancer Genome Consortium (ICGC) and the Hartwig Medical Foundation (HMF)—involving 18,640 whole-genome–sequenced cancers in total. Our analyses add 40 single and 18 double substitution signatures to the current mutational signature tally. We show for each organ that cancers have a limited number of common signatures and a long tail of rare signatures, and we provide a practical solution for applying this concept of common versus rare signatures to future analyses.
Summary: Loss of cone photoreceptors, crucial for daylight vision, has the greatest impact on sight in retinal degeneration. Transplantation of stem cell-derived L/M-opsin cones, which form 90% of the human cone population, could provide a feasible therapy to restore vision. However, transcriptomic similarities between fetal and stem cell-derived cones remain to be defined, in addition to development of cone cell purification strategies. Here, we report an analysis of the human L/M-opsin cone photoreceptor transcriptome using an AAV2/9.pR2.1:GFP reporter. This led to the identification of a cone-enriched gene signature, which we used to demonstrate similar gene expression between fetal and stem cell-derived cones. We then defined a cluster of differentiation marker combination that, when used for cell sorting, significantly enriches for cone photoreceptors from the fetal retina and stem cell-derived retinal organoids, respectively. These data may facilitate more efficient isolation of human stem cell-derived cones for use in clinical transplantation studies.
Predicting protein function has been a major goal of bioinformatics for several decades, and it has gained fresh momentum thanks to recent community-wide blind tests aimed at benchmarking available tools on a genomic scale. Sequence-based predictors, especially those performing homology-based transfers, remain the most popular but increasing understanding of their limitations has stimulated the development of complementary approaches, which mostly exploit machine learning. Here we present FFPred 3, which is intended for assigning Gene Ontology terms to human protein chains, when homology with characterized proteins can provide little aid. Predictions are made by scanning the input sequences against an array of Support Vector Machines (SVMs), each examining the relationship between protein function and biophysical attributes describing secondary structure, transmembrane helices, intrinsically disordered regions, signal peptides and other motifs. This update features a larger SVM library that extends its coverage to the cellular component sub-ontology for the first time, prompted by the establishment of a dedicated evaluation category within the Critical Assessment of Functional Annotation. The effectiveness of this approach is demonstrated through benchmarking experiments, and its usefulness is illustrated by analysing the potential functional consequences of alternative splicing in human and their relationship to patterns of biological features.