Background: Receiver operating characteristic (ROC) curves are useful tools to evaluate classifiers in biomedical and bioinformatics applications. However, conclusions are often reached through inconsistent use or insufficient statistical analysis. To support researchers in their ROC curve analysis we developed pROC, a package for R and S+ that contains a set of tools for displaying, analyzing, smoothing and comparing ROC curves in a user-friendly, object-oriented and flexible interface. Results: With data previously imported into the R or S+ environment, the pROC package builds ROC curves and includes functions for computing confidence intervals, statistical tests for comparing total or partial area under the curve or the operating points of different classifiers, and methods for smoothing ROC curves. Intermediary and final results are visualised in user-friendly interfaces. A case study based on published clinical and biomarker data shows how to perform a typical ROC analysis with pROC. Conclusions: pROC is a package for R and S+ specifically dedicated to ROC analysis. It offers multiple statistical tests to compare ROC curves, and in particular partial areas under the curve, allowing proper ROC interpretation. pROC is available in two versions: in the R programming language or with a graphical user interface in the S+ statistical software. It is accessible at http://expasy.org/tools/pROC/ under the GNU General Public License. It is also distributed through the CRAN and CSAN public repositories, facilitating its installation.
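To make the quantities named in this abstract concrete, the following is a minimal Python sketch of how an empirical ROC curve and its total area under the curve can be computed from labels and scores. It is not pROC's code or API: the function names `roc_points` and `auc` are hypothetical, the trapezoidal rule is assumed for the area, and confidence intervals, smoothing, partial areas and statistical tests are not covered.

```python
from typing import List, Sequence, Tuple


def roc_points(labels: Sequence[int], scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) pairs.

    Assumes labels are 1 (case) or 0 (control) and that a higher score
    means "more likely a case". One point is emitted per distinct score.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one control")

    # Walk through the observations from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every observation tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Total area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

For example, `auc(roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))` evaluates to 0.75 on this made-up four-sample input.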
A new 6-plex isobaric mass tagging technology is presented, and proof of principle studies are carried out using standard protein mixtures and human cerebrospinal fluid (CSF) samples. The Tandem Mass Tags (TMT) comprise a set of structurally identical tags which label peptides on free amino-terminus and epsilon-amino functions of lysine residues. During MS/MS fragmentation, quantification information is obtained through the losses of the reporter ions. After evaluation of the relative quantification with the 6-plex version of the TMT on a model protein mixture at various concentrations, the quantification of proteins in CSF samples was performed using shotgun methods. Human postmortem (PM) CSF was taken as a model of massive brain injury and comparison was carried out with antemortem (AM) CSF. After immunoaffinity depletion, triplicates of AM and PM CSF pooled samples were reduced, alkylated, digested by trypsin, and labeled, respectively, with the six isobaric variants of the TMT (with reporter ions from m/z = 126.1 to 131.1 Th). The samples were pooled and fractionated by SCX chromatography. After RP-LC separation, peptides were identified and quantified by MS/MS analysis with MALDI TOF/TOF and ESI-Q-TOF. The concentration of 78 identified proteins was shown to be clearly increased in PM CSF samples compared to AM. Some of these proteins, like GFAP, protein S100B, and PARK7, have been previously described as brain damage biomarkers, supporting the PM CSF as a valid model of brain insult. ELISA for these proteins confirmed their elevated concentration in PM CSF. This work demonstrates the validity and robustness of the tandem mass tag (TMT) approach for quantitative MS-based proteomics.
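As an illustration of how reporter-ion intensities can be turned into a relative quantification, here is a minimal Python sketch. It is not the authors' pipeline: the assignment of the three AM replicates to the nominal 126 to 128 channels and the three PM replicates to 129 to 131 is an assumption made for the example, as are the function names and the choice of a median of per-spectrum log2 ratios as the protein-level summary.

```python
import math
from statistics import median
from typing import Dict, Sequence

# Hypothetical channel layout (nominal reporter m/z); the abstract does not
# state which replicate was labeled with which tag.
AM_CHANNELS = (126, 127, 128)
PM_CHANNELS = (129, 130, 131)


def spectrum_log2_ratio(intensities: Dict[int, float]) -> float:
    """log2(PM / AM) for one MS/MS spectrum, summing the replicate channels."""
    am = sum(intensities[channel] for channel in AM_CHANNELS)
    pm = sum(intensities[channel] for channel in PM_CHANNELS)
    if am <= 0 or pm <= 0:
        raise ValueError("reporter intensities must be positive in both groups")
    return math.log2(pm / am)


def protein_log2_ratio(spectra: Sequence[Dict[int, float]]) -> float:
    """Summarize a protein as the median log2 ratio over its spectra."""
    return median(spectrum_log2_ratio(s) for s in spectra)
```

With made-up intensities in which every PM channel is four times every AM channel, `protein_log2_ratio` returns 2.0, that is, a four-fold increase in PM relative to AM.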
Quantitative comparison of the protein content of biological samples is a fundamental tool of research. The TMT and iTRAQ isobaric labeling technologies allow the comparison of 2, 4, 6, or 8 samples in one mass spectrometric analysis. Sound statistical models that scale with the most advanced mass spectrometry (MS) instruments are essential for their efficient use. Through the application of robust statistical methods, we developed models that capture variability from individual spectra to biological samples. Classical experimental designs with a distinct sample in each channel as well as the use of replicates in multiple channels are integrated into a single statistical framework. We have prepared complex test samples including controlled ratios ranging from 100:1 to 1:100 to characterize the performance of our method. We demonstrate its application to actual biological data sets originating from three different laboratories and MS platforms. Finally, test data and an R package, named isobar, which can read Mascot, Phenyx, and mzIdentML files, are made available. The isobar package can also be used as an independent software that requires very little or no R programming skills.
This study supports the concept that modifications of the tear proteome can reflect biological abnormalities associated with multiple sclerosis and perhaps other inflammatory conditions affecting the CNS. In addition, alpha-1 antichymotrypsin elevation in tear fluid emerges as a promising biomarker for the diagnosis of multiple sclerosis.
Proteomic technologies, such as yeast two-hybrid, mass spectrometry (MS), protein/peptide arrays and fluorescence microscopy, yield multi-dimensional data sets, which are often quite large and either not published or published as supplementary information that is not easily searchable. Without a system in place for standardizing and sharing data, it is not fruitful for the biomedical community to contribute these types of data to centralized repositories. Even more difficult is the annotation and display of pertinent information in the context of the corresponding proteins. Wikipedia, an online encyclopedia that anyone can edit, has already proven quite successful1 and can be used as a model for sharing biological data. However, the need for experimental evidence, data standardization and ownership of data creates scientific obstacles. Here, we describe Human Proteinpedia (http://www.humanproteinpedia.org/) as a portal that overcomes many of these obstacles to provide an integrated view of the human proteome. Human Proteinpedia also allows users to contribute and edit proteomic data with two significant differences from Wikipedia: first, the contributor is expected to provide experimental evidence for the data annotated; and second, only the original contributor can edit their data. Human Proteinpedia's annotation system provides investigators with multiple options for contributing data including web forms and annotation servers. Although registration is required to contribute data, anyone can freely access the data in the repository. The web forms simplify submission through the use of pull-down menus for certain data fields and pop-up menus for standardized vocabulary terms. Distributed annotation servers using modified protein DAS (distributed annotation system) protocols developed by us (DAS protocols were originally developed for sharing mRNA and DNA data) permit contributing laboratories to maintain protein annotations locally.
All protein annotations are visualized in the context of corresponding proteins in the Human Protein Reference Database (HPRD)3. Figure 1 shows tissue expression data for alpha-2-HS glycoprotein derived from three different types of experiments. Our unique effort differs significantly from existing repositories, such as PeptideAtlas and PRIDE5, in several respects. First, most proteomic repositories are restricted to one or two experimental platforms, whereas Human Proteinpedia can accommodate data from diverse platforms, including yeast two-hybrid screens, MS, peptide/protein arrays, immunohistochemistry, western blots, coimmunoprecipitation and fluorescence microscopy-type experiments. Second, Human Proteinpedia allows contributing laboratories to annotate data pertaining to six features of proteins (posttranslational modifications, tissue expression, cell line expression, subcellular localization, enzyme substrates and protein-protein interactions). No existing repository currently permits annotation of all these features in proteins. Third, all data submitted to Human Proteinpedia...