Mass-spectrometry-based proteomics has become an important component of biological research. Numerous proteomics methods have been developed to identify and quantify the proteins in biological and clinical samples [1], identify pathways affected by endogenous and exogenous perturbations [2], and characterize protein complexes [3]. Despite these successes, the interpretation of vast proteomics datasets remains a challenge. There have been several calls for improvements and standardization of proteomics data analysis frameworks, as well as for an application-programming interface for proteomics data access [4,5]. In response, we have developed the ProteoWizard Toolkit, a robust set of open-source software libraries and applications designed to facilitate proteomics research. The libraries implement the first-ever non-commercial, unified data access interface for proteomics, bridging field-standard open formats and all common vendor formats. In addition, diverse software classes enable rapid development of vendor-agnostic proteomics software. Finally, ProteoWizard projects and applications, building upon the core libraries, are becoming standard tools for enabling significant proteomics inquiries.
Autism Genes, Again and Again
Despite recent advances in sequencing technologies and their lowered costs, effective, highly sensitive, and specific sequencing of multiple genes of interest from large cohorts remains expensive. O'Roak et al. (p. 1619; published online 15 November) modified molecular inversion probe methods for target-specific capture and sequencing to resequence candidate genes in thousands of patients. The technique was applied to 44 candidate genes to identify de novo mutations in a large cohort of individuals with and without autism spectrum disorder. The analysis revealed several de novo mutations in genes that together contribute to 1% of sporadic autism spectrum disorders, supporting the notion that multiple genes underlie autism spectrum disorders.
Data-independent acquisition (DIA) mass spectrometry is a powerful technique that is improving the reproducibility and throughput of proteomics studies. Here, we introduce an experimental workflow that uses this technique to construct chromatogram libraries that capture fragment ion chromatographic peak shape and retention time for every detectable peptide in a proteomics experiment. These coordinates calibrate protein databases or spectrum libraries to a specific mass spectrometer and chromatography setup, facilitating DIA-only pipelines and the reuse of global resource libraries. We also present EncyclopeDIA, a software tool for generating and searching chromatogram libraries, and demonstrate the performance of our workflow by quantifying proteins in human and yeast cells. We find that by exploiting calibrated retention time and fragmentation specificity in chromatogram libraries, EncyclopeDIA can detect 20–25% more peptides from DIA experiments than with data-dependent acquisition-based spectrum libraries alone.
In mass spectrometry-based proteomics, data-independent acquisition (DIA) strategies can acquire a single dataset useful for both identification and quantification of detectable peptides in a complex mixture. Despite this, DIA is often overlooked because its data are noisier, the result of a typical five- to ten-fold reduction in precursor selectivity compared to data-dependent acquisition or selected reaction monitoring. We demonstrate a multiplexing technique that improves precursor selectivity five-fold.
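The selectivity figures above can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: it assumes that precursor selectivity is inversely proportional to isolation window width, and it uses a hypothetical 2 m/z data-dependent isolation width as the baseline. The function name and all numeric inputs other than the five- to ten-fold and five-fold factors are assumptions.

```python
def effective_isolation_width(dda_width_mz, dia_reduction_fold, multiplex_gain_fold=1.0):
    """Return an effective precursor isolation width in m/z.

    Assumes selectivity scales inversely with isolation width, so a k-fold
    loss of selectivity widens the window k times and a k-fold gain
    narrows it k times. This is a simplification for illustration.
    """
    if dda_width_mz <= 0 or dia_reduction_fold <= 0 or multiplex_gain_fold <= 0:
        raise ValueError("all arguments must be positive")
    return dda_width_mz * dia_reduction_fold / multiplex_gain_fold


# Hypothetical baseline: a 2 m/z data-dependent isolation window (assumed value).
DDA_WIDTH_MZ = 2.0

# The abstract quotes a five- to ten-fold selectivity reduction for DIA
# and a five-fold improvement from multiplexing.
for reduction in (5, 10):
    plain = effective_isolation_width(DDA_WIDTH_MZ, reduction)
    multiplexed = effective_isolation_width(DDA_WIDTH_MZ, reduction, multiplex_gain_fold=5)
    print(f"{reduction}-fold reduction: plain DIA {plain} m/z, multiplexed {multiplexed} m/z")
```

Under these assumptions, a five-fold gain from multiplexing brings the effective width back to between one and two times the data-dependent baseline.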