The availability of the human genome sequence has transformed biomedical research over the past decade. However, an equivalent map for the human proteome with direct measurements of proteins and peptides does not exist yet. Here, we present a draft map of the human proteome using high resolution Fourier transform mass spectrometry. In-depth proteomic profiling of 30 histologically normal human samples including 17 adult tissues, 7 fetal tissues and 6 purified primary hematopoietic cells resulted in identification of proteins encoded by 17,294 genes accounting for ~84% of the total annotated protein-coding genes in humans. A unique and comprehensive strategy for proteogenomic analysis enabled us to discover a number of novel protein-coding regions, which include translated pseudogenes, non-coding RNAs and upstream ORFs. This large human proteome catalog (available as an interactive web-based resource at http://www.humanproteomemap.org) will complement available human genome and transcriptome data to accelerate biomedical research in health and disease.
Gingivo-buccal oral squamous cell carcinoma (OSCC-GB), an anatomical and clinical subtype of head and neck squamous cell carcinoma (HNSCC), is prevalent in regions where tobacco-chewing is common. Exome sequencing (n=50) and recurrence testing (n=60) reveal that some significantly and frequently altered genes are specific to OSCC-GB (USP9X, MLL4, ARID2, UNC13C and TRPM3), while some others are shared with HNSCC (for example, TP53, FAT1, CASP8, HRAS and NOTCH1). We also find new genes with recurrent amplifications (for example, DROSHA, YAP1) or homozygous deletions (for example, DDX3X) in OSCC-GB. We find a high proportion of C>G transversions among tobacco users with high numbers of mutations. Many pathways that are enriched for genomic alterations are specific to OSCC-GB. Our work reveals molecular subtypes with distinctive mutational profiles such as patients predominantly harbouring mutations in CASP8 with or without mutations in FAT1. Mean duration of disease-free survival is significantly elevated in some molecular subgroups. These findings open new avenues for biological characterization and exploration of therapies.
MicroRNAs are short RNAs that serve as regulators of gene expression and are essential components of normal development as well as modulators of disease. MicroRNAs generally act cell-autonomously, and thus their localization to specific cell types is needed to guide our understanding of microRNA activity. Current tissue-level data have caused considerable confusion, and comprehensive cell-level data do not yet exist. Here, we establish the landscape of human cell-specific microRNA expression. This project evaluated 8 billion small RNA-seq reads from 46 primary cell types, 42 cancer or immortalized cell lines, and 26 tissues. It identified both specific and ubiquitous patterns of expression that strongly correlate with adjacent superenhancer activity. Analysis of unaligned RNA reads uncovered 207 unknown minor strand (passenger) microRNAs of known microRNA loci and 495 novel putative microRNA loci. Although cancer cell lines generally recapitulated the expression patterns of matched primary cells, their isomiR sequence families exhibited increased disorder, suggesting DROSHA- and DICER1-dependent microRNA processing variability. Cell-specific patterns of microRNA expression were used to de-convolute variable cellular composition of colon and adipose tissue samples, highlighting one use of these cell-specific microRNA expression data. Characterization of cellular microRNA expression across a wide variety of cell types provides a new understanding of this critical regulatory RNA species.
Background: The vitreous humor is a transparent, gelatinous mass whose main constituent is water. It plays an important role in providing metabolic nutrient requirements of the lens, coordinating eye growth and providing support to the retina. It is in close proximity to the retina and reflects many of the changes occurring in this tissue. The biochemical changes occurring in the vitreous could provide a better understanding about the pathophysiological processes that occur in vitreoretinopathy. In this study, we investigated the proteome of normal human vitreous humor using high resolution Fourier transform mass spectrometry. Results: The vitreous humor was subjected to multiple fractionation techniques followed by LC-MS/MS analysis. We identified 1,205 proteins, 682 of which have not been described previously in the vitreous humor. Most proteins were localized to the extracellular space (24%), cytoplasm (20%) or plasma membrane (14%). Classification based on molecular function showed that 27% had catalytic activity, 10% structural activity, 10% binding activity, 4% cell and 4% transporter activity. Categorization for biological processes showed 28% participate in metabolism, 20% in cell communication and 13% in cell growth. The data have been deposited to the ProteomeXchange with identifier PXD000957. Conclusion: This large catalog of vitreous proteins should facilitate biomedical research into pathological conditions of the eye including diabetic retinopathy, retinal detachment and cataract.
Complementing genome sequence with deep transcriptome and proteome data could enable more accurate assembly and annotation of newly sequenced genomes. Here, we provide a proof-of-concept of an integrated approach for analysis of the genome and proteome of Anopheles stephensi, which is one of the most important vectors of the malaria parasite. To achieve broad coverage of genes, we carried out transcriptome sequencing and deep proteome profiling of multiple anatomically distinct sites. Based on transcriptomic data alone, we identified and corrected 535 events of incomplete genome assembly involving 1196 scaffolds and 868 protein-coding gene models. This proteogenomic approach enabled us to add 365 genes that were missed during genome annotation and identify 917 gene correction events through discovery of 151 novel exons, 297 protein extensions, 231 exon extensions, 192 novel protein start sites, 19 novel translational frames, 28 events of joining of exons, and 76 events of joining of adjacent genes as a single gene. Incorporation of proteomic evidence allowed us to change the designation of more than 87 predicted “noncoding RNAs” to conventional mRNAs coded by protein-coding genes. Importantly, extension of the newly corrected genome assemblies and gene models to 15 other newly assembled Anopheline genomes led to the discovery of a large number of apparent discrepancies in assembly and annotation of these genomes. Our data provide a framework for how future genome sequencing efforts should incorporate transcriptomic and proteomic analysis in combination with simultaneous manual curation to achieve near complete assembly and accurate annotation of genomes.