Microarray and deep sequencing technologies have provided unprecedented opportunities for mapping genome mutations, RNA transcripts, transcription factor binding, and histone modifications at high resolution and at the genome-wide level. This has revolutionized the way in which transcriptomes, regulatory networks, and epigenetic regulation are studied, and has generated large amounts of heterogeneous data. Although efforts are being made to integrate these datasets efficiently and without bias, how best to do so remains a challenge. Here we review the major impacts of high-throughput genome-wide data generation, their relevance to human diseases, and various bioinformatics approaches for data integration. Finally, we provide a case study on inflammatory diseases.
genomics, epigenomics, phenomics, integration, data analysis
Citation: Yang L, Wei G, Tang K, et al. Understanding human diseases with high-throughput quantitative measurement and analysis of molecular signatures. Sci China Life Sci, 2013, 56: 213-219, doi: 10.1007
Twelve years ago, the unveiling of the first human reference genome sequence [1,2] inspired researchers to believe that genome-based discoveries would revolutionize the study and clinical treatment of human diseases. As genome sequences from different individuals became available, comparative genomics using computational approaches emerged as a powerful method for understanding gene functions at the genome-wide level. These approaches unveiled more variation between individuals than was initially expected [3]. Genomic variations, including single nucleotide polymorphisms (SNPs) and insertions and deletions (indels), responsible for some hereditary diseases have been identified, and the genomes of thousands of individuals have been examined for correlations between the presence of variants and traits of interest [4]. Microarrays were used first, followed by exon sequencing, and whole genome sequencing has now become a popular tool [5,6]. To date, many variations at numerous sites in the genome have been successfully connected with different human diseases, including various types of cancer [6]. This has been made possible by DNA sequencing technology, which underwent a 14000-fold drop in cost between 1999 and 2009 [7], and by computational imputation methods [8].
Although most studies have focused on the connection between genomic variations (both common and rare) and human diseases, the mechanisms underlying many of these DNA variations have not been clearly addressed. Genome information alone is not sufficient to interpret complex diseases [9]. Evidence at the epigenome, post-transcriptome, and even human microbiome levels is beginning to shed new light on human disease-related studies beyond the genome level.