The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.
Paralleling the diversity of genetic and protein activities, pathologic human tissues also exhibit diverse radiographic features. Here we show that dynamic imaging traits in noninvasive computed tomography (CT) systematically correlate with the global gene expression programs of primary human liver cancer. Combinations of twenty-eight imaging traits can reconstruct 78% of the global gene expression profiles, revealing cell proliferation, liver synthetic function, and patient prognosis. Thus, genomic activity of human liver cancers can be decoded by noninvasive imaging, thereby enabling noninvasive, serial and frequent molecular profiling for personalized medicine.
Effective diagnosis and surveillance of complex multi-factorial disorders such as cancer can be improved by screening of easily accessible biomarkers. Highly stable cell-free circulating nucleic acids (CNA), present as both RNA and DNA species, have been discovered in the blood and plasma of humans. Correlations between tumor-associated genomic/epigenetic/transcriptional changes and alterations in CNA levels are strong predictors of the utility of this biomarker class as promising clinical indicators. Towards this goal, microRNAs (miRNAs), a class of naturally occurring small non-coding RNAs of 19–25 nt in length, have emerged as an important set of markers whose specific expression profiles can be associated with cancer development. In this study we investigate some of the pre-analytic considerations for isolating plasma fractions for the study of miRNA biomarkers. We find that measurements of circulating miRNA levels are frequently confounded by varying levels of cellular miRNAs of different hematopoietic origins. In order to assess the relative proportions of this cell-derived class, we have fractionated whole blood into plasma and its ensuing sub-fractions. Cellular miRNA signatures in cohorts of normal individuals are catalogued, and the abundance and gender-specific expression of bona fide circulating markers are explored after calibrating the signal for this interfering class. A map of differentially expressed profiles is presented, and the intrinsic variability of circulating miRNA species is investigated in subsets of healthy males and females.
The success of genome-wide association studies has paralleled the development of efficient genotyping technologies. We describe the development of a next-generation microarray based on the new, highly efficient Affymetrix Axiom genotyping technology that we are using to genotype individuals of European ancestry from the Kaiser Permanente Research Program on Genes, Environment and Health (RPGEH). The array contains 674,517 SNPs, and provides excellent genome-wide as well as gene-based and candidate-SNP coverage. Coverage was calculated using an approach based on imputation and cross validation. Preliminary results for the first 80,301 saliva-derived DNA samples from the RPGEH demonstrate very high-quality genotypes, with sample success rates above 94% and over 98% of successful samples having SNP call rates exceeding 98%. At steady state, we have produced 462 million genotypes per week for each Axiom system. The new array provides a valuable addition to the repertoire of tools for large-scale genome-wide association studies.
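The quality figures quoted above (per-sample SNP call rate, and the share of samples whose call rate exceeds 98%) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the use of `None` to mark a missing call, and the toy data are all assumptions made for illustration. The throughput helper additionally assumes that every sample is genotyped at all 674,517 SNPs on the array.

```python
from typing import Optional, Sequence

# Hypothetical encoding: 0, 1 or 2 = genotype call, None = no call.
Genotype = Optional[int]


def snp_call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of SNPs with a non-missing genotype call for one sample."""
    if not genotypes:
        raise ValueError("sample has no genotyped SNPs")
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)


def fraction_above_threshold(
    samples: Sequence[Sequence[Genotype]], threshold: float = 0.98
) -> float:
    """Share of samples whose SNP call rate strictly exceeds `threshold`."""
    if not samples:
        raise ValueError("no samples supplied")
    rates = [snp_call_rate(s) for s in samples]
    return sum(r > threshold for r in rates) / len(rates)


def samples_per_week(genotypes_per_week: float, snps_per_array: int) -> float:
    """Implied weekly sample throughput if each sample yields one genotype per SNP."""
    return genotypes_per_week / snps_per_array


if __name__ == "__main__":
    # Toy data only: two samples with four SNPs each.
    toy = [[0, 1, 2, None], [0, 0, 1, 1]]
    print([snp_call_rate(s) for s in toy])
    print(fraction_above_threshold(toy))
    # Figures taken from the abstract: 462 million genotypes/week, 674,517 SNPs.
    print(round(samples_per_week(462_000_000, 674_517)))
```

Under the stated assumption, the abstract's throughput works out to roughly 685 samples per week per Axiom system.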