Recent advances in sequencing technologies have initiated an era of personal genome sequences. To date, human genome sequences have been reported for individuals with ancestry in three distinct geographical regions: a Yoruba African, two individuals of north-west European origin, and a person from China1–4. Here we provide a highly annotated, whole-genome sequence for a Korean individual, known as AK1. The genome of AK1 was determined by an exacting, combined approach that included whole-genome shotgun sequencing (27.8× coverage), targeted bacterial artificial chromosome sequencing, and high-resolution comparative genomic hybridization using custom microarrays featuring more than 24 million probes. Alignment to the NCBI reference, a composite of several ethnic clades5,6, disclosed nearly 3.45 million single nucleotide polymorphisms (SNPs), including 10,162 non-synonymous SNPs, and 170,202 deletion or insertion polymorphisms (indels). SNP and indel densities were strongly correlated genome-wide. Applying very conservative criteria yielded highly reliable copy number variants for clinical considerations. Potential medical phenotypes were annotated for non-synonymous SNPs, coding domain indels, and structural variants. The integration of several human whole-genome sequences derived from several ethnic groups will assist in understanding genetic ancestry, migration patterns and population bottlenecks.
Massively parallel sequencing technologies have identified a broad spectrum of human genome diversity. Here we deep sequenced and correlated 18 genomes and 17 transcriptomes of unrelated Korean individuals. This has allowed us to construct a genome-wide map of common and rare variants and also identify variants formed during DNA-RNA transcription. We identified 9.56 million genomic variants, 23.2% of which appear to be previously unidentified. From transcriptome sequencing, we discovered 4,414 transcripts not previously annotated. Finally, we revealed 1,809 sites of transcriptional base modification, where the transcriptional landscape is different from the corresponding genomic sequences, and 580 sites of allele-specific expression. Our findings suggest that a considerable number of unexplored genomic variants still remain to be identified in the human genome, and that the integrated analysis of genome and transcriptome sequencing is powerful for understanding the diversity and functional aspects of human genomic variants.
Reprogramming of somatic cells to induced pluripotent stem cells involves a dynamic rearrangement of the epigenetic landscape. To characterize this epigenomic roadmap, we have performed MethylC-seq, ChIP-seq (H3K4/K27/K36me3) and RNA-Seq on samples taken at several time points during murine secondary reprogramming as part of Project Grandiose. We find that DNA methylation gain during reprogramming occurs gradually, while loss is achieved only at the ESC-like state. Binding sites of activated factors exhibit focal demethylation during reprogramming, while ESC-like pluripotent cells are distinguished by extension of demethylation to the wider neighbourhood. We observed that genes with CpG-rich promoters demonstrate stable low methylation and strong engagement of histone marks, whereas genes with CpG-poor promoters are safeguarded by methylation. Such DNA methylation-driven control is the key to the regulation of ESC-pluripotency genes, including Dppa4, Dppa5a and Esrrb. These results reveal the crucial role that DNA methylation plays as an epigenetic switch driving somatic cells to pluripotency.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.