Yuanping Du scite author profile

Whole-genome sequencing enables complete characterization of genetic variation, but geographic clustering of rare alleles demands many diverse populations be studied. Here we describe the Genome of the Netherlands (GoNL) Project, in which we sequenced the whole genomes of 250 Dutch parent-offspring families and constructed a haplotype map of 20.4 million single-nucleotide variants and 1.2 million insertions and deletions. The intermediate coverage (∼13×) and trio design enabled extensive characterization of structural variation, including midsize events (30-500 bp) previously poorly catalogued and de novo mutations. We demonstrate that the quality of the haplotypes boosts imputation accuracy in independent samples, especially for lower frequency alleles. Population genetic analyses demonstrate fine-scale structure across the country and support multiple ancient migrations, consistent with historical changes in sea level and flooding. The GoNL Project illustrates how single-population whole-genome sequencing can provide detailed characterization of genetic variation and may guide the design of future population studies.
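The trio design mentioned above is what makes de novo mutations detectable: an allele seen in the child but in neither parent is a candidate. The following is a minimal sketch of that idea, not the GoNL pipeline. It assumes diploid genotypes given as pairs of allele strings and ignores sequencing error, coverage, and genotype quality, which a real caller must model. The function names are hypothetical.

```python
from itertools import product


def is_mendelian_consistent(child, mother, father):
    """Return True if the child genotype can be formed by taking
    one allele from the mother and one from the father."""
    a, b = child
    return any(
        (a == m and b == f) or (a == f and b == m)
        for m, f in product(mother, father)
    )


def candidate_de_novo_alleles(child, mother, father):
    """Return the sorted child alleles that appear in neither parent.
    These are only candidates: genotyping errors produce the same pattern."""
    parental = set(mother) | set(father)
    return sorted(set(child) - parental)


# Hypothetical genotypes at one site (illustrative values, not GoNL data).
child, mother, father = ("A", "G"), ("A", "A"), ("A", "A")
print(is_mendelian_consistent(child, mother, father))
print(candidate_de_novo_alleles(child, mother, father))
```

In practice, most apparent Mendelian violations at ∼13× coverage come from genotyping error rather than true mutation, so a check like this is only a first filter.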


The Genome of the Netherlands: design, and project goals

Boomsma et al. 2013


Within the Netherlands a national network of biobanks has been established (Biobanking and Biomolecular Research Infrastructure-Netherlands (BBMRI-NL)) as a national node of the European BBMRI. One of the aims of BBMRI-NL is to enrich biobanks with different types of molecular and phenotype data. Here, we describe the Genome of the Netherlands (GoNL), one of the projects within BBMRI-NL. GoNL is a whole-genome-sequencing project in a representative sample consisting of 250 trio-families from all provinces in the Netherlands, which aims to characterize DNA sequence variation in the Dutch population. The parent–offspring trios include adult individuals ranging in age from 19 to 87 years (mean=53 years; SD=16 years) from birth cohorts 1910–1994. Sequencing was done on blood-derived DNA from uncultured cells and accomplished coverage was 14–15x. The family-based design represents a unique resource to assess the frequency of regional variants, accurately reconstruct haplotypes by family-based phasing, characterize short indels and complex structural variants, and establish the rate of de novo mutational events. GoNL will also serve as a reference panel for imputation in the available genome-wide association studies in Dutch and other cohorts to refine association signals and uncover population-specific variants. GoNL will create a catalog of human genetic variation in this sample that is uniquely characterized with respect to micro-geographic location and a wide range of phenotypes. The resource will be made available to the research and medical community to guide the interpretation of sequencing projects. The present paper summarizes the global characteristics of the project.
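Family-based phasing, as referred to above, rests on a simple observation: when only one parent can have supplied a given child allele, the parental origin of both child alleles is known. The sketch below illustrates this for a single site. It is an illustration under simplifying assumptions (error-free diploid genotypes given as allele pairs), not the method used by GoNL, and the function name is hypothetical.

```python
def phase_child_by_transmission(child, mother, father):
    """Return (maternal_allele, paternal_allele) for the child when the
    parental genotypes determine it uniquely; return None when the site
    is ambiguous (e.g. all three heterozygous) or Mendelian-inconsistent."""
    a, b = child
    options = set()
    # Try both assignments of the child's alleles to the two parents.
    for maternal, paternal in ((a, b), (b, a)):
        if maternal in mother and paternal in father:
            options.add((maternal, paternal))
    if len(options) == 1:
        return options.pop()
    return None


# Hypothetical genotypes (illustrative values, not GoNL data).
# The mother is homozygous A/A, so the child's G must be paternal.
print(phase_child_by_transmission(("A", "G"), ("A", "A"), ("A", "G")))
```

Sites where the child and both parents are heterozygous stay unresolved under this rule; resolving them requires additional information such as linkage to neighbouring phased sites.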


Whole-Genome Sequencing Uncovers the Genetic Basis of Chronic Mountain Sickness in Andean Highlanders

Zhou, Udpa, Ronen, et al. 2013

The American Journal of Human Genetics


The hypoxic conditions at high altitudes present a challenge for survival, causing pressure for adaptation. Interestingly, many high-altitude denizens (particularly in the Andes) are maladapted, with a condition known as chronic mountain sickness (CMS) or Monge disease. To decode the genetic basis of this disease, we sequenced and compared the whole genomes of 20 Andean subjects (10 with CMS and 10 without). We discovered 11 regions genome-wide with significant differences in haplotype frequencies consistent with selective sweeps. In these regions, two genes (an erythropoiesis regulator, SENP1, and an oncogene, ANP32D) had a higher transcriptional response to hypoxia in individuals with CMS relative to those without. We further found that downregulating the orthologs of these genes in flies dramatically enhanced survival rates under hypoxia, demonstrating that suppression of SENP1 and ANP32D plays an essential role in hypoxia tolerance. Our study provides an unbiased framework to identify and validate the genetic basis of adaptation to high altitudes and identifies potentially targetable mechanisms for CMS treatment.


IMonitor: A Robust Pipeline for TCR and BCR Repertoire Analysis

Zhang et al. 2015


The advance of next generation sequencing (NGS) techniques provides an unprecedented opportunity to probe the enormous diversity of the immune repertoire by deep sequencing T-cell receptors (TCRs) and B-cell receptors (BCRs). However, an efficient and accurate analytical tool is still in demand to process the huge amount of data. We have developed a high-resolution analytical pipeline, Immune Monitor ("IMonitor"), to tackle this task. This method utilizes realignment to identify V(D)J genes and alleles after common local alignment. We compare IMonitor with other published tools using simulated and public rearranged sequences, and it demonstrates superior performance in most aspects. In addition, a methodology is developed to correct PCR and sequencing errors and to minimize PCR bias among rearranged sequences with different V and J gene families. IMonitor adapts to sequences from all receptor chains of different species and outputs useful statistics and visualizations. In the final part of this article, we demonstrate its application to minimal residual disease detection in patients with B-cell acute lymphoblastic leukemia. In summary, this package should be widely useful for immune repertoire analysis.

KEYWORDS: next generation sequencing; bioinformatics; immune repertoire; TCR/BCR

The diversity of T-cell receptors (TCRs), B-cell receptors (BCRs), and secreted antibodies makes up the core of the complicated immune system and serves as a pivotal defensive component to protect the body against invading viruses, bacteria, and other pathogens. The TCR consists of a heterodimeric αβ chain (95%, TRA, TRB) or γδ chain (5%), while the BCR is assembled with two heavy chains (IGH) and two light chains (IGK or IGL). Structurally, each chain can be divided into the variable domain and the constant domain (Lefranc and Lefranc 2001a,b).

The diversity of the TCR and BCR repertoire is enormous, owing to the process of V(D)J gene rearrangement, random deletion of germline nucleotides, and insertion of a variable number of nontemplated nucleotides at the V-D and D-J junctions (TRB, IGH) or V-J junctions (TRA, IGK, IGL). In humans, it has been estimated theoretically that the diversity of TCR-αβ receptors exceeds 10^18 in the thymus, and the diversity of the B-cell repertoire is even larger, considering somatic hypermutation (Janeway 2005; Benichou et al. 2012). The T- and B-cell repertoire can undergo dynamic changes under different phenotypic states. Recently, deep sequencing enabled by different platforms, including Roche 454 and Illumina HiSeq (Freeman et al. 2009; Robins et al. 2009; Wang et al. 2010; Fischer 2011; Venturi et al. 2011), has been applied to unravel the dynamics of the TCR and BCR repertoire and extended to various translational applications such as vaccination, cancer, and autoimmune diseases.

Several tools and software packages have been developed for TCR and BCR sequence analysis, including iHMMune-align (Gaeta et al. 2007), HighV-QUEST (Li et al. 2013), IgBLA...


PrecisionFDA Truth Challenge V2: Calling variants from short and long reads in difficult-to-map regions

et al. 2022



Whole-genome sequence variation, population structure and demographic history of the Dutch population
