Highlights
• Slow delivery immunization enhances HIV neutralizing antibody development in monkeys
• Slow delivery immunization alters immunodominance of the responding B cells
• Weekly longitudinal germinal center (GC) B and TFH analyses provide new GC insights
• High-resolution rhesus immunoglobulin locus genomic reference sequence
An incomplete ascertainment of genetic variation within the highly polymorphic immunoglobulin heavy chain locus (IGH) has hindered our ability to define genetic factors that influence antibody- and B cell-mediated processes. To date, methods for locus-wide genotyping of all IGH variant types do not exist. Here, we combine targeted long-read sequencing with a novel bioinformatics tool, IGenotyper, to fully characterize genetic variation within IGH in a haplotype-specific manner. We apply this approach to eight human samples, including a haploid cell line and two mother-father-child trios, and demonstrate the ability to generate high-quality assemblies (>98% complete and >99% accurate), genotypes, and gene annotations, including 2 novel structural variants and 17 novel gene alleles. We show that multiplexing allows for scaling of the approach without impacting data quality, and that our genotype call sets are more accurate than short-read (>35% increase in true positives and >97% decrease in false positives) and array/imputation-based datasets. This framework establishes a foundation for leveraging IG genomic data to study population-level variation in the antibody response.
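The true-positive and false-positive comparison mentioned above can be illustrated with a minimal sketch. It assumes variants are keyed as (position, ref, alt) tuples and that a trusted truth set is available. The function name `compare_call_sets` and all variants below are hypothetical, not taken from IGenotyper or from the study's data.

```python
def compare_call_sets(truth, calls):
    """Count true positives, false positives, and false negatives.

    Both arguments are iterables of hashable variant keys, here
    (position, ref, alt) tuples. This keying is an assumption made
    for illustration only.
    """
    truth, calls = set(truth), set(calls)
    return {
        "tp": len(calls & truth),   # called and present in the truth set
        "fp": len(calls - truth),   # called but absent from the truth set
        "fn": len(truth - calls),   # present in the truth set but not called
    }


# Hypothetical toy variants; none of these come from the paper.
truth = {(100, "A", "G"), (250, "C", "T"), (400, "G", "A"), (910, "T", "C")}
short_read = {(100, "A", "G"), (250, "C", "T"), (300, "T", "C"), (777, "G", "C")}
long_read = {(100, "A", "G"), (250, "C", "T"), (400, "G", "A")}

short_stats = compare_call_sets(truth, short_read)
long_stats = compare_call_sets(truth, long_read)
print("short-read:", short_stats)
print("long-read: ", long_stats)
```

Exact-match comparison is the simplest possible choice; real benchmarking tools usually also normalize variant representation before comparing, which this sketch does not attempt.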
The contribution of heritable factors to antibody function and diversity is not fully understood, but has profound implications for delineating variation in the antibody response observed at the population level. We performed matched long-read-based characterization of the immunoglobulin heavy chain (IGH) locus and expressed antibody repertoire profiling at population scale to examine, for the first time, the impact of IGH genomic variation on the antibody repertoire. We characterized extensive IGH polymorphism, including novel structural variants (SVs), small insertions/deletions (indels), single nucleotide variants (SNVs), and IG genes and alleles. Countering models that antibody repertoire diversity is driven largely by stochastic processes, we demonstrate that IGH genetic factors make significant contributions to gene usage in both the naive and antigen-experienced repertoire. Specifically, the usage of 73% of IGH genes was associated with common polymorphisms, including those capable of explaining >70% of variance in gene usage. These variants were enriched in transcription factor binding sites and other functional elements associated with V(D)J recombination, and overlapped polymorphisms from genome-wide association studies. Furthermore, we found evidence for the coordinated regulation of IGH genes across the repertoire, demonstrating complex interactions between IGH variants and gene usage. These results refine our understanding of variation observed in the antibody repertoire, and will advance the study of antibody function in disease.
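As a rough illustration of what "variance in gene usage explained" by a polymorphism can mean, the sketch below computes the R² of a simple linear regression of gene usage on allele dosage (0, 1, or 2 copies of the alternate allele). The abstract does not state the statistical model used, so this additive model is an assumption, and the function name and toy values are hypothetical.

```python
def variance_explained(genotypes, usage):
    """R^2 of a simple linear regression of usage on genotype dosage.

    For a single predictor, R^2 equals the squared Pearson correlation.
    Returns 0.0 when either variable has no variance.
    """
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_u = sum(usage) / n
    cov = sum((g - mean_g) * (u - mean_u) for g, u in zip(genotypes, usage))
    var_g = sum((g - mean_g) ** 2 for g in genotypes)
    var_u = sum((u - mean_u) ** 2 for u in usage)
    if var_g == 0 or var_u == 0:
        return 0.0
    return cov * cov / (var_g * var_u)


# Hypothetical toy data: allele dosage per individual and the percentage
# of the repertoire using one IGH gene. Not data from the study.
dosage = [0, 0, 1, 1, 2, 2]
gene_usage = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]

r2 = variance_explained(dosage, gene_usage)
print(f"variance explained: {r2:.3f}")
```

A value near 1 means usage tracks genotype almost perfectly in the toy data, and a value near 0 means genotype carries little information about usage.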
Immunoglobulins (IGs), crucial components of the adaptive immune system, are encoded by three genomic loci. However, the complexity of the IG loci severely limits the effective use of short-read sequencing, limiting our knowledge of population diversity in these loci. We leveraged existing long-read whole-genome sequencing (WGS) data, fosmid technology, and IG-targeted single-molecule, real-time (SMRT) long-read sequencing (IG-Cap) to create haplotype-resolved assemblies of the IG lambda (IGL) locus from 6 ethnically diverse individuals. In addition, we generated 10 diploid assemblies of IGL from a diverse cohort of individuals utilizing IG-Cap. From these 16 individuals, we identified significant allelic diversity, including 36 novel IGLV alleles. In addition, we observed highly elevated single nucleotide variation (SNV) in IGLV genes relative to IGL intergenic and genomic background SNV density. By comparing SNV calls between our high-quality assemblies and existing short-read datasets from the same individuals, we show a high propensity for false positives in the short-read datasets. Finally, for the first time, we nucleotide-resolved common 5-10 Kb duplications in the IGLC region that contain functional IGLJ and IGLC genes. Together these data represent a significant advancement in our understanding of genetic variation and population diversity in the IGL locus.