Many genomic analyses start by aligning sequencing reads to a linear reference genome. However, linear reference genomes are imperfect: they lack millions of bases of unknown relevance and cannot reflect the genetic diversity of populations. This makes reference-guided methods susceptible to reference-allele bias. To overcome such limitations, we build a pangenome from six reference-quality assemblies from taurine and indicine cattle as well as yak. The pangenome contains an additional 70,329,827 bases compared to the Bos taurus reference genome. Our multiassembly approach reveals 30 and 10.1 million bases private to yak and indicine cattle, respectively, and between 3.3 and 4.4 million bases unique to each taurine assembly. Utilizing transcriptomes from 56 cattle, we show that these nonreference sequences encode transcripts that hitherto remained undetected from the B. taurus reference genome. We uncover genes, primarily encoding proteins contributing to immune response and pathogen-mediated immunomodulation, that are differentially expressed between Mycobacterium bovis–infected and noninfected cattle and are also undetectable in the B. taurus reference genome. Using whole-genome sequencing data of cattle from five breeds, we show that reads that were previously misaligned against the B. taurus reference genome now align accurately to the pangenome sequences. This enables us to discover 83,250 polymorphic sites that segregate within and between breeds of cattle and capture genetic differentiation across breeds. Our work makes a so-far unused source of variation amenable to genetic investigations and provides methods and a framework for establishing and exploiting a more diverse reference genome.
Background: Genotyping of sequence variants typically involves, as a first step, the alignment of sequencing reads to a linear reference genome. Because a linear reference genome represents only a small fraction of all the DNA sequence variation within a species, reference allele bias may occur at highly polymorphic or divergent regions of the genome. Graph-based methods facilitate the comparison of sequencing reads to a variation-aware genome graph, which incorporates a collection of non-redundant DNA sequences that segregate within a species. Using whole-genome sequencing data from 49 Original Braunvieh cattle, we compared the accuracy and sensitivity of graph-based sequence variant genotyping with the Graphtyper software to two widely used methods that rely on linear reference genomes, i.e., GATK and SAMtools. Results: We discovered 21,140,196, 20,262,913, and 20,668,459 polymorphic sites using GATK, Graphtyper, and SAMtools, respectively. Comparisons between sequence variant genotypes and microarray-derived genotypes showed that Graphtyper outperformed both GATK and SAMtools in terms of genotype concordance, non-reference sensitivity, and non-reference discrepancy. The sequence variant genotypes that were obtained using Graphtyper had the smallest number of Mendelian inconsistencies for both sequence-derived single nucleotide polymorphisms and indels in nine sire-son pairs. Genotype phasing and imputation using the Beagle software improved the quality of the sequence variant genotypes for all the tools evaluated, particularly for animals that were sequenced at low coverage. Following imputation, the concordance between sequence- and microarray-derived genotypes was almost identical for the three methods evaluated, i.e., 99.32, 99.46, and 99.24% for GATK, Graphtyper, and SAMtools, respectively. Variant filtration based on commonly used criteria improved genotype concordance slightly but also decreased sensitivity.
Graphtyper required considerably more computing resources than SAMtools but less than GATK. Conclusions: Sequence variant genotyping using Graphtyper is accurate, sensitive and computationally feasible in cattle. Graph-based methods enable sequence variant genotyping from variation-aware reference genomes that may incorporate cohort-specific sequence variants, which is not possible with the current implementation of state-of-the-art methods that rely on linear reference genomes.
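The abstract above names three comparison metrics without defining them. The following is a minimal sketch of how they are commonly computed when sequence-derived genotypes are compared against microarray genotypes treated as the truth set. The definitions, the 0/1/2 alternate-allele coding, and the function name `genotype_metrics` are assumptions for illustration and are not taken from the study itself.

```python
from typing import Dict, Sequence


def genotype_metrics(array_gt: Sequence[int], seq_gt: Sequence[int]) -> Dict[str, float]:
    """Compare sequence-derived genotypes against array genotypes.

    Genotypes are coded as the number of alternate alleles (0, 1 or 2).
    Assumed definitions (not confirmed by the abstract):
      concordance          = matching genotypes / all compared genotypes
      non-ref sensitivity  = non-ref in both / non-ref in the array
      non-ref discrepancy  = mismatches / (all genotypes - both hom-ref)
    """
    if len(array_gt) != len(seq_gt):
        raise ValueError("genotype vectors must have the same length")
    if not array_gt:
        raise ValueError("at least one genotype pair is required")

    pairs = list(zip(array_gt, seq_gt))
    total = len(pairs)
    matches = sum(a == s for a, s in pairs)
    both_hom_ref = sum(a == 0 and s == 0 for a, s in pairs)
    array_non_ref = sum(a > 0 for a, _ in pairs)
    both_non_ref = sum(a > 0 and s > 0 for a, s in pairs)
    informative = total - both_hom_ref  # pairs that are not hom-ref/hom-ref

    nan = float("nan")
    return {
        "concordance": matches / total,
        "non_ref_sensitivity": both_non_ref / array_non_ref if array_non_ref else nan,
        "non_ref_discrepancy": (total - matches) / informative if informative else nan,
    }


if __name__ == "__main__":
    # Toy genotypes for one animal at eight array positions (made-up values).
    array_calls = [0, 0, 1, 1, 2, 2, 0, 1]
    sequence_calls = [0, 0, 1, 2, 2, 0, 1, 1]
    print(genotype_metrics(array_calls, sequence_calls))
```

Excluding the hom-ref/hom-ref pairs from the discrepancy denominator keeps the large number of trivially concordant reference genotypes from masking errors at variant sites, which is why discrepancy is usually reported alongside plain concordance.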
Background: Autochthonous cattle breeds are an important source of genetic variation because they might carry alleles that enable them to adapt to local environment and food conditions. Original Braunvieh (OB) is a local cattle breed of Switzerland used for beef and milk production in alpine areas. Using whole-genome sequencing (WGS) data of 49 key ancestors, we characterize genomic diversity, genomic inbreeding, and signatures of selection in Swiss OB cattle at nucleotide resolution. Results: We annotated 15,722,811 SNPs and 1,580,878 indels, including 10,738 and 2,763 missense deleterious and high impact variants, respectively, that were discovered in 49 OB key ancestors. Six Mendelian trait-associated variants that were previously detected in breeds other than OB segregated in the sequenced key ancestors, including variants causal for recessive xanthinuria and albinism. The average nucleotide diversity (1.6 × 10⁻³) was higher in OB than in many mainstream European cattle breeds. Accordingly, the average genomic inbreeding derived from runs of homozygosity (ROH) was relatively low (F_ROH = 0.14) in the 49 OB key ancestor animals. However, genomic inbreeding was higher in OB cattle of more recent generations (F_ROH = 0.16) due to a higher number of long (> 1 Mb) runs of homozygosity. Using two complementary approaches, composite likelihood ratio test and integrated haplotype score, we identified 95 and 162 genomic regions encompassing 136 and 157 protein-coding genes, respectively, that showed evidence (P < 0.005) of past and ongoing selection. These selection signals were enriched for quantitative trait loci related to beef traits including meat quality, feed efficiency and body weight and pathways related to blood coagulation, nervous and sensory stimulus. Conclusions: We provide a comprehensive overview of sequence variation in Swiss OB cattle genomes. 
With WGS data, we observe higher genomic diversity and less inbreeding in OB than in many European mainstream cattle breeds. Footprints of selection were detected in genomic regions that are possibly relevant for meat quality and adaptation to local environmental conditions. Considering that the population size is low and genomic inbreeding increased in the past generations, the implementation of optimal mating strategies seems warranted to maintain genetic diversity in the Swiss OB cattle population.
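The ROH-based inbreeding coefficient reported above is conventionally the fraction of the autosomal genome covered by runs of homozygosity. A minimal sketch of that calculation follows, assuming non-overlapping ROH segments given as (chromosome, start, end) in base pairs. The function name `f_roh`, the segment coordinates and the genome length in the example are hypothetical and are not values from the study.

```python
from typing import Iterable, Tuple

Segment = Tuple[str, int, int]  # (chromosome, start_bp, end_bp), end exclusive


def f_roh(segments: Iterable[Segment], genome_length_bp: int, min_length_bp: int = 0) -> float:
    """Fraction of the genome covered by ROH of at least min_length_bp.

    Assumes the segments do not overlap; overlapping input would be
    counted twice and inflate the estimate.
    """
    if genome_length_bp <= 0:
        raise ValueError("genome_length_bp must be positive")

    covered = 0
    for _chrom, start, end in segments:
        if end < start:
            raise ValueError("segment end must not precede its start")
        length = end - start
        if length >= min_length_bp:
            covered += length
    return covered / genome_length_bp


if __name__ == "__main__":
    # Made-up ROH calls for one animal on a toy 25 Mb genome.
    toy_roh = [("1", 0, 2_000_000), ("2", 1_000_000, 1_500_000)]
    print(f_roh(toy_roh, 25_000_000))                           # all ROH
    print(f_roh(toy_roh, 25_000_000, min_length_bp=1_000_000))  # long ROH only
```

Restricting the sum to long segments, as with the `min_length_bp` argument here, isolates the contribution of long runs, which the abstract identifies as the reason for higher inbreeding in more recent generations.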
Cattle are ideally suited to investigate the genetics of male reproduction, because semen quality and fertility are recorded for all ejaculates of artificial insemination bulls. We analysed 26,090 ejaculates of 794 Brown Swiss bulls to assess ejaculate volume, sperm concentration, sperm motility, sperm head and tail anomalies and insemination success. The heritability of the six semen traits was between 0 and 0.26. Genome-wide association testing on 607,511 SNPs revealed a QTL on bovine chromosome 6 that was associated with sperm motility (P = 2.5 × 10⁻²⁷), head (P = 2.0 × 10⁻⁴⁴) and tail anomalies (P = 7.2 × 10⁻⁴⁹) and insemination success (P = 9.9 × 10⁻¹³). The QTL harbors a recessive allele that compromises semen quality and male fertility. We replicated the effect of the QTL on fertility (P = 7.1 × 10⁻³²) in an independent cohort of 2,481 Brown Swiss bulls. The analysis of whole-genome sequencing data revealed that a synonymous variant (BTA6:58373887C>T, rs474302732) in WDR19 encoding WD repeat-containing protein 19 was in linkage disequilibrium with the fertility-associated haplotype. WD repeat-containing protein 19 is a constituent of the intraflagellar transport complex that is essential for the physiological function of motile cilia and flagella. Bioinformatic and transcription analyses revealed that the BTA6:58373887 T-allele activates a cryptic exonic splice site that eliminates three evolutionarily conserved amino acids from WDR19. Western blot analysis demonstrated that the BTA6:58373887 T-allele decreases protein expression. We make the remarkable observation that, in spite of negative effects on semen quality and bull fertility, the BTA6:58373887 T-allele has a frequency of 24% in the Brown Swiss population. Our findings are the first to uncover a variant that is associated with quantitative variation in semen quality and male fertility in cattle.