The availability of whole genome sequence (WGS) data has made it possible to discover protein variants in silico. However, existing bovine WGS databases do not show data in a form conducive to protein variant analysis, and tend to underrepresent the breadth of genetic diversity in U.S. beef cattle. Thus, our first aim was to use 96 beef sires, sharing minimal pedigree relationships, to create a searchable and publicly viewable set of mapped genomes relevant for 19 popular breeds of U.S. cattle. Our second aim was to identify protein variants encoded by the bovine endothelial PAS domain-containing protein 1 gene (EPAS1), a gene associated with high-altitude pulmonary hypertension in Angus cattle. The identity and quality of genomic sequences were verified by comparing WGS genotypes to those derived from other methods. The average read depth, genotype scoring rate, and genotype accuracy exceeded 14, 99%, and 99%, respectively. The 96 genomes were used to discover four amino acid variants encoded by EPAS1 (E270Q, P362L, A671G, and L701F) and confirm two variants previously associated with disease (A606T and G610S). The six EPAS1 missense mutations were verified with matrix-assisted laser desorption/ionization time-of-flight mass spectrometry assays, and their frequencies were estimated in a separate collection of 1154 U.S. cattle representing 46 breeds. A rooted phylogenetic tree of eight polypeptide sequences provided a framework for evaluating the likely order of mutations and potential impact of EPAS1 alleles on the adaptive response to chronic hypoxia in U.S. cattle. This public, whole genome resource facilitates in silico identification of protein variants in diverse types of U.S. beef cattle, and provides a means of translating WGS data into a practical biological and evolutionary context for generating and testing hypotheses.
Background: Access to sheep genome sequences significantly improves the chances of identifying genes that may influence the health, welfare, and productivity of these animals. Methods: A public, searchable DNA sequence resource for U.S. sheep was created with whole genome sequence (WGS) of 96 rams. The animals shared minimal pedigree relationships and represent nine popular U.S. breeds and a composite line. The genomes are viewable online with the user-friendly Integrated Genome Viewer environment, and may be used to identify and decode gene variants present in U.S. sheep. Results: The genomes had a combined average read depth of 16, and an average WGS genotype scoring rate and accuracy exceeding 99%. The utility of this resource was illustrated by characterizing three genes with 14 known coding variants affecting litter size in global sheep populations: growth and differentiation factor 9 (GDF9), bone morphogenetic protein 15 (BMP15), and bone morphogenetic protein receptor 1B (BMPR1B). In the 96 U.S. rams, nine missense variants encoding 11 protein variants were identified. However, only one was previously reported to affect litter size (GDF9 V371M, Finnsheep). Two missense variants in BMP15 were identified that had not previously been reported: R67Q in Dorset, and L252P in Dorper and White Dorper breeds. Also, two novel missense variants were identified in BMPR1B: M64I in Katahdin, and T345N in Romanov and Finnsheep breeds. Based on the strict conservation of amino acid residues across placental mammals, the four variants encoded by BMP15 and BMPR1B are predicted to interfere with their function. However, preliminary analyses of litter sizes in small samples of daughters of these rams did not reveal a correlation between litter size and variants in BMP15 and BMPR1B. Conclusions: Collectively, this report describes a new resource for discovering protein variants in silico and identifies alleles for further testing of their effects on litter size in U.S. breeds.