Genotype imputation is widely used as a cost-effective strategy in genomic evaluation of cattle. Key determinants of imputation accuracy, such as linkage disequilibrium patterns, marker densities, and ascertainment bias, differ between Bos indicus and Bos taurus breeds. Consequently, there is a need to investigate the effectiveness of genotype imputation in indicine breeds. Thus, the objective of the study was to investigate strategies and factors affecting the accuracy of genotype imputation in Gyr (Bos indicus) dairy cattle. Four imputation scenarios were studied using 471 sires and 1,644 dams genotyped on Illumina BovineHD (HD-777K; San Diego, CA) and BovineSNP50 (50K) chips, respectively. Scenarios were based on which reference high-density single nucleotide polymorphism (SNP) panel (HDP) should be adopted [HD-777K, 50K, and GeneSeek GGP-75Ki (Lincoln, NE)]. Depending on the scenario, validation animals had their genotypes masked for one of the lower-density panels: Illumina (3K, 7K, and 50K) and GeneSeek (SGGP-20Ki and GGP-75Ki). We randomly selected 171 sires as reference and 300 as validation for all the scenarios. Additionally, all sires were used as reference and the 1,644 dams were imputed for validation. Genotypes of 98 individuals with 4 or more offspring were completely masked and imputed. The imputation algorithms FImpute, Beagle v3.3, and Beagle v4 were used. Imputation accuracies were measured using the correlation and allelic correct rate. FImpute resulted in the highest accuracies, whereas Beagle v3.3 gave the least-accurate imputations. Accuracies evaluated as correlation (allelic correct rate) ranged from 0.910 (0.942) to 0.961 (0.974) using 50K as HDP and with 3K (7K) as low-density panels. With GGP-75Ki as HDP, accuracies were moderate for 3K, 7K, and 50K, but high for SGGP-20Ki. The use of HD-777K as HDP resulted in accuracies of 0.888 (3K), 0.941 (7K), 0.980 (SGGP-20Ki), 0.982 (50K), and 0.993 (GGP-75Ki).
Ungenotyped individuals were imputed with an average accuracy of 0.970. The average of the top 5 kinship coefficients between reference and imputed individuals was a strong predictor of imputation accuracy. FImpute was faster and used less memory than Beagle v4. Beagle v4 outperformed Beagle v3.3 in accuracy and speed of computation. A genotyping strategy that uses the HD-777K SNP chip as a reference panel and SGGP-20Ki as the lower-density SNP panel should be adopted, as accuracy was high and similar to that of the 50K. However, the effect of using imputed HD-777K genotypes from the SGGP-20Ki on genomic evaluation is yet to be studied.
Summary: Together with their sister subspecies Bos taurus, zebu cattle (Bos indicus) have contributed to important socioeconomic changes that have shaped modern civilizations. Zebu cattle were domesticated in the Indus Valley 8000 years before present (YBP). From the domestication site, they expanded to Africa, East Asia, southwestern Asia and Europe between 4000 and 1300 YBP, intercrossing with B. taurus to form clinal variations of zebu ancestry across the landmass of Afro‐Eurasia. In the past 150 years, zebu cattle reached the Americas and Oceania, where they have contributed to the prosperity of emerging economies. The zebu genome is characterized by two mitochondrial haplogroups (I1 and I2), one Y chromosome haplogroup (Y3) and three major autosomal ancestral groups (Indian‐Pakistani, African and Chinese). Phenotypically, zebu animals are recognized by their hump, large ears and excess skin. They are rustic, resilient to parasites and capable of bearing the hot and humid climates of the tropics. Many resources are available to study the zebu genome, including commercial arrays of SNP, reference assemblies and publicly available genotypes and whole‐genome sequences. Nevertheless, many of these resources were initially developed to support research and subsidize industrial applications in B. taurus, and therefore they can produce bias in data analysis. The combination of genomics with precision agriculture holds great promise for the identification of genetic variants affecting economically important traits such as tick resistance and heat tolerance, which were naturally selected for millennia and played a major role in the evolution of B. indicus cattle.
Background: Ending the COVID-19 pandemic is arguably one of the most prominent challenges in recent human history. Closely following the growth dynamics of the disease is one of the pillars toward achieving that goal. Objective: We aimed at developing a simple framework to facilitate the analysis of the growth rate (cases/day) and growth acceleration (cases/day²) of COVID-19 cases in real-time. Methods: The framework was built using the Moving Regression (MR) technique and a Hidden Markov Model (HMM). The dynamics of the pandemic were initially modeled via combinations of four different growth stages: lagging (beginning of the outbreak), exponential (rapid growth), deceleration (growth decay), and stationary (near zero growth). A fifth growth behavior, namely linear growth (constant growth above zero), was further introduced to add more flexibility to the framework. An R Shiny application was developed, which can be accessed at https://theguarani.com.br/ or downloaded from https://github.com/adamtaiti/SARS-CoV-2. The framework was applied to data from the European Center for Disease Prevention and Control (ECDC), which comprised 3,722,128 cases reported worldwide as of May 8th 2020. Results: We found that the impact of public health measures on the prevalence of COVID-19 could be perceived in seemingly real-time by monitoring growth acceleration curves. Restriction to human mobility produced detectable decline in growth acceleration within 1 week, deceleration within ∼2 weeks and near-stationary growth within ∼6 weeks. Countries exhibiting different permutations of the five growth stages indicated that the evolution of COVID-19 prevalence is more complex and dynamic than previously appreciated. Conclusions: These results corroborate that mass social isolation is a highly effective measure against the dissemination of SARS-CoV-2, as previously suggested. Apart from the analysis of prevalence partitioned by country, the proposed framework is easily applicable to city, state, region and arbitrary territory data, serving as an asset to monitor the local behavior of COVID-19 cases.
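The abstract does not spell out how Moving Regression yields growth rate and acceleration. One plausible reading, offered here only as a sketch and not as the authors' implementation, is a local quadratic least-squares fit in a sliding window over cumulative case counts: the fitted slope at the window centre estimates the growth rate (cases/day) and twice the quadratic coefficient estimates the acceleration (cases/day²). The function name `moving_quadratic`, the `half_window` parameter, and the toy series are assumptions; the HMM stage classification is not shown.

```python
def moving_quadratic(cumulative, half_window=3):
    """Estimate growth rate and acceleration by a local quadratic fit.

    cumulative: cumulative case counts, one value per day.
    Returns two lists (rate, acceleration); entries are None where the
    centred window does not fit inside the series.
    """
    if half_window < 1:
        raise ValueError("half_window must be at least 1")
    xs = range(-half_window, half_window + 1)  # days relative to window centre
    n = 2 * half_window + 1
    sx2 = sum(x * x for x in xs)
    sx4 = sum(x ** 4 for x in xs)

    rate = [None] * len(cumulative)
    accel = [None] * len(cumulative)
    for t in range(half_window, len(cumulative) - half_window):
        window = cumulative[t - half_window : t + half_window + 1]
        sy = sum(window)
        sxy = sum(x * y for x, y in zip(xs, window))
        sx2y = sum(x * x * y for x, y in zip(xs, window))
        # Closed-form least squares for y = a + b*x + c*x^2 on a symmetric
        # window, where sums of odd powers of x vanish.
        slope = sxy / sx2
        curvature = (n * sx2y - sx2 * sy) / (n * sx4 - sx2 ** 2)
        rate[t] = slope            # first derivative at the window centre
        accel[t] = 2 * curvature   # second derivative at the window centre
    return rate, accel


# Toy series (hypothetical): cumulative cases growing as t^2.
cumulative = [t * t for t in range(11)]
rate, accel = moving_quadratic(cumulative, half_window=2)
print(rate[5], accel[5])
```

A wider window smooths reporting noise at the cost of reacting more slowly to real changes, which matters when the goal is near real-time monitoring.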
Background: Misassembly signatures, created by shuffling the order of sequences while assembling a genome, can be detected by the unexpected behavior of marker linkage disequilibrium (LD) decay. We developed a heuristic process to identify misassembly signatures, applied it to the bovine reference genome assembly (UMDv3.1) and presented the consequences of misassemblies in two case studies. Results: We identified 2,906 single nucleotide polymorphism (SNP) markers presenting unexpected LD decay behavior in 626 putative misassembled contigs, which comprised less than 1% of the whole genome. Although this represents a small fraction of the reference sequence, these poorly assembled segments can have severe implications for local genome context. For instance, we showed that one of the misassembled regions mapped to the POLL locus, which affected the annotation of positional candidate genes in a GWAS case study for polledness in Nellore (Bos indicus beef cattle). Additionally, we found that poorly performing markers in imputation mapped to putative misassembled regions, and that correction of marker positions based on LD was able to recover imputation accuracy. Conclusions: This heuristic approach can be useful to cross-validate reference assemblies and to filter out markers located in low-confidence genomic regions before conducting downstream analyses. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-3049-8) contains supplementary material, which is available to authorized users.
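The idea of "unexpected LD decay" can be illustrated with a toy rule. LD is normally strongest between physically close markers, so a marker whose strongest LD partner sits far away in map order, and beats both adjacent neighbours, is suspicious. The sketch below is not the authors' heuristic, whose details the abstract does not give. It assumes genotypes coded 0/1/2, uses the squared genotype correlation as an LD proxy, and uses hypothetical names (`genotype_r2`, `flag_unexpected_ld`, `min_gap`).

```python
def genotype_r2(a, b):
    """Squared Pearson correlation between two genotype vectors (LD proxy)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # monomorphic marker: treat LD as uninformative
    return cov * cov / (var_a * var_b)


def flag_unexpected_ld(genotypes, min_gap=2):
    """Flag markers whose best LD partner is distant rather than adjacent.

    genotypes: list of per-marker genotype vectors, in assembly map order.
    A marker is flagged when some marker at least min_gap positions away
    shows higher r2 with it than either of its adjacent neighbours does.
    """
    m = len(genotypes)
    flagged = []
    for i in range(m):
        r2 = [genotype_r2(genotypes[i], genotypes[j]) for j in range(m)]
        neighbours = [r2[j] for j in (i - 1, i + 1) if 0 <= j < m]
        distant = [r2[j] for j in range(m) if abs(j - i) >= min_gap]
        if neighbours and distant and max(distant) > max(neighbours):
            flagged.append(i)
    return flagged


# Toy map (hypothetical): markers 0 and 4 carry the same genotype pattern,
# while markers 1 to 3 share a different one.
x = [0, 1, 2, 0, 1, 2]
y = [0, 0, 1, 1, 2, 2]
genotypes = [y, x, x, x, y]
print(flag_unexpected_ld(genotypes))
```

In this toy case the rule flags both end markers, because it can tell that they belong together but not which of the two is misplaced; resolving that would need additional evidence, such as LD with markers further along the map.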
Genomic selection may accelerate genetic progress in breeding programs of indicine breeds when compared with traditional selection methods. We present results of genomic predictions in Gyr (Bos indicus) dairy cattle of Brazil for milk yield (MY), fat yield (FY), protein yield (PY), and age at first calving using information from bulls and cows. Four different single nucleotide polymorphism (SNP) chips were studied. Additionally, the effect of the use of imputed data on genomic prediction accuracy was studied. A total of 474 bulls and 1,688 cows were genotyped with the Illumina BovineHD (HD; San Diego, CA) and BovineSNP50 (50K) chip, respectively. Genotypes of cows were imputed to HD using FImpute v2.2. After quality check of data, 496,606 markers remained. The HD markers present on the GeneSeek SGGP-20Ki (15,727; Lincoln, NE), 50K (22,152), and GeneSeek GGP-75Ki (65,018) were subset and used to assess the effect of lower SNP density on accuracy of prediction. Deregressed breeding values were used as pseudophenotypes for model training. Data were split into reference and validation to mimic a forward prediction scheme. The reference population consisted of animals whose birth year was ≤2004 and consisted of either only bulls (TR1) or a combination of bulls and dams (TR2), whereas the validation set consisted of younger bulls (born after 2004). Genomic BLUP was used to estimate genomic breeding values (GEBV) and reliability of GEBV (R) was based on the prediction error variance approach. Reliability of GEBV ranged from ∼0.46 (FY and PY) to 0.56 (MY) with TR1 and from 0.51 (PY) to 0.65 (MY) with TR2. When averaged across all traits, R were substantially higher (R of TR1 = 0.50 and TR2 = 0.57) compared with reliabilities of parent averages (0.35) computed from pedigree data and based on diagonals of the coefficient matrix (prediction error variance approach). 
Reliability was similar for all 4 marker panels using either TR1 or TR2, except that the imputed HD cow data set led to an inflation of reliability. A reduced panel of ∼15K markers resulted in reliabilities similar to using HD markers. Reliability of GEBV could be increased by enlarging the limited bull reference population with cow information.