Joint association analysis of multiple traits in a genome-wide association study (GWAS), i.e. a multivariate GWAS, offers several advantages over analyzing each trait in a separate GWAS. In this study we directly compared a number of multivariate GWAS methods using simulated data. We focused on six methods that are implemented in the software packages PLINK, SNPTEST, MultiPhen, BIMBAM, PCHAT and TATES, and also compared them to standard univariate GWAS, analysis of the first principal component of the traits, and meta-analysis of univariate results. We simulated data (N = 1000) for three quantitative traits and one bi-allelic quantitative trait locus (QTL), and varied the number of traits associated with the QTL (explained variance 0.1%), minor allele frequency of the QTL, residual correlation between the traits, and the sign of the correlation induced by the QTL relative to the residual correlation. We compared the power of the methods using empirically fixed significance thresholds (α = 0.05). Our results showed that the multivariate methods implemented in PLINK, SNPTEST, MultiPhen and BIMBAM performed best for the majority of the tested scenarios, with a notable increase in power for scenarios with an opposite sign of genetic and residual correlation. All multivariate analyses resulted in a higher power than univariate analyses, even when only one of the traits was associated with the QTL. Hence, use of multivariate GWAS methods can be recommended, even when genetic correlations between traits are weak.
This study investigated the effect on the reliability of genomic prediction when a small number of significant variants from single marker analysis based on whole genome sequence data were added to the regular 54k single nucleotide polymorphism (SNP) array data. The extra markers were selected with the aim of augmenting the custom low-density Illumina BovineLD SNP chip (San Diego, CA) used in the Nordic countries. The single-marker analysis was done breed-wise on all 16 index traits included in the breeding goals for Nordic Holstein, Danish Jersey, and Nordic Red cattle plus the total merit index itself. Depending on the trait's economic weight, 15, 10, or 5 quantitative trait loci (QTL) were selected per trait per breed and 3 to 5 markers were selected to tag each QTL. After removing duplicate markers (same marker selected for more than one trait or breed) and filtering for high pairwise linkage disequilibrium and assaying performance on the array, a total of 1,623 QTL markers were selected for inclusion on the custom chip. Genomic prediction analyses were performed for Nordic and French Holstein and Nordic Red animals using either a genomic BLUP or a Bayesian variable selection model. When using the genomic BLUP model including the QTL markers in the analysis, reliability was increased by up to 4 percentage points for production traits in Nordic Holstein animals, up to 3 percentage points for Nordic Reds, and up to 5 percentage points for French Holstein. Smaller gains of up to 1 percentage point was observed for mastitis, but only a 0.5 percentage point increase was seen for fertility. When using a Bayesian model accuracies were generally higher with only 54k data compared with the genomic BLUP approach, but increases in reliability were relatively smaller when QTL markers were included. Results from this study indicate that the reliability of genomic prediction can be increased by including markers significant in genome-wide association studies on whole genome sequence data alongside the 54k SNP set.
The application of Gibbs sampling is considered for inference in a mixed inheritance model in animal populations. Implementation of the Gibbs sampler on scalar components, as used for human populations, appeared not to be efficient, and an approach with blockwise sampling of genotypes was proposed for use in animal populations. The blockwise sampling of genotypes was proposed for use in animal populations. The blockwise sampling by which genotypes of a sire and its final progeny were sampled jointly was effective in improving mixing, although further improvements could be looked for. Posterior densities of parameters were visualised from Gibbs samples; from the former highly marginalised Bayesian point and interval estimates can be obtained.
Wheat breeding programs generate a large amount of variation which cannot be completely explored because of limited phenotyping throughput. Genomic prediction (GP) has been proposed as a new tool which provides breeding values estimations without the need of phenotyping all the material produced but only a subset of it named training population (TP). However, genotyping of all the accessions under analysis is needed and, therefore, optimizing TP dimension and genotyping strategy is pivotal to implement GP in commercial breeding schemes. Here, we explored the optimum TP size and we integrated pedigree records and genome wide association studies (GWAS) results to optimize the genotyping strategy. A total of 988 advanced wheat breeding lines were genotyped with the Illumina 15K SNPs wheat chip and phenotyped across several years and locations for yield, lodging, and starch content. Cross-validation using the largest possible TP size and all the SNPs available after editing (~11k), yielded predictive abilities (rGP) ranging between 0.5–0.6. In order to explore the Training population size, rGP were computed using progressively smaller TP. These exercises showed that TP of around 700 lines were enough to yield the highest observed rGP. Moreover, rGP were calculated by randomly reducing the SNPs number. This showed that around 1K markers were enough to reach the highest observed rGP. GWAS was used to identify markers associated with the traits analyzed. A GWAS-based selection of SNPs resulted in increased rGP when compared with random selection and few hundreds SNPs were sufficient to obtain the highest observed rGP. For each of these scenarios, advantages of adding the pedigree information were shown. Our results indicate that moderate TP sizes were enough to yield high rGP and that pedigree information and GWAS results can be used to greatly optimize the genotyping strategy.
Various models have been used for genomic prediction. Bayesian variable selection models often predict more accurate genomic breeding values than genomic BLUP (GBLUP), but GBLUP is generally preferred for routine genomic evaluations because of low computational demand. The objective of this study was to achieve the benefits of both models using results from Bayesian models and genome-wide association studies as weights on single nucleotide polymorphism (SNP) markers when constructing the genomic matrix (G-matrix) for genomic prediction. The data comprised 5,221 progeny-tested bulls from the Nordic Holstein population. The animals were genotyped using the Illumina Bovine SNP50 BeadChip (Illumina Inc., San Diego, CA). Weighting factors in this investigation were the posterior SNP variance, the square of the posterior SNP effect, and the corresponding minus base-10 logarithm of the marker association P-value [-log10(P)] of a t-test obtained from the analysis using a Bayesian mixture model with 4 normal distributions, the square of the estimated SNP effect, and the corresponding -log10(P) of a t-test obtained from the analysis using a classical genome-wide association study model (linear regression model). The weights were derived from the analysis based on data sets that were 0, 1, 3, or 5 yr before performing genomic prediction. In building a G-matrix, the weights were assigned either to each marker (single-marker weighting) or to each group of approximately 5 to 150 markers (group-marker weighting). The analysis was carried out for milk yield, fat yield, protein yield, fertility, and mastitis. Deregressed proofs (DRP) were used as response variables to predict genomic estimated breeding values (GEBV). Averaging over the 5 traits, the Bayesian model led to 2.0% higher reliability of GEBV than the GBLUP model with an original unweighted G-matrix. The superiority of using a GBLUP with weighted G-matrix over GBLUP with an original unweighted G-matrix was the largest when using a weighting factor of posterior variance, resulting in 1.7 percentage points higher reliability. The second best weighting factors were -log10 (P-value) of a t-test corresponding to the square of the posterior SNP effect from the Bayesian model and -log10 (P-value) of a t-test corresponding to the square of the estimated SNP effect from the linear regression model, followed by the square of estimated SNP effect and the square of the posterior SNP effect. In addition, group-marker weighting performed better than single-marker weighting in terms of reducing bias of GEBV, and also slightly increased prediction reliability. The differences between weighting factors and scenarios were larger in prediction bias than in prediction accuracy. Finally, weights derived from a data set having a lag up to 3 yr did not reduce reliability of GEBV. The results indicate that posterior SNP variance estimated from a Bayesian mixture model is a good alternative weighting factor, and common weights on group markers with a size of 30 markers is a good strategy when u...
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.