Given the drawbacks of implementing multivariate analysis for mapping multiple traits in genome-wide association studies (GWAS), principal component analysis (PCA) has been widely used to generate independent 'super traits' from the original multivariate phenotypic traits for univariate analysis. However, parameter estimates in this framework may differ from those obtained from the joint analysis of all traits, leading to spurious linkage results. In this paper, we propose performing PCA on the residual covariance matrix instead of the phenotypic covariance matrix, and transforming the multiple traits into a group of pseudo principal components on that basis. PCA on the residual covariance matrix allows each pseudo principal component to be analyzed separately. In addition, all parameter estimates are equivalent, under a linear transformation, to those obtained from the joint multivariate analysis. Moreover, a fast least absolute shrinkage and selection operator (LASSO) for estimating the sparse oversaturated genetic model greatly reduces the computational cost of this procedure. Extensive simulations show the statistical and computational efficiency of the proposed method. We illustrate the method with a GWAS of 20 slaughtering and meat quality traits in beef cattle.
Heredity (2014) 113, 526-532; doi:10.1038/hdy.2014.57; published online 2 July 2014
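As a rough illustration of the idea summarized in the abstract, the following sketch rotates simulated traits by the eigenvectors of an estimated residual covariance matrix, fits each pseudo principal component separately, and maps the estimates back to the original traits. It is a toy under stated assumptions, not the authors' implementation: ordinary least squares stands in for the fast LASSO, the data are simulated, and all names (`X`, `B`, `Sigma`, `ols`, `Y_star`, and so on) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, t = 500, 3, 4  # individuals, markers, traits (toy sizes, assumed)

# Simulate genotypes, marker effects, and correlated residuals (all assumed).
X = rng.integers(0, 3, size=(n, p)).astype(float)  # genotype codes 0/1/2
B = rng.normal(size=(p, t))                        # true marker effects
A = rng.normal(size=(t, t))
Sigma = A @ A.T + np.eye(t)                        # residual covariance
E = rng.multivariate_normal(np.zeros(t), Sigma, size=n)
Y = X @ B + E                                      # n x t trait matrix


def ols(X, Y):
    """Least-squares coefficients; stands in for the paper's LASSO step."""
    return np.linalg.lstsq(X, Y, rcond=None)[0]


# Joint fit on the original traits and the residual covariance estimate.
B_joint = ols(X, Y)
R = Y - X @ B_joint
S = R.T @ R / (n - p)

# PCA of the residual covariance: S = U diag(eigvals) U^T.
eigvals, U = np.linalg.eigh(S)

# Pseudo principal components, each analyzed on its own.
Y_star = Y @ U
B_star = np.column_stack([ols(X, Y_star[:, k]) for k in range(t)])

# Residuals of the pseudo components are uncorrelated in this sample.
R_star = Y_star - X @ B_star
S_star = R_star.T @ R_star / (n - p)

# Back-transform to effects on the original traits.
B_back = B_star @ U.T
```

In this least-squares setting, `S_star` is diagonal up to rounding and `B_back` matches `B_joint`, which is the sense in which the separate analyses are equivalent to the joint one under a linear transformation. With a shrinkage estimator such as LASSO the correspondence need not be exact in the same way, so the sketch only shows the linear-algebra core.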
INTRODUCTION
With the advance of high-throughput genotyping technology, the paradigm of mapping quantitative trait loci (QTL) based on linkage analysis of sparse genetic markers has gradually shifted to genome-wide association studies (GWAS) based on many thousands of single-nucleotide polymorphisms (SNPs). At the same time, association studies tend to involve more than one quantitative trait or complex disease whose loci lie in different chromosomal regions, allowing the investigation of common genetic risk factors underlying multiple traits. Although these traits could be analyzed separately with a univariate genetic model, statistical methods and algorithms have been developed for simultaneously analyzing multiple