Advances in high-throughput sequencing technologies have dramatically reduced the cost of genotyping and led to genomic prediction being widely used in animal and plant breeding, and increasingly in human genetics. Inspired by the computational efficiency of linear mixed models and the prediction accuracy of Bayesian methods, we propose KAML, a machine learning-based method that incorporates cross-validation, multiple regression, grid search, and bisection algorithms and aims to combine high prediction accuracy with computing efficiency. KAML exhibits higher prediction accuracy than existing methods, and it is available at https://github.com/YinLiLin/KAML.
Summary: Pigs are one of the earliest domesticated animals, and multiple breeds have been developed to meet the various demands of consumers. EigenGWAS is a novel strategy for identifying candidate genes that underlie population genetic differences and for inferring candidate regions under selection. In this study, EigenGWAS and Fst analyses were performed using public re-sequencing data from three typical commercial pig breeds: Duroc, Landrace and Yorkshire. The intersection of the genome-wide significant SNPs detected by EigenGWAS and the top 1% of SNPs ranked by Fst was treated as the set of signals under selection. Using the data of all three breeds, 3062 signals under selection were detected, and the nearby genomic regions within 300 kb upstream and downstream covered 6.54% of the whole genome. Pairs of breeds were also analysed, together with pathway analysis. The gene function enrichment results indicated that many candidate genes located in the genomic regions of the signals under selection were associated with biological processes related to growth, metabolism, reproduction, sensory perception, etc. Among the candidate genes, the FSHB, AHR, PTHLH, KDR and FST genes were reported to be associated with reproductive performance; the KIT, KITLG, MITF, MC1R and EDNRB genes were previously identified as affecting coat colour; and the RETREG1, TXNIP, BMP5, PPARD and RBP4 genes were reported to be associated with lipid metabolism and growth traits. The identified genetic differences across the three commercial breeds will advance understanding of the artificial selection history of pigs, and the signals under selection suggest potential uses in pig genomic breeding programmes.
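The signal-calling rule described in this summary (significant in EigenGWAS and in the top 1% by Fst, then a 300 kb flank on each side) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the dictionary-based inputs and the `p_threshold` argument are assumptions made for illustration, and the summary does not state which genome-wide significance threshold was used.

```python
from typing import Dict, List, Set, Tuple


def top_fraction(scores: Dict[str, float], fraction: float = 0.01) -> Set[str]:
    """Return the SNP ids with the highest scores (top `fraction`, at least one)."""
    n_keep = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n_keep])


def selection_signals(
    eigengwas_p: Dict[str, float],
    fst: Dict[str, float],
    p_threshold: float,  # hypothetical: the actual threshold is not given in the text
    fraction: float = 0.01,
) -> Set[str]:
    """SNPs that are both EigenGWAS-significant and in the top Fst fraction."""
    significant = {snp for snp, p in eigengwas_p.items() if p < p_threshold}
    return significant & top_fraction(fst, fraction)


def flanking_regions(
    positions: Dict[str, Tuple[str, int]],
    signals: Set[str],
    flank: int = 300_000,
) -> Dict[str, List[Tuple[int, int]]]:
    """Build +/- `flank` bp windows around each signal and merge overlaps per chromosome."""
    by_chrom: Dict[str, List[Tuple[int, int]]] = {}
    for snp in signals:
        chrom, pos = positions[snp]
        by_chrom.setdefault(chrom, []).append((max(0, pos - flank), pos + flank))

    merged: Dict[str, List[Tuple[int, int]]] = {}
    for chrom, windows in by_chrom.items():
        windows.sort()
        out = [list(windows[0])]
        for start, end in windows[1:]:
            if start <= out[-1][1]:
                # Overlapping or touching window: extend the current region.
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = [(s, e) for s, e in out]
    return merged
```

Summing the lengths of the merged regions and dividing by the genome length would give a coverage figure of the kind reported above; merging first avoids counting overlapping flanks twice.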
Human diseases and agricultural traits can be predicted by modeling a random polygenic effect in linear mixed models. Estimating the variance components and predicting the random effects of the model efficiently with limited computational resources has always been a primary concern, especially as the scale of genotype data increases in the current genomic era. Here, we thoroughly reviewed the development history of statistical algorithms used in genetic evaluation and theoretically compared their computational complexity and applicability for different data scenarios. Most importantly, we presented a computationally efficient, functionally enriched, multi-platform and user-friendly software package named ‘HIBLUP’ to address the challenges currently faced when using big genomic data. Powered by advanced algorithms, elaborate design and efficient programming, HIBLUP ran fastest and used the least memory in our analyses, and the greater the number of genotyped individuals, the greater the computational benefit from HIBLUP. We also demonstrated that HIBLUP is the only tool that can accomplish the analyses for a UK Biobank-scale dataset within 1 h using the proposed efficient ‘HE + PCG’ strategy. It is foreseeable that HIBLUP will facilitate genetic research for humans, plants and animals. The HIBLUP software and user manual can be accessed freely at https://www.hiblup.com.
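The abstract does not expand ‘HE + PCG’. Assuming ‘HE’ stands for Haseman-Elston regression (a moment-based estimator of variance components) and ‘PCG’ for the preconditioned conjugate gradient method (an iterative solver for symmetric positive-definite systems), the two building blocks can be sketched in plain Python as follows. This is a toy illustration on small dense lists, not HIBLUP's implementation; the function names and the choice of a Jacobi (diagonal) preconditioner are assumptions.

```python
from math import sqrt
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def he_regression(y: Vector, K: Matrix) -> float:
    """Slope of the least-squares regression of y_i * y_j on K_ij over pairs i < j.

    With y centred and scaled to unit variance and K a genomic relationship
    matrix, this slope is a moment estimate of the genetic variance share.
    """
    xs, zs = [], []
    n = len(y)
    for i in range(n):
        for j in range(i + 1, n):
            xs.append(K[i][j])
            zs.append(y[i] * y[j])
    mx = sum(xs) / len(xs)
    mz = sum(zs) / len(zs)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxz = sum((x - mx) * (z - mz) for x, z in zip(xs, zs))
    return sxz / sxx


def pcg_solve(A: Matrix, b: Vector, tol: float = 1e-10, max_iter: int = 1000) -> Vector:
    """Solve A x = b for symmetric positive-definite A with Jacobi-preconditioned CG."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                                # residual b - A x, with x = 0
    m_inv = [1.0 / A[i][i] for i in range(n)]  # inverse of the diagonal preconditioner
    z = [m_inv[i] * r[i] for i in range(n)]
    p = list(z)
    rz = sum(ri * zi for ri, zi in zip(r, z))
    for _ in range(max_iter):
        if sqrt(sum(ri * ri for ri in r)) < tol:
            break
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rz / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [m_inv[i] * r[i] for i in range(n)]
        rz_new = sum(ri * zi for ri, zi in zip(r, z))
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x
```

In a mixed-model setting, the variance estimates from the first step would define the coefficient matrix that the second step solves against. Neither step requires a matrix inverse or factorization, which is presumably what makes this combination attractive at biobank scale.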