2020
DOI: 10.1186/s13059-020-02052-w

KAML: improving genomic prediction accuracy of complex traits using machine learning determined parameters

Abstract: Advances in high-throughput sequencing technologies have reduced the cost of genotyping dramatically and led to genomic prediction being widely used in animal and plant breeding, and increasingly in human genetics. Inspired by the efficient computing of linear mixed model and the accurate prediction of Bayesian methods, we propose a machine learning-based method incorporating cross-validation, multiple regression, grid search, and bisection algorithms named KAML that aims to combine the advantages of predictio…

Citations: cited by 65 publications (53 citation statements)
References: 53 publications
“…In each year's data, both are ISR performs best, and followed BayesA, BayesB, BayesLASSO, and rrBLUP, BSLMM, BayesC, and DPR. 19,20 . In contrast, rrBLUP is the faster method, while DPR, BayesR, and BSLMM are as same as computationally efficient.…”
Section: Results (mentioning)
confidence: 99%
“…Therefore, in recent years, researchers have regarded phenotype prediction as a critical step in joint functional genomics and genome-wide research 10,18. However, with the growth of high-throughput genomics data, accurate phenotype prediction requires the development of statistical methods that can simulate all or majors SNPs simultaneously 9,19,20. Moreover, previous genome-wide association analysis studies have shown that many complex trait phenotypes and diseases have a polygenic genetic background, mainly controlled by many genetic variation sites with smaller effects.…”
Section: Introduction (mentioning)
confidence: 99%
“…The prediction accuracy of an optimal marker set depends on how well it reflects the characteristics of the markers involved in a phenotype; thus, it is important to construct a marker-set with appropriate markers 14. In this respect, many studies have adopted approaches that either directly exclude uninformative markers or assign weights to markers according to their contributions in a large set of markers [31][32][33]. These approaches have contributed to improving the accuracy of the GP, but simultaneously, it is difficult to select the appropriate markers to be excluded or the weight values to be assigned.…”
Section: Discussion (mentioning)
confidence: 99%
“…These approaches have contributed to improving the accuracy of the GP, but simultaneously, it is difficult to select the appropriate markers to be excluded or the weight values to be assigned. In particular, when these approaches are based on GWAS, obtaining robust weights is problematic due to marker effects or p-values being calculated differently according to the GWAS methods 31. Moreover, these approaches often require an amount of computation if they conduct modeling based on the large marker set.…”
Section: Discussion (mentioning)
confidence: 99%
“…Because the dataset that we used was not the original data, all the phenotype data had been standardized (mean=0, standard deviation=1). More details about the original dataset can refer to effect loci and many small effect loci (MY) and (3) many loci with small effects (SCS), respectively [21,22].…”
Section: German Holstein Dairy Cattle Dataset (mentioning)
confidence: 99%