2019
DOI: 10.3168/jds.2019-16295

Comparing regression, naive Bayes, and random forest methods in the prediction of individual survival to second lactation in Holstein cattle

Abstract: In this study, we compared multiple logistic regression, a linear method, to naive Bayes and random forest, 2 nonlinear machine-learning methods. We used all 3 methods to predict individual survival to second lactation in dairy heifers. The data set used for prediction contained 6,847 heifers born between January 2012 and June 2013, and had known survival outcomes. Each animal had 50 genomic estimated breeding values available at birth and up to 65 phenotypic variables that accumulated over time. Survival was …
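As a rough illustration of the kind of comparison the abstract describes, the sketch below fits a logistic regression and a Gaussian naive Bayes classifier to the same training set and compares their accuracy on held-out rows. Everything in it is an assumption made for illustration: the data are synthetic (two numeric predictors, label 1 standing in for "survived"), all function names are hypothetical, accuracy is used only as a convenient metric, and random forest is left out to keep the example short. It does not reproduce the study's data, variables, or evaluation procedure.

```python
import math
import random


def make_data(n, seed):
    """Synthetic stand-in data: label 1 = 'survived', two numeric predictors."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        y = rng.randint(0, 1)
        shift = 2.0 * y  # class 1 is centred at (2, 2), class 0 at (0, 0)
        rows.append(([rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)], y))
    return rows


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(rows, lr=0.2, epochs=500):
    """Logistic regression fitted by full-batch gradient descent."""
    n_feat = len(rows[0][0])
    w = [0.0] * (n_feat + 1)  # last entry is the intercept
    for _ in range(epochs):
        grad = [0.0] * (n_feat + 1)
        for x, y in rows:
            z = w[-1] + sum(wi * xi for wi, xi in zip(w, x))
            err = sigmoid(z) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
            grad[-1] += err
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    return w


def predict_logistic(w, x):
    z = w[-1] + sum(wi * xi for wi, xi in zip(w, x))
    return 1 if z > 0 else 0


def fit_gaussian_nb(rows):
    """Per-class log prior plus per-feature mean and variance."""
    model = {}
    for label in (0, 1):
        xs = [x for x, y in rows if y == label]
        stats = []
        for col in zip(*xs):
            mean = sum(col) / len(col)
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
            stats.append((mean, var))
        model[label] = (math.log(len(xs) / len(rows)), stats)
    return model


def predict_gaussian_nb(model, x):
    def score(label):
        log_prior, stats = model[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
            for v, (mean, var) in zip(x, stats)
        )
    return max(model, key=score)


def accuracy(predict, rows):
    return sum(predict(x) == y for x, y in rows) / len(rows)


train = make_data(400, seed=1)
test = make_data(200, seed=2)
w = fit_logistic(train)
nb = fit_gaussian_nb(train)
acc_lr = accuracy(lambda x: predict_logistic(w, x), test)
acc_nb = accuracy(lambda x: predict_gaussian_nb(nb, x), test)
print(f"logistic regression accuracy:  {acc_lr:.3f}")
print(f"Gaussian naive Bayes accuracy: {acc_nb:.3f}")
```

Both models are scored on the same held-out rows so that any difference reflects the method rather than the split. On real data with many correlated predictors, the independence assumption in naive Bayes and the choice of metric would both matter far more than they do in this toy setting.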

Cited by 54 publications (34 citation statements)
References 40 publications
“…In practice, these SNP effects are unknown and may not even strictly follow a certain distribution [25]. Unlike these traditional statistical models, machine learning methods do not require these prior assumptions about the genetic architecture of traits and have been applied in GWAS in humans [30] as well as in livestock [27,95]. Especially, Romagnoni et al [30] and Huang et al [24] showed that machine learning based algorithms provide promising prediction power to assess genotype-phenotype associations.…”
Section: Discussion (mentioning)
confidence: 99%
“…To overcome these limitations of GWAS, application of Bayesian frameworks as well as machine learning algorithms have gained importance in the last decade [21][22][23][24][25]. Their comparative performance has been evaluated for a variety of traits with different genetic architectures (see the reviews [13,26,27]). Nevertheless, multiple studies have revealed that machine learning algorithms surpass currently available well-known GWAS approaches in identifying genes having small effects on the phenotype [28][29][30].…”
Section: Introduction (mentioning)
confidence: 99%
“…RF are fast and easy to implement, yields highly accurate predictions and can handle a very large number of input variables without overfitting [49]. This method has been implemented in many different fields in recent years [50] including crop pest and disease prediction. Ayub [51] applied several data mining techniques such as random forest, support vector machine, neural network, k-nearest neighbors, decision tree, and Gaussian naïve bayes to predict grass grub damage and indicated that neural network and random forest performed slightly better than other classifiers.…”
Section: Introduction (mentioning)
confidence: 99%
“…Multiple studies have confirmed the superiority of machine learning algorithms compared to GWAS approaches by identifying genes having small effects on the phenotype [26, 29, 30]. Machine learning methods do not require prior assumptions about the distribution of the SNP effects, hence can be used for a wide variety of traits in humans [31], plants [28] and livestock [32, 33]. In particular, Random Forests (RF) models have been praised for their ability to analyze a large number of loci simultaneously and to identify promising associations [29, 30].…”
Section: Introduction (mentioning)
confidence: 99%