2021
DOI: 10.1101/2021.06.24.449776
Preprint

Machine Learning based Genome-Wide Association Studies for Uncovering QTL Underlying Soybean Yield and its Components

Abstract: Genome-wide association study (GWAS) is currently one of the important approaches for discovering quantitative trait loci (QTL) associated with traits of interest. However, insufficient statistical power is the limiting factor in current conventional GWAS methods for characterizing quantitative traits, especially in plants with a narrow genetic base such as soybean. In this study, we evaluated the potential use of machine learning (ML) algorithms such as support vector machine (SVR) and random forest (RF) in GWAS, c…

Cited by 3 publications (5 citation statements)
References 135 publications
“…The phenotypic evaluations and data collecting process of the tested traits are explained in detail in Yoosefzadeh-Najafabadi, Tulpan and Eskandari [5, 33]. In brief, the average yield performance of genotypes in a panel of 250 soybeans ranged between 2.58 to 5.71 ton ha⁻¹.…”
Section: Results (mentioning)
confidence: 99%
“…The most dominant strategy to increase the pace of soybean yield improvement is to select the superior genotypes based on their yield performance and its component traits. According to previous studies, soybean yield components such as NP, NRNP, RNP, and PP play important roles in determining the final yield production [33, 35]. Yoosefzadeh-Najafabadi, Tulpan and Eskandari [5] reported that PP and NP had the highest positive correlation with yield, which indicated that manipulating NP could result in significant changes in PP, leading to an increase or decrease in the formation of final soybean seed yield.…”
Section: Discussion (mentioning)
confidence: 99%
“…As demonstrated in this research, identifying fewer but important predictors yielded higher prediction accuracy as compared to fitting the model with the highest number of predictors available. Yoosefzadeh-Najafabadi et al (2021b) performed SVM, RF, ECMLM, and FarmCPU-based GWAS for soybean yield and its components including the number of reproductive nodes, non-reproductive nodes, total nodes, and total pods per plant. They found SVM to outperform all the other methodologies.…”
Section: Discussion (mentioning)
confidence: 99%
“…Besides, the applications of ML-based GWAS need to be consistently validated with significant associations that make both biological and statistical sense (Nicholls et al, 2020). To the best of our knowledge, ML-based GWAS has been applied in soybean to identify significant marker-trait associations using SVM (Yoosefzadeh-Najafabadi et al, 2021a, b), RF (Zhou et al, 2019; Xavier and Rainey, 2020; Yoosefzadeh-Najafabadi et al, 2021b), and Deep Convolutional Neural Network (CNN) (Liu et al, 2019), of which none was applied on soybean resistance to SRKN. Therefore, the objective of this study was to conduct ML-GWAS utilizing 717 diverse breeding lines derived from 330 unique bi-parental populations with two different algorithms (SVM and RF) to unveil novel regions of the soybean genome regulating the resistance to SRKN (reported as the development of galls in the roots) and contribute to developing enhanced and more durable SRKN resistance.…”
Section: Introduction (mentioning)
confidence: 99%
“…The effectiveness of using RFE was reported previously by Yoosefzadeh-Najafabadi et al (2021a) to extract the important wavelengths for predicting soybean seed yield. In addition, a few studies used ML algorithms in GWAS for detecting QTL associated with complex traits (Zhou et al, 2019; Xavier and Rainey, 2020; Najafabadi et al, 2021). In a GWAS study, Xavier and Rainey (2020) investigated the potential use of Random Forest (RF) for detecting QTL associated with soybean yield components, such as the number of pods and nodes.…”
Section: Introduction (mentioning)
confidence: 99%