2022
DOI: 10.3389/frai.2022.876578

Machine Learning Applied to the Search for Nonlinear Features in Breeding Populations

Abstract: Large plant breeding populations are traditionally a source of novel allelic diversity and are at the core of selection efforts for elite material. Finding rare diversity requires a deep understanding of biological interactions between the genetic makeup of one genotype and its environmental conditions. Most modern breeding programs still rely on linear regression models to solve this problem, generalizing the complex genotype by phenotype interactions through manually constructed linear features. However, the…

Cited by 7 publications (3 citation statements)
References 41 publications

“…A recent review by Montesinos-López et al. compared 23 independent studies on linear and nonlinear prediction performance, indicating that nonlinear models outperformed linear ones in 47% of the studies when considering gene-environment interactions (G × E) and in 56% when ignoring G × E interactions [36]. Another study by Gabur et al. using real plant breeding program data demonstrated that Machine Learning methods have the potential to outperform current approaches, increasing prediction accuracies, drastically reducing computing time, and improving the detection of important alleles involved in qualitative or quantitative traits [37]. Traditional univariate and multivariate statistics have limited efficiency in analyzing data affected by the complex interactions between genotypes and environments (G × E).…”
Section: Discussion (mentioning)
confidence: 99%
“…Modern biological data, encompassing genomic sequence analysis, SNP chip arrays, and hyperspectral phenomics, often involves high dimensionality, necessitating effective tools to understand the underlying genetic mechanisms and identify patterns associated with specific traits [31]. A profound comprehension of the biological interactions between the genetic makeup of a genotype and its environmental conditions is vital in understanding rare diversity [37]. Machine learning models have proven highly valuable, particularly in dealing with large heterogeneous datasets frequently encountered in plant breeding populations [39].…”
Section: Discussion (mentioning)
confidence: 99%
“…Recent studies have started to investigate the potential of ML for tasks related to plant breeding. ML has been used for handling genotype-by-environment interactions in multi-environmental trials (Montesinos-López et al., 2018b; Gillberg et al., 2019; Washburn et al., 2021; Westhues et al., 2021), the identification of the optimal set of markers used for prediction (Li et al., 2018a; Gabur et al., 2022), phenomic prediction and image classification (Mohanty et al., 2016; Nagasubramanian et al., 2018; Cuevas et al., 2019; Nagasubramanian et al., 2019) as well as genomic prediction (Ma et al., 2018; Azodi et al., 2019; Banerjee et al., 2020; Montesinos-López et al., 2021). The majority of recently published studies rely on genomic data as the basis of their predictions.…”
Section: Introduction (mentioning)
confidence: 99%