2021
DOI: 10.1101/2021.12.13.472185
Preprint

learnMET: an R package to apply machine learning methods for genomic prediction using multi-environment trial data

Abstract: We introduce the R-package learnMET, developed as a flexible framework to enable a collection of analyses on multi-environment trial (MET) breeding data with machine learning-based models. learnMET allows the combination of genomic information with environmental data such as climate and/or soil characteristics. Notably, the package offers the possibility of incorporating weather data from field weather stations, or can retrieve global meteorological datasets from a NASA database. Daily weather data can be aggr…

Cited by 3 publications
(5 citation statements)
References 27 publications
“…The statistical modeling approaches for the analyses of plant breeding in multi-environment trials continue evolving as more data with more complex structure are being collected in plant breeding programs (Crossa et al, 2021, Teixeira et al, 2011). For example, diverse set of methods are available for dealing with multi-dimensional data, such as the nonlinear approaches that are now being applied for modeling environmental relatedness using large-scale envirotyping data (Washburn et al, 2021; Rogers et al, 2021; Westhues et al, 2021; Costa-Neto et al, 2021a,b). Here were compared the conventional multi-environment GBLUP (M01, no enviromics) and two reaction-norm GBLUPs, the first using envirotyping data on a conventional linear kernel (M02, linear W-matrix, Jarquin et al, 2014) and the second using these data on a nonlinear Gaussian kernel (M03, nonlinear W-matrix, Costa-Neto et al, 2020b).…”
Section: Discussion
Citation type: mentioning
confidence: 99%
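As a rough illustration of the linear versus Gaussian kernel distinction mentioned in the statement above, the following is a minimal sketch and not the method of any cited paper. It assumes a small matrix `W` of already-standardized environmental covariates (rows are environments, columns are covariates). The function names, the toy data, the scaling by the number of covariates, and the `bandwidth` value are all hypothetical choices made for this example.

```python
import math


def linear_kernel(W):
    """Linear environmental relatedness: K = W W' / q, with q covariates (assumed scaling)."""
    q = len(W[0])
    return [[sum(a * b for a, b in zip(wi, wj)) / q for wj in W] for wi in W]


def gaussian_kernel(W, bandwidth=1.0):
    """Gaussian kernel: K_ij = exp(-bandwidth * ||w_i - w_j||^2 / q) (assumed scaling)."""
    q = len(W[0])
    return [
        [
            math.exp(-bandwidth * sum((a - b) ** 2 for a, b in zip(wi, wj)) / q)
            for wj in W
        ]
        for wi in W
    ]


# Hypothetical toy data: 3 environments x 2 standardized covariates.
W = [[1.0, -1.0], [0.0, 0.0], [-1.0, 1.0]]

K_lin = linear_kernel(W)                   # can be negative for contrasting environments
K_gau = gaussian_kernel(W, bandwidth=0.5)  # always in (0, 1], with 1 on the diagonal
```

In this toy case the linear kernel assigns a negative value to the two opposite environments, whereas the Gaussian kernel assigns them a small positive similarity that decays with squared distance. The bandwidth controls how quickly that decay happens.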
“…Also, environmental covariables have higher collinearity and a lack of orthogonality (Heinemann et al, 2022). Because of this, some studies have added the step of “variable selection” (e.g., Millet et al, 2019; Westhues et al, 2021; Mu et al, 2020), which could help in to overcome this issue but could cost a loss of information when trying to predict a yet-to-be-seen GxE, as will be discussed in further sections of this paper. Here we approached these issues by building non-parametric, nonlinear environmental kernels, that take into account (leveraged) some phenotypic data as prior information to adjust bandwidth factors (Costa-Neto et al, 2021b) and that are able to learn hidden nonlinearities underlying the variations between the observed macro-environmental influence (from envirotyping) and the actual phenotypic variation (the resulted GxE).…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…This dataset was balanced across environments and comprised in total 6,716 phenotypic observations. Daily weather data were obtained with the package nasapower from a satellite-based weather system, which was called in the pipeline of the learnMET package (Westhues et al, 2021b). We discarded information related to water stress patterns due to a lack of information regarding the amount and dates of irrigation.…”
Section: Genotypic Phenotypic and Environmental Datasets Used
Citation type: mentioning
confidence: 99%
“…Our objective was to compare the performance of the DTW-based environmental similarity matrix, that we described in the previous section, with classical approaches that estimate an environmental similarity matrix based on a set of environmental covariates (ECs) related to abiotic stresses calculated over day periods. A total of 11 climatic covariates were considered using the pipeline implemented in the package learnMET (Westhues et al, 2021b) (Table 4.1).…”
Section: Computing Environmental Covariates
Citation type: mentioning
confidence: 99%