Both gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-mass spectrometry (LC-MS) are widely used metabolomics approaches to detect and quantify hundreds of thousands of metabolite features. However, the application of these techniques to a large number of samples is subject to more complex interactions, particularly for genome-wide association studies (GWAS). This protocol describes an optimized metabolic workflow, which combines fast and efficient sample preparation with the analysis of a large number of samples for legume crop species. This slightly modified extraction method was initially developed for the analysis of plant and animal tissues and is based on extraction in a methyl tert-butyl ether:methanol solvent to allow the capture of polar and lipid metabolites. In addition, we provide a step-by-step guide for reducing analytical variation, a step that is essential for the high-throughput evaluation of metabolic variance in GWAS.
Introduction

Large-scale "omics" approaches have enabled the analysis of complex biological systems 1,2,3 and a deeper understanding of the link between genotypes and the resulting phenotypes 4. Metabolomics using ultra-high-performance liquid chromatography-mass spectrometry (UHPLC-MS) and GC-MS has enabled the detection of a plethora of metabolite features, of which only some are annotated to a certain degree, resulting in a high proportion of unknown metabolites.

Complex interactions can be explored by combining large-scale metabolomics with the underlying genotypic variation of a diverse population 5. However, handling large sample sets is inherently associated with analytical variation, which distorts the evaluation of metabolic variance in downstream analyses. Specifically, the major sources of analytical variation are machine performance and instrumental drift over time 6. The integration of batch-to-batch