Metabolomics, the large-scale study of the metabolic complement of the cell [1][2][3], is a mature science that has been practiced for over 20 years [4]. Indeed, it is now a commonly used experimental systems biology tool with demonstrated utility in both fundamental and applied aspects of plant, microbial and mammalian research [5][6][7][8][9][10][11][12][13][14][15]. Among the many thousands of studies published in this area over the last 20 years, notable highlights [5][6][7][8][10][11][16] are briefly described in Supplementary Note 1. Despite the insight afforded by such studies, the nature of metabolites, particularly their diversity (in both chemical structure and dynamic range of abundance [9][12]), remains a major challenge with regard to the ability to provide adequate coverage of the metabolome that can complement that achieved for the genome, transcriptome and proteome. Despite these comparative limitations, enormous advances have been made with regard to the number of analytes about which accurate quantitative information can be acquired, and a vast number of studies have yielded important biological information and biologically active metabolites across the kingdoms of life [14]. We have previously estimated that upwards of 1 million different metabolites occur across the tree of life, with between 1,000 and 40,000 estimated to occur in a single species [4].
Metabolic genome-wide association studies (mGWAS), in which metabolite levels are treated as traits, can help unravel the genetic basis of metabolic networks. A total of 309 Arabidopsis accessions were grown under two independent environmental conditions (control and stress) and subjected to untargeted LC-MS-based metabolomic profiling; levels of the obtained hydrophilic metabolites were used in GWAS. Our two-condition GWAS for more than 3000 semi-polar metabolites detected 123 highly resolved metabolite quantitative trait loci (p ≤ 1.0E-08), 24.39% of which were environment-specific. Interestingly, unlike natural variation in Arabidopsis primary metabolites, which tends to be controlled by a large number of small-effect loci, variation in secondary metabolites was controlled by several major large-effect loci alongside a vast number of small-effect loci. The two-condition GWAS was followed by integration with network-derived metabolite-transcript correlations from a time-course stress experiment. Through this integrative approach, we selected 70 key candidate associations between structural genes and metabolites and experimentally validated eight novel associations, two of which showed differential genetic regulation in the two environments studied. We demonstrate the power of combining large-scale untargeted metabolomics-based GWAS with time-course-derived networks, both performed under different abiotic environments, for identifying metabolite-gene associations, providing novel global insights into the metabolic landscape of Arabidopsis.
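The two-condition comparison described above can be illustrated with a minimal sketch. It assumes that an association counts as environment-specific when it passes the quoted threshold (p ≤ 1.0E-08) in exactly one condition; the abstract does not state the authors' actual criterion. The function names, locus labels and p-values are hypothetical and serve only as an illustration.

```python
# Minimal sketch, not the authors' pipeline. All identifiers and p-values
# below are made up for illustration.
P_THRESHOLD = 1.0e-8  # significance cutoff quoted in the abstract


def classify_mqtl(pvals_control, pvals_stress, threshold=P_THRESHOLD):
    """Label each (metabolite, locus) pair by the condition(s) in which it is significant.

    Pairs that are not significant in either condition are left out.
    A pair missing from one condition is treated as non-significant there.
    """
    labels = {}
    for key in set(pvals_control) | set(pvals_stress):
        sig_control = pvals_control.get(key, 1.0) <= threshold
        sig_stress = pvals_stress.get(key, 1.0) <= threshold
        if sig_control and sig_stress:
            labels[key] = "shared"
        elif sig_control:
            labels[key] = "control-specific"
        elif sig_stress:
            labels[key] = "stress-specific"
    return labels


def environment_specific_fraction(labels):
    """Fraction of significant pairs that are significant in only one condition."""
    if not labels:
        return 0.0
    specific = sum(1 for label in labels.values() if label != "shared")
    return specific / len(labels)


# Hypothetical p-values for four metabolite-locus pairs.
example_control = {
    ("M1", "locusA"): 1e-12,
    ("M2", "locusB"): 3e-9,
    ("M3", "locusC"): 0.2,
    ("M4", "locusD"): 5e-4,
}
example_stress = {
    ("M1", "locusA"): 4e-10,
    ("M2", "locusB"): 1e-3,
    ("M3", "locusC"): 2e-11,
}

labels = classify_mqtl(example_control, example_stress)
fraction = environment_specific_fraction(labels)
```

In this toy input, one pair is significant in both conditions, two are significant in one condition only, and one is not significant anywhere, so it is excluded from the fraction.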
Plant primary metabolism is a highly coordinated, central, and complex network of biochemical processes regulated at both the genetic and post-translational levels. The genetic basis of this network can be explored by analyzing the metabolic composition of genetically diverse genotypes in a given plant species. Here, we report an integrative strategy combining quantitative genetic mapping and metabolite–transcript correlation networks to identify functional associations between genes and primary metabolites in Arabidopsis thaliana. A genome-wide association study (GWAS) was used to identify metabolic quantitative trait loci (mQTL). Correlation networks built from metabolite and transcript data from a previously published time-course stress study yielded metabolite–transcript correlations identified by covariation. Finally, the results obtained in this study were compared with previously described mQTL. We applied a statistical framework to compare the performance of the individual methods (the network approach and the quantitative genetics methods, which represent the two orthogonal approaches combined in our strategy) with that of the combined strategy. We show that the combined strategy performs better, as manifested by increased sensitivity and accuracy. This combined strategy allowed the identification of 92 candidate associations between structural genes and primary metabolites, which not only included previously well-characterized gene–metabolite associations but also revealed novel associations. Using loss-of-function mutants, we validated two of the novel associations with genes involved in tyrosine degradation and in β-alanine metabolism. In conclusion, we demonstrate that applying our integrative strategy to the largely untapped resource of metabolite–transcript associations can facilitate the discovery of novel metabolite-related genes. This integrative strategy is not limited to A. thaliana but is generally applicable to other plant species.
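The idea of combining two orthogonal evidence sources can be sketched as follows. This is a simplified illustration and not the statistical framework used in the study: it assumes that a gene–metabolite pair is kept only when it is supported both by an mQTL and by a strong Pearson correlation in the time-course data. The correlation cutoff of 0.8, the gene names and all values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def network_pairs(metabolites, transcripts, r_cutoff=0.8):
    """Return (metabolite, gene) pairs whose absolute correlation reaches the cutoff.

    The cutoff is an assumed value chosen for illustration.
    """
    pairs = set()
    for metabolite, m_values in metabolites.items():
        for gene, g_values in transcripts.items():
            if abs(pearson(m_values, g_values)) >= r_cutoff:
                pairs.add((metabolite, gene))
    return pairs


def combined_candidates(gwas_pairs, net_pairs):
    """Keep only the pairs supported by both approaches (set intersection)."""
    return gwas_pairs & net_pairs


# Hypothetical time-course profiles (five time points each).
metabolite_profiles = {"tyrosine": [1.0, 2.0, 3.0, 4.0, 5.0]}
transcript_profiles = {
    "geneA": [2.0, 4.0, 6.0, 8.0, 10.0],  # tracks the metabolite closely
    "geneB": [5.0, 1.0, 4.0, 2.0, 3.0],   # weakly related
}

# Hypothetical pairs derived from mQTL, e.g. genes located near a lead SNP.
gwas_pairs = {("tyrosine", "geneA"), ("tyrosine", "geneC")}

net_pairs = network_pairs(metabolite_profiles, transcript_profiles)
candidates = combined_candidates(gwas_pairs, net_pairs)
```

Requiring support from both sources trades some recall for fewer false candidates; a union or a weighted score would be alternative design choices that the abstract neither confirms nor rules out.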