In the context of the Human Toxome project, we studied the mass spectrometry-based metabolomic characterization of estrogen-stimulated MCF-7 cells in order to support the untargeted deduction of pathways of toxicity. Targeted and untargeted approaches using over-representation analysis (ORA), quantitative enrichment analysis (QEA), and pathway analysis (PA) were compared with a metabolite network approach. Any untargeted approach necessarily has some noise in the data owing to artifacts, outliers, and misidentified metabolites. Depending on the analytical chemistry choices (sample extraction, chromatography, instrument and settings, etc.), only a partial representation of all metabolites will be achieved, biased by both the analytical methods and the database used to identify the metabolites. Here, we show on the one hand that a data analysis approach based exclusively on pathway annotations can miss much that is of interest and, in the case of misidentified metabolites, can report perturbed pathways that are statistically significant yet uninformative for the biological sample at hand. On the other hand, a targeted approach, by narrowing its focus and minimizing (but not eliminating) misidentifications, makes a spurious pathway much less likely, but the limited number of metabolites also makes statistical significance harder to achieve.
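To make the ORA step concrete, the following is a minimal sketch of the upper-tail hypergeometric test that over-representation analysis is commonly based on. It is an illustration under stated assumptions, not the implementation used in this study: the function name `ora_p_value` and all counts in the usage lines are hypothetical.

```python
from math import comb


def ora_p_value(n_background, n_in_pathway, n_hits, n_hits_in_pathway):
    """Upper-tail hypergeometric probability for one pathway.

    Returns the probability of seeing at least `n_hits_in_pathway`
    pathway members among `n_hits` significant metabolites drawn from
    a background of `n_background` identified metabolites, of which
    `n_in_pathway` are annotated to the pathway.
    """
    total = comb(n_background, n_hits)
    upper = min(n_in_pathway, n_hits)
    tail = sum(
        comb(n_in_pathway, i) * comb(n_background - n_in_pathway, n_hits - i)
        for i in range(n_hits_in_pathway, upper + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only (not taken from the study):
# 1000 identified metabolites, 20 annotated to one pathway,
# 53 significant metabolites, 4 of which fall in that pathway.
p = ora_p_value(1000, 20, 53, 4)
print(f"ORA p-value: {p:.4f}")
```

The sketch also shows why misidentifications matter for annotation-based analysis: the p-value depends only on counts, so a misidentified metabolite that lands in a small pathway changes `n_hits_in_pathway` and can make that pathway appear significant.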
To avoid an analysis dependent on pathways, we built a de novo network from all 53 metabolites that differed at 24 hours between cells with and without estrogen (p-value less than 0.01), using the STITCH database, which links metabolites based on known reactions in the main metabolic network pathways but also based on experimental evidence and text-mining. The resulting network contained a “connected component” of 43 metabolites and helped identify non-endogenous metabolites as well as pathways not visible to annotation-based approaches. Moreover, the most highly connected metabolites (energy metabolites such as pyruvate and alpha-ketoglutarate, as well as amino acids) showed only a modest change between proliferation with and without estrogen.
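The two network quantities mentioned above, the connected component and the most highly connected metabolites, can be sketched as follows. This is a toy illustration only: the edge list is hypothetical and is not taken from STITCH or from the study data, and the helper names `build_graph` and `largest_component` are assumptions introduced here.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Build an undirected adjacency map from (metabolite, metabolite) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def largest_component(graph, nodes):
    """Return the largest set of mutually reachable nodes (breadth-first search)."""
    seen = set()
    best = set()
    for start in nodes:
        if start in seen:
            continue
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in graph.get(node, ()):
                if neighbor not in component:
                    component.add(neighbor)
                    queue.append(neighbor)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Hypothetical toy links, for illustration only (not STITCH output).
edges = [
    ("pyruvate", "alpha-ketoglutarate"),
    ("pyruvate", "alanine"),
    ("pyruvate", "lactate"),
    ("alpha-ketoglutarate", "glutamate"),
    ("carnitine", "acetylcarnitine"),
]
# Metabolites with no links still count as nodes of the network.
nodes = {m for edge in edges for m in edge} | {"unlinked_metabolite"}

graph = build_graph(edges)
component = largest_component(graph, nodes)
# Degree = number of direct links; the hub is the most connected metabolite.
hub = max(component, key=lambda m: len(graph[m]))
print(len(component), hub)
```

In this sketch, a metabolite's connectivity is independent of how strongly its abundance changed, which is consistent with the observation that the best-connected metabolites need not be the most perturbed ones.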
Here, we demonstrate that estrogen induces subtle but potentially phenotypically important alterations in the acyl-carnitine fatty acids, acetyl-putrescine, and succinoadenosine, in addition to likely subtle changes in key energy metabolites that, however, could not be verified consistently given the technical limitations of this approach. Finally, we show that a network-based approach combined with text-mining identifies pathways that would otherwise neither be considered statistically significant on their own nor be identified via ORA, QEA, or PA.