Introduction
While the generation of reference genomes facilitates the elucidation of gene-phenome associations, reference models of the metabolome that are specific to organism, sample type (e.g., plasma, serum, urine, cell culture), and state (including disease) remain uncommon. For the study of human heart disease, a reference model describing the relationships between metabolites in plasma has not yet been determined, but it would have great utility as a baseline against which acute disease states such as myocardial infarction could be compared.
Materials and Methods
We present a methodology for deriving probabilistic models that describe the partial correlation structure of metabolite distributions ("interactomes") from metabolomics data. Because determining a partial correlation structure requires estimating p(p-1)/2 parameters for p metabolites, the dimension of the parameter search space is immense. Consequently, we have developed a Bayesian methodology for the penalized estimation of model parameters in which the magnitude of penalization is drawn from probability distributions whose hyperparameters are linked to molecular structure similarity. In our work, structural similarity was determined as the Tanimoto coefficient of algorithmically generated "atom colors" that capture the local structure around each atom within each structure. A Gibbs sampler (a Markov chain Monte Carlo technique) was implemented to simulate the posterior distribution of the model parameters. We have made software implementing this methodology publicly available in the R package BayesianGLasso.
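To make the quantities named in this section concrete, the following is a minimal Python sketch, not the BayesianGLasso implementation. It illustrates three standard definitions: the number of pairwise parameters for p metabolites, the Tanimoto coefficient between two collections of atom-color labels, and the conversion of a precision (inverse covariance) matrix into partial correlations. The function names are hypothetical, and treating atom colors as plain sets of labels is an assumption; the actual method may count repeated colors differently.

```python
from math import sqrt


def n_pairwise_parameters(p):
    """Number of distinct metabolite pairs, p(p-1)/2, for p metabolites."""
    return p * (p - 1) // 2


def tanimoto(colors_a, colors_b):
    """Tanimoto coefficient |A & B| / |A | B| between two sets of labels.

    Assumption: atom colors are treated as sets (duplicates ignored).
    """
    a, b = set(colors_a), set(colors_b)
    if not a and not b:
        return 1.0  # convention assumed here for two empty label sets
    return len(a & b) / len(a | b)


def partial_correlations(precision):
    """Convert a precision matrix (nested lists) into partial correlations.

    Uses the standard identity rho_ij = -omega_ij / sqrt(omega_ii * omega_jj).
    """
    p = len(precision)
    rho = [[1.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if i != j:
                scale = sqrt(precision[i][i] * precision[j][j])
                rho[i][j] = -precision[i][j] / scale
    return rho
```

For example, `n_pairwise_parameters(100)` gives 4950 parameters for only 100 metabolites, which illustrates why unpenalized estimation quickly becomes impractical.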
Results / Conclusions
First, we demonstrate robust performance of our methodology (sensitivity, specificity, and measures of accuracy) in recovering the true underlying partial correlation structure from simulated datasets (with simulated metabolite abundances and known simulated structural similarity). We then present an interactome model for stable heart disease inferred from non-targeted mass spectrometry data via this methodology. Inspection of the local graph topology around cholate reveals probabilistic interactions with other primary bile acids, secondary bile acids, and many steroid hormones sharing the same precursors.
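The performance measures mentioned above can be sketched as follows. This is a minimal illustration, assuming that both the true and the inferred partial correlation structures have already been reduced to sets of undirected edges between metabolite indices. The rule for declaring an edge is not specified here, so it is left outside the sketch, and the function name is hypothetical.

```python
def edge_recovery_metrics(true_edges, inferred_edges, p):
    """Sensitivity, specificity, and accuracy of edge recovery.

    Edges are undirected pairs (i, j) over p nodes; (i, j) equals (j, i).
    """
    true_set = {frozenset(e) for e in true_edges}
    inferred_set = {frozenset(e) for e in inferred_edges}
    total_pairs = p * (p - 1) // 2

    tp = len(true_set & inferred_set)   # true edges that were recovered
    fp = len(inferred_set - true_set)   # inferred edges that are not true
    fn = len(true_set - inferred_set)   # true edges that were missed
    tn = total_pairs - tp - fp - fn     # absent edges correctly left out

    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        "accuracy": (tp + tn) / total_pairs if total_pairs else float("nan"),
    }
```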