Inferring genome-scale metabolic networks in emerging model organisms is challenging because biochemical knowledge is incomplete and biochemical pathways are only partially conserved during evolution. This limits the automatic transfer of knowledge from well-established model organisms. Specific bioinformatic tools are therefore necessary to infer new biochemical reactions and new metabolic structures that can be checked experimentally. Using an integrative approach combining genomic and metabolomic data in the red algal model Chondrus crispus, we show that even metabolic pathways considered conserved, such as the sterol and mycosporine-like amino acid (MAA) synthesis pathways, undergo substantial turnover. This phenomenon, which we formally define as "metabolic pathway drift", is consistent with findings from other areas of evolutionary biology indicating that a given phenotype can be conserved even if the underlying molecular mechanisms change. We present a proof of concept with a new methodological approach that formalizes the logical reasoning necessary to infer new reactions and new molecular structures from previous biochemical knowledge. We use this approach to infer previously unknown reactions in the sterol and MAA pathways.
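As an illustration of the kind of reasoning involved, the following minimal sketch proposes candidate reactions between metabolites by matching the net change in atom counts against known reaction types. It is a toy example under stated assumptions, not the method developed in this work: the metabolite names and formulas (`toy_A`, `toy_B`, `toy_C`) are hypothetical and do not correspond to real sterols or MAAs, and comparing elemental formulas is a much weaker criterion than reasoning on molecular structures.

```python
from collections import Counter
from itertools import permutations

# Net atom-count change (product minus substrate) for a few generic reaction types.
TEMPLATES = {
    "decarboxylation": {"C": -1, "O": -2},  # loss of CO2
    "dehydration": {"H": -2, "O": -1},      # loss of H2O
    "methylation": {"C": 1, "H": 2},        # net gain of CH2
}

# Hypothetical metabolites (illustrative formulas only, not real compounds of the pathways).
FORMULAS = {
    "toy_A": {"C": 10, "H": 16, "N": 2, "O": 6},
    "toy_B": {"C": 9, "H": 16, "N": 2, "O": 4},
    "toy_C": {"C": 9, "H": 14, "N": 2, "O": 3},
}


def delta(substrate, product):
    """Return the atom-count difference product - substrate, dropping zero entries."""
    diff = Counter(product)
    diff.subtract(substrate)
    return {atom: n for atom, n in diff.items() if n != 0}


def propose_reactions(formulas, templates):
    """List (substrate, reaction type, product) triples whose formula change matches a template."""
    proposals = []
    for sub, prod in permutations(sorted(formulas), 2):
        change = delta(formulas[sub], formulas[prod])
        for name, template in templates.items():
            if change == template:
                proposals.append((sub, name, prod))
    return proposals


proposals = propose_reactions(FORMULAS, TEMPLATES)
for sub, name, prod in proposals:
    print(f"{sub} --{name}--> {prod}")
```

With these toy inputs, the sketch proposes a decarboxylation followed by a dehydration. A candidate produced this way is only a hypothesis that would still have to be checked against structural and experimental evidence.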
Author summary
Genome-scale metabolic models describe our current understanding of all metabolic pathways occurring in a given organism. For emerging model species, for which few biochemical data are available on the enzymatic activities that actually occur, such metabolic models are mainly built by transferring knowledge from other, better-studied species, on the assumption that the same genes have the same function in the compared species. However, integrating metabolomic data into genome-scale metabolic models leads to situations where gaps in pathways cannot be filled by known enzymatic reactions from existing databases. This is due to structural variation in metabolic pathways across evolutionary time. In such cases, it is necessary to use complementary approaches to infer new reactions and new metabolic intermediates using logical reasoning, based on the available partial biochemical knowledge. Here we present a proof of concept that this is feasible and leads to hypotheses that are precise enough to be a starting point for new experimental work.
... the well-studied human pathogenic bacterium Mycobacterium tuberculosis, high-throughput metabolomic screens revealed an unexpected diversity of reactions in central carbon metabolism [9]. Evolutionary models have already been developed to explain the emergence of new pathways, with most experimental validations so far focused on individual enzyme activities [10]. The complementary question, how much conserved ...
... corresponding to Z-palythenic acid in C. crispus extracts, which does not support dehydration occurring before decarboxylation, as proposed in other species (Fig 4 and [65]). The structure of a new intermediate was therefore inferred manually, leading to MAA2 in Fig 8. Calculating its m/z ratio,...