Genome-scale metabolic networks have been reconstructed for several organisms. These networks provide detailed information about cellular metabolism, coupled with genomic, proteomic and thermodynamic information. They are widely simulated using 'constraint-based' modelling techniques and find applications ranging from strain improvement for metabolic engineering to prediction of drug targets in pathogenic organisms. Components of these metabolic networks are represented in multiple file formats and markup languages, with varying levels of annotation; this leads to inconsistencies and complicates the comparison and analysis of reconstructions across platforms. In this work, we critically examine nearly 100 published genome-scale metabolic networks and their corresponding constraint-based models and discuss various issues with respect to model quality. One of the major concerns is the lack of annotations using standard identifiers that can uniquely describe components such as metabolites, genes, proteins and reactions. We also find that many models lack complete information on the reaction flux constraints and objective functions needed to carry out simulations. Overall, our analysis highlights the need for a widely acceptable standard for representing constraint-based models. A rigorous standard can help streamline the process of reconstruction and improve the quality of reconstructed metabolic models.
Exhaustive identification of all possible alternate pathways in metabolic networks can provide valuable insights into cellular metabolism. With the growing number of metabolic reconstructions, there is a need for an efficient method to enumerate pathways that also scales well to large metabolic networks, such as those corresponding to microbial communities. We developed MetQuest, an efficient graph-theoretic algorithm to enumerate all possible pathways of a particular size between a given set of source and target molecules. Our algorithm employs a guided breadth-first search to identify all feasible reactions based on the availability of precursor molecules, followed by a novel dynamic-programming-based enumeration, which assembles these reactions into pathways of a specified size that produce the target from the source. We demonstrate several interesting applications of our algorithm, ranging from identifying amino acid biosynthesis pathways to identifying the most diverse pathways involved in the degradation of complex molecules. We also illustrate the scalability of our algorithm by studying large graphs, such as those corresponding to microbial communities, and identify several metabolic interactions occurring therein. MetQuest is available as a Python package, and the source code can be found at https://github.com/RamanLab/metquest.
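The first stage described above, a search that admits a reaction only once all of its precursors are available, can be illustrated with a minimal sketch. This is not MetQuest's actual implementation or API: the function name `guided_bfs`, the dictionary layout of `reactions`, and the toy network are all assumptions made for illustration, and the dynamic-programming pathway enumeration is not shown.

```python
def guided_bfs(reactions, seeds):
    """Expand the set of reachable metabolites from a seed set.

    Hypothetical data layout (an assumption, not MetQuest's format):
    `reactions` maps a reaction name to a (substrates, products) pair.
    A reaction is admitted only when every substrate is already available.
    Returns the reachable metabolites and the round in which each
    reaction first became feasible.
    """
    scope = set(seeds)      # metabolites available so far
    visited = {}            # reaction name -> round in which it first fired
    level = 0
    while True:
        level += 1
        # Reactions not yet used whose substrates are all in the current scope
        fired = [
            name
            for name, (substrates, _) in reactions.items()
            if name not in visited and set(substrates) <= scope
        ]
        if not fired:
            break           # no new reaction is feasible: the search is done
        for name in fired:
            visited[name] = level
            scope |= set(reactions[name][1])   # products become available
    return scope, visited


# Toy network (illustrative only)
toy_reactions = {
    "R1": (["A"], ["B"]),
    "R2": (["B", "C"], ["D"]),
    "R3": (["D"], ["E"]),
    "R4": (["F"], ["G"]),   # never feasible: F is not producible from the seeds
}
scope, visited = guided_bfs(toy_reactions, seeds={"A", "C"})
print(sorted(scope))
print(visited)
```

In this toy network, R2 is held back until R1 has produced B, and R4 is never admitted because its precursor F is unavailable. Restricting the search to feasible reactions in this way is what keeps the later enumeration step from considering pathways that could not operate from the given source molecules.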