We present a new algorithm to canonize molecular graphs using the signature molecular descriptor introduced in the previous papers of this series. While developed specifically for molecular structures, the algorithm can be used for any graph and is not limited to acyclic graphs, planar graphs, bounded valence, or bounded genus graphs, for which polynomial time algorithms exist. The algorithm is tested with benzenoid hydrocarbons and a database of 126,705 organic compounds. The algorithm's performance is compared against Brendan McKay's Nauty algorithm, which is believed to be the fastest graph canonization algorithm for general graphs, with five series of graphs each comprising up to 30,000 vertices: 2D meshes (pericondensed benzenoids), 3D cages (fullerenes and nanotubes), 3D meshes (crystal lattices), 4D cages, and power law graphs (protein and gene networks). The algorithm can be downloaded as open-source code at http://www.cs.sandia.gov/~jfaulon/QSAR.
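To make the notion of canonization concrete, the following minimal sketch computes a canonical form for a very small graph by brute force: it tries every relabeling of the vertices and keeps the lexicographically smallest edge list. This is not the signature-based algorithm of the abstract, nor Nauty; it runs in factorial time and is only meant to show the property any canonizer must satisfy, namely that two labelings of the same graph yield the same output. The function name `canonical_form` and the example graphs are assumptions made for illustration.

```python
from itertools import permutations


def canonical_form(n, edges):
    """Brute-force canonical form of an undirected graph on vertices 0..n-1.

    Illustrative only: tries all n! relabelings and returns the
    lexicographically smallest sorted edge list as a tuple.
    """
    best = None
    for perm in permutations(range(n)):
        # Relabel each edge, normalizing so the smaller endpoint comes first.
        relabeled = tuple(sorted(tuple(sorted((perm[u], perm[v]))) for u, v in edges))
        if best is None or relabeled < best:
            best = relabeled
    return best


# Two different labelings of the same 4-vertex path, and a 4-vertex star.
path_a = [(0, 1), (1, 2), (2, 3)]
path_b = [(2, 0), (0, 3), (3, 1)]
star = [(0, 1), (0, 2), (0, 3)]

same = canonical_form(4, path_a) == canonical_form(4, path_b)
different = canonical_form(4, path_a) != canonical_form(4, star)
```

Here `same` and `different` both evaluate to true: isomorphic graphs share a canonical form, while non-isomorphic ones do not. Practical canonizers avoid the exhaustive search by first partitioning vertices into candidate equivalence classes.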
Background: How enzymes evolved to their present form is linked to how extant metabolic pathways emerged. Results: Chemical diversity of reactions parallels enzyme phylogenetic diversity across the tree of life. Conclusion: Enzyme promiscuity plays a prominent role in the evolution of metabolic networks. Significance: Learning about the mechanisms of enzyme evolution might assist us with the identification of primeval catalytic functions and minimal metabolism.
The graph isomorphism problem belongs to the class of NP problems, and has been conjectured intractable, although probably not NP-complete. However, in the context of chemistry, because molecules are a restricted class of graphs, the problem of graph isomorphism can be solved efficiently (i.e., in polynomial-time). This paper presents the theoretical results that for all molecules, the problems of isomorphism, automorphism partitioning, and canonical labeling are polynomial-time problems. Simple polynomial-time algorithms are also given for planar molecular graphs and used for automorphism partitioning of paraffins, polycyclic aromatic hydrocarbons (PAHs), fullerenes, and nanotubes.
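As an illustration of what automorphism partitioning aims at, the sketch below applies iterative color refinement to a molecular graph: each atom's class is repeatedly split according to the classes of its neighbors until the partition stops changing. This is a generic polynomial-time heuristic, not the algorithm of the paper. Atoms in the same automorphism orbit always end up in the same class, but the converse can fail for some graphs, so the result is only an upper bound on the true orbits. The function name `refine_partition` and the adjacency-dictionary representation are assumptions.

```python
def refine_partition(adjacency, initial_colors):
    """Iteratively refine vertex colors by the multiset of neighbor colors.

    adjacency: dict mapping each vertex to a list of its neighbors.
    initial_colors: dict mapping each vertex to a sortable label
                    (for example the element symbol).
    Returns a dict mapping each vertex to an integer class id.
    """
    colors = dict(initial_colors)
    while True:
        # A vertex's signature is its own color plus its sorted neighbor colors.
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adjacency[v])))
            for v in adjacency
        }
        # Rank the distinct signatures to obtain compact integer colors.
        ranks = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        new_colors = {v: ranks[signatures[v]] for v in adjacency}
        # Refinement can only split classes, so an unchanged count means stable.
        if len(set(new_colors.values())) == len(set(colors.values())):
            return new_colors
        colors = new_colors


# Carbon skeleton of n-butane (a paraffin): C0-C1-C2-C3.
butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
classes = refine_partition(butane, {v: "C" for v in butane})
```

For the butane skeleton, the two terminal carbons fall into one class and the two inner carbons into another, which matches the orbits of the molecule's mirror symmetry.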
Optimization of biological networks is often limited by wet lab labor and cost, and the lack of convenient computational tools. Here, we describe METIS, a versatile active machine learning workflow with a simple online interface for the data-driven optimization of biological targets with minimal experiments. We demonstrate our workflow for various applications, including cell-free transcription and translation, genetic circuits, and a 27-variable synthetic CO2-fixation cycle (CETCH cycle), improving these systems by one to two orders of magnitude. For the CETCH cycle, we explore 10^25 conditions with only 1,000 experiments to yield the most efficient CO2-fixation cascade described to date. Beyond optimization, our workflow also quantifies the relative importance of individual factors to the performance of a system, identifying unknown interactions and bottlenecks. Overall, our workflow opens the way for convenient optimization and prototyping of genetic and metabolic networks with customizable adjustments according to user experience, experimental setup, and laboratory facilities.