Chemotypes are a new approach for representing molecules, chemical substructures and patterns, reaction rules, and reactions. Chemotypes can integrate types of information beyond what is possible with current substructure representations (e.g., SMARTS patterns) or reaction transformations (e.g., SMIRKS, reaction SMILES). Chemotypes are expressed in the XML-based Chemical Subgraphs and Reactions Markup Language (CSRML) and can encode not only connectivity and topology but also properties of atoms, bonds, electronic systems, or molecules. CSRML has been developed in parallel with a public set of chemotypes, the ToxPrint chemotypes, which are designed to provide excellent coverage of environmental, regulatory, and commercial-use chemical space, as well as to represent chemical patterns and properties especially relevant to various toxicity concerns. A software application, ChemoTyper, has also been developed and made publicly available to enable chemotype searching and fingerprinting against a target structure set. The public ChemoTyper houses the ToxPrint chemotype CSRML dictionary, as well as a reference implementation, so that the query specifications may be adopted by other chemical structure knowledge systems. The full specifications of the XML-based CSRML standard used to express chemotypes are publicly available to facilitate and encourage the exchange of structural knowledge.
The EC number system for the classification of enzymes uses different criteria, such as the reaction pattern, the nature of the substrate, the type of transferred group, or the type of acceptor group. These criteria are used with different emphasis for the various enzyme classes and thus do not contribute much to an understanding of the mechanisms of enzyme-catalyzed reactions. To explore the reasons for bonds being broken in enzyme-catalyzed metabolic reactions, we calculated physicochemical effects for the bonds reacting in the substrates of these enzymatic reactions. These descriptors allow the definition of similarities within these reactions and thus can serve as a method for the classification of enzyme reactions. To foster an understanding of the investigations performed here, we compare the similarities found on the basis of the physicochemical effects with the EC number classification. To allow a reasonable comparison, we selected enzymatic reactions for which the EC number system is largely built on criteria based on the reaction mechanism. This is true for hydrolysis reactions, which fall into the domain of EC class 3 (EC 3.b.c.d). The comparison is made with a Kohonen neural network, which is based on an unsupervised learning algorithm. For these hydrolysis reactions, the similarity analysis on physicochemical effects produces results that are, by and large, similar to the EC number classification. However, this similarity analysis reveals finer details of the enzymatic reactions and thus can provide a better basis for the mechanistic comparison of enzymes.
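As a rough illustration of how a Kohonen network (self-organizing map) groups reactions by descriptor similarity without using class labels, the following is a minimal sketch in plain Python. It is not the authors' implementation: the grid size, the learning-rate and radius schedules, and the three-component descriptor vectors in `samples` are all assumptions made for the example, not values from the study.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position of the neuron whose weights are closest to x."""
    return min(
        weights,
        key=lambda rc: sum((wi - xi) ** 2 for wi, xi in zip(weights[rc], x)),
    )


def train_som(samples, rows=3, cols=3, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small Kohonen map; returns {(row, col): weight vector}."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    dim = len(samples[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                      # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 0.3 * frac   # radius shrinks from sigma0 towards 0.3
        for x in samples:
            br, bc = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                # Gaussian neighbourhood: neurons near the winner move more
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


# Hypothetical descriptor vectors for reacting bonds, scaled to [0, 1].
samples = [
    [0.10, 0.20, 0.10],
    [0.15, 0.10, 0.20],
    [0.90, 0.80, 0.90],
    [0.85, 0.90, 0.80],
]
som = train_som(samples)
positions = [best_matching_unit(som, x) for x in samples]
```

After training, `positions` holds the map coordinates assigned to each reaction. Reactions that land on the same or neighbouring neurons are the ones the map treats as similar, and those groupings can then be compared with an external classification such as EC numbers.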
The correct identification of the reacting bonds and atoms is a prerequisite for the analysis of a reaction mechanism. We have recently developed a method based on the Imaginary Transition State Energy Minimization (ITSE) approach for automatically determining the reaction center information and the atom-atom mapping numbers. Here we test the accuracy of the ITSE approach by comparing its predictions against more than 1500 manually annotated reactions from BioPath, a comprehensive database of biochemical reactions. The results show high agreement between manually annotated mappings and computational predictions (98.4%), with significant discrepancies in only 24 cases out of 1542 (1.6%). This result validates both the computational prediction and the database: the predictions agree with expert knowledge, and the database appears largely self-consistent and consistent with a simple principle. In 10 of the discrepant cases, simple chemical arguments or independent literature studies support the predicted reaction center. For five reaction instances, the differences between the automatically and manually annotated mappings are described in detail. Finally, in approximately 200 cases the algorithm finds alternate reaction centers, which need to be studied on a case-by-case basis, as the exact choice of the alternative may depend on the enzyme catalyzing the reaction.
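The reported percentages follow directly from the counts, and the comparison itself can be pictured as checking, reaction by reaction, whether two annotations mark the same set of reacting bonds. The sketch below assumes a reaction center can be written as a set of unordered atom pairs keyed by atom-atom mapping numbers. The reaction identifiers and bond lists are invented for illustration and are not BioPath data.

```python
def canonical_centre(bonds):
    """Order-independent form of a reaction center: a set of unordered atom pairs."""
    return frozenset(frozenset(pair) for pair in bonds)


def agreement(manual, predicted):
    """Return (n_agree, n_total, percent) over reaction ids present in both dicts."""
    shared = sorted(manual.keys() & predicted.keys())
    n_agree = sum(
        canonical_centre(manual[r]) == canonical_centre(predicted[r]) for r in shared
    )
    percent = 100.0 * n_agree / len(shared) if shared else 0.0
    return n_agree, len(shared), percent


# Hypothetical annotations: reaction id -> list of reacting bonds (mapped atom pairs).
manual = {
    "rxn_a": [(1, 2), (3, 4)],
    "rxn_b": [(5, 6)],
    "rxn_c": [(7, 8)],
}
predicted = {
    "rxn_a": [(4, 3), (2, 1)],  # same bonds, different order: counts as agreement
    "rxn_b": [(5, 6)],
    "rxn_c": [(7, 9)],          # different bond: counts as a discrepancy
}
n_agree, n_total, percent = agreement(manual, predicted)

# Arithmetic check of the figures quoted in the abstract.
reported_agreement = round(100.0 * (1542 - 24) / 1542, 1)   # 98.4
reported_discrepancy = round(100.0 * 24 / 1542, 1)          # 1.6
```

An exact-match criterion like this one is deliberately strict. Cases where the algorithm proposes alternate but chemically plausible reaction centers would need a separate comparison rule.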
The Biochemical Pathways Wall Chart (http://www.expasy.org/tools/pathways/ref.1) has been converted into a molecule and reaction database. Two major features distinguish this database: each molecule is represented by lists of all its atoms and bonds (as connection tables), and in each reaction the reaction centre (the atoms and bonds directly involved in the bond rearrangement process) is marked. The information in the database has been enriched by a set of diverse 3D structure conformations generated by the programs CORINA and ROTATE. The web-based structure and reaction retrieval system C@ROL provides a wide range of search methods to mine this rich database. The database is accessible at http://www2.chemie.uni-erlangen.de/services/biopath/index.html and http://www.mol-net.de/databases/biopath.html .
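To make the idea of a connection table with a marked reaction centre concrete, here is a minimal sketch using Python dataclasses. The class names, the `in_reaction_centre` flag, and the methyl acetate example (heavy atoms only, with the acyl C-O bond flagged as the bond typically cleaved in ester hydrolysis) are assumptions for illustration. They do not reflect the actual BioPath schema or file format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    index: int     # position in the molecule's atom list
    element: str   # element symbol


@dataclass(frozen=True)
class Bond:
    begin: int                        # index of the first atom
    end: int                          # index of the second atom
    order: int                        # 1 = single, 2 = double, ...
    in_reaction_centre: bool = False  # hypothetical flag for reacting bonds


@dataclass
class Molecule:
    name: str
    atoms: list
    bonds: list

    def reaction_centre_bonds(self):
        """Bonds flagged as taking part in the bond rearrangement."""
        return [b for b in self.bonds if b.in_reaction_centre]

    def reaction_centre_atoms(self):
        """Atoms at either end of a flagged bond, in atom-list order."""
        idx = sorted({i for b in self.reaction_centre_bonds() for i in (b.begin, b.end)})
        return [self.atoms[i] for i in idx]


# Illustrative substrate: methyl acetate, hydrogens omitted.
methyl_acetate = Molecule(
    name="methyl acetate",
    atoms=[Atom(0, "C"), Atom(1, "C"), Atom(2, "O"), Atom(3, "O"), Atom(4, "C")],
    bonds=[
        Bond(0, 1, 1),
        Bond(1, 2, 2),
        Bond(1, 3, 1, in_reaction_centre=True),  # acyl C-O bond
        Bond(3, 4, 1),
    ],
)
```

Storing the flag on the bond keeps the reaction centre queryable without a separate lookup table, which is one simple way to support searches restricted to the reacting part of a structure.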