A reaction center is the part of a chemical reaction that undergoes changes, the heart of the chemical reaction. The reaction atom–atom mapping indicates which reactant atom becomes which product atom during the reaction. Automatic reaction mapping and reaction center detection are of great importance in many applications, such as developing chemical and biochemical reaction databases and studying reaction mechanisms. Traditional reaction mapping algorithms are based on either extended-connectivity or maximum common substructure (MCS) algorithms. With the development of several biochemical reaction databases (such as the KEGG database) and increasing interest in studying metabolic pathways in recent years, several novel reaction mapping algorithms have been developed to serve the new needs. Most of the new algorithms are optimization based, designed to find optimal mappings with the minimum number of broken and formed bonds. Some algorithms also incorporate chemical knowledge into the search process in the form of bond weights. Some of the new algorithms showed better accuracy and performance than the MCS-based method. WIREs Comput Mol Sci 2013, 3:560–593. doi: 10.1002/wcms.1140 This article is categorized under: Computer and Information Science > Chemoinformatics
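The optimization criterion described above (fewest broken and formed bonds) can be illustrated with a minimal sketch. This is not any of the reviewed algorithms: it is a brute-force search over element-preserving permutations, usable only for toy inputs. The function names `bond_changes` and `best_mapping` are hypothetical, bond orders and hydrogens are ignored, and the reaction is assumed to be atom-balanced.

```python
from itertools import permutations

def bond_changes(reactant_bonds, product_bonds, mapping):
    """Count bonds broken plus bonds formed under an atom mapping.

    mapping[i] is the product atom index assigned to reactant atom i.
    Bonds are unordered pairs of atom indices; bond order is ignored
    (a simplifying assumption of this sketch).
    """
    mapped = {frozenset((mapping[a], mapping[b])) for a, b in reactant_bonds}
    product = {frozenset(bond) for bond in product_bonds}
    # Bonds only in `mapped` were broken; bonds only in `product` were formed.
    return len(mapped ^ product)

def best_mapping(reactant_elements, reactant_bonds,
                 product_elements, product_bonds):
    """Exhaustively search for a mapping with the fewest bond changes."""
    n = len(reactant_elements)
    best, best_cost = None, None
    for perm in permutations(range(n)):
        # An atom can only map to an atom of the same element.
        if any(reactant_elements[i] != product_elements[perm[i]]
               for i in range(n)):
            continue
        cost = bond_changes(reactant_bonds, product_bonds, perm)
        if best_cost is None or cost < best_cost:
            best, best_cost = perm, cost
    return best, best_cost

# Toy example (heavy atoms only): CH3Br + OH- -> CH3OH + Br-
reactant_elements = ["C", "Br", "O"]
reactant_bonds = [(0, 1)]            # C-Br
product_elements = ["C", "O", "Br"]
product_bonds = [(0, 1)]             # C-O

mapping, cost = best_mapping(reactant_elements, reactant_bonds,
                             product_elements, product_bonds)
print(mapping, cost)
```

In the toy example one C-Br bond is broken and one C-O bond is formed, so the minimum cost is 2. Bond weights, as mentioned in the abstract, could be sketched by summing a per-bond weight instead of counting each changed bond as 1.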
The wide application of next-generation sequencing has presented a new hurdle to bioinformatics for managing the fast-growing sequence data. The management of biomacromolecules at the chemistry level imposes an even greater challenge in cheminformatics because of the lack of a good chemical representation of biopolymers. Here we introduce the self-contained sequence representation (SCSR). SCSR combines the best features of bioinformatics and cheminformatics notations. SCSR is the first general, extensible, and comprehensive representation of biopolymers in a compressed format that retains chemistry detail. The SCSR-based high-performance exact structure and substructure searching methods (NEMA key and SSS) offer new ways to search biopolymers that complement bioinformatics approaches. The widely used chemical structure file format (molfile) has been enhanced to support SCSR. SCSR offers a solid framework for future development of new methods and systems for managing and handling sequences at the chemistry level. SCSR lays the foundation for the integration of bioinformatics and cheminformatics.
Biomolecules present challenges to chemical information systems designed for small molecules. Their sizes, up to tens of thousands of atoms, overwhelm representation/storage/searching solutions built on explicit chemical representation of the structures. But biomolecules are largely made up of many repeats of a limited number of building-block molecules, a fact which has been used to provide a compressed representation for biomolecules using templates for the building blocks. We have adopted a modified template-based representation for biomolecules. Our primary interest is in the chemically modified portions of biomolecules, for which we choose to use explicit chemistry. These areas of explicit chemistry are then embedded in the template-compressed, unmodified portions of the full biomolecule. The regions containing explicit chemistry are indexed, and thus can be structure searched with good performance. A limited number of residues surrounding explicit chemistry regions are included in the index for searching the context of these explicit regions. By using explicit chemistry to represent modified regions we can search across classes of modifications for common features. For example, a single substructure search query will find green fluorescent protein, and its histidine, phenylalanine and tryptophan analogs. Templates are stored with the structure, providing a self-contained file format. The use of NEMA keys allows templates from different structures to be compared, and allows storage of structures containing a canonical list of templates. The residues have defined attachment points, allowing automated traversal of a protein backbone, or location of non-backbone bonds to residues. We will present example structures and structural queries highlighting capabilities of our representation.
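The idea of indexing explicit-chemistry regions together with a few flanking residues can be sketched as follows. This is an illustrative data model only, not the authors' file format or indexing code: the `Residue` class, the `index_positions` function, the `flank` parameter, and the placeholder string for explicit chemistry are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Residue:
    # Name of a building-block template (e.g. "Gly"); None for explicit residues.
    template: Optional[str] = None
    # Hypothetical placeholder for explicitly drawn chemistry; None for
    # residues that are stored only as a template reference.
    explicit: Optional[str] = None

def index_positions(residues: List[Residue], flank: int = 1) -> List[int]:
    """Return the positions to index for structure searching.

    Every residue with explicit chemistry is kept, plus `flank` residues
    on each side so that the context of the modification is searchable.
    """
    keep = set()
    for i, residue in enumerate(residues):
        if residue.explicit is not None:
            lo = max(0, i - flank)
            hi = min(len(residues) - 1, i + flank)
            keep.update(range(lo, hi + 1))
    return sorted(keep)

# A six-residue chain with one chemically modified residue at position 2.
chain = [
    Residue(template="Gly"),
    Residue(template="Ala"),
    Residue(explicit="<explicit chemistry placeholder>"),
    Residue(template="Ser"),
    Residue(template="Leu"),
    Residue(template="Val"),
]

positions = index_positions(chain, flank=1)
print(positions)
```

With `flank=1`, only positions 1 to 3 are indexed; the remaining residues stay as compact template references.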