Information about the unknown chemical structure of an organic compound can be obtained by comparing its infrared spectrum with the spectra in a spectral library. The resulting hitlist contains the compounds with the most similar spectra. A method based on the maximum common substructure concept has been developed to extract common structural features automatically from the hitlist structures. It yields a set of substructures that are characteristic of the query structure. The results can be used as structural restrictions in isomer generation.
A substructure isomorphism matrix of size n × p contains binary elements describing which of the p given query structures (substructures) are part of the n given target structures (molecular structures). Such a matrix can be used to investigate the diversity of the target structures and allows the characterization and comparison of structural libraries. A square substructure isomorphism matrix of size n × n is obtained if the same structures are used both as molecular structures and as substructures; this matrix contains full information about the topological hierarchy of the n structures. A hierarchical arrangement of chemical structures is useful for evaluating results obtained from searches in structure databases.
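The construction of such a matrix can be illustrated with a minimal Python sketch. It is not the authors' implementation: the hydrogen-suppressed graph representation, the helper names (`make`, `is_substructure`, `isomorphism_matrix`), and the example molecules are assumptions chosen for illustration, and the brute-force matching is only practical for very small structures.

```python
from itertools import permutations

def make(atoms, bonds):
    """Build a hydrogen-suppressed structure: element list plus {atom pair: bond order}."""
    return list(atoms), {frozenset((i, j)): order for i, j, order in bonds}

def is_substructure(query, target):
    """Brute-force test: can the query atoms be mapped onto distinct target atoms
    so that elements match and every query bond exists with the same order?"""
    q_atoms, q_bonds = query
    t_atoms, t_bonds = target
    if len(q_atoms) > len(t_atoms):
        return False
    for mapping in permutations(range(len(t_atoms)), len(q_atoms)):
        if any(q_atoms[i] != t_atoms[mapping[i]] for i in range(len(q_atoms))):
            continue
        bonds_ok = True
        for bond, order in q_bonds.items():
            a, b = tuple(bond)
            if t_bonds.get(frozenset((mapping[a], mapping[b]))) != order:
                bonds_ok = False
                break
        if bonds_ok:
            return True
    return False

def isomorphism_matrix(targets, queries):
    """Rows are target structures, columns are query structures (n x p)."""
    return [[int(is_substructure(q, t)) for q in queries] for t in targets]

# Illustrative targets (hypothetical example data, hydrogens omitted).
ethanol = make("CCO", [(0, 1, 1), (1, 2, 1)])
acetic_acid = make("CCOO", [(0, 1, 1), (1, 2, 2), (1, 3, 1)])
acetone = make("CCCO", [(0, 1, 1), (1, 2, 1), (1, 3, 2)])

# Illustrative query substructures: C-O, C=O, C-C.
c_o_single = make("CO", [(0, 1, 1)])
c_o_double = make("CO", [(0, 1, 2)])
c_c_single = make("CC", [(0, 1, 1)])

targets = [ethanol, acetic_acid, acetone]
matrix = isomorphism_matrix(targets, [c_o_single, c_o_double, c_c_single])
square = isomorphism_matrix(targets, targets)  # n x n hierarchy matrix
```

In the n × n case, an off-diagonal 1 in row i, column j indicates that structure j is contained in structure i, which is the information a hierarchical arrangement would be built from.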
Comparing the infrared spectrum of a compound whose chemical structure is unknown with the spectra of a library is a routinely used method to obtain information about the unknown structure. The resulting hitlist contains the compounds with the most similar spectra. If the unknown is not contained in the library, a method based on the maximum common substructure concept can be applied to extract common structural features from the hitlist structures. The result is a set of substructures that are characteristic of the query structure. This approach has been applied to infrared spectra from a series of model compounds and has been compared with information obtained from mass spectra by the same procedure. A complementary chemometric method for evaluating spectral hitlists is principal component analysis of spectral and structural data.
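The maximum common substructure idea can be sketched for a pair of hitlist structures as follows. This is a hedged, brute-force illustration, not the published procedure: it searches for the largest connected atom subset of one structure whose induced fragment also occurs in the other, and the graph representation, function names, and example molecules are all assumptions made for this sketch.

```python
from itertools import combinations, permutations

def make(atoms, bonds):
    """Hydrogen-suppressed structure: element list plus {atom pair: bond order}."""
    return list(atoms), {frozenset((i, j)): order for i, j, order in bonds}

def is_substructure(query, target):
    """Brute-force substructure test with element and bond-order matching."""
    q_atoms, q_bonds = query
    t_atoms, t_bonds = target
    if len(q_atoms) > len(t_atoms):
        return False
    for mapping in permutations(range(len(t_atoms)), len(q_atoms)):
        if any(q_atoms[i] != t_atoms[mapping[i]] for i in range(len(q_atoms))):
            continue
        if all(t_bonds.get(frozenset(mapping[k] for k in bond)) == order
               for bond, order in q_bonds.items()):
            return True
    return False

def induced(structure, keep):
    """Fragment formed by the atoms in `keep` and all bonds between them."""
    atoms, bonds = structure
    index = {old: new for new, old in enumerate(keep)}
    sub_bonds = {}
    for bond, order in bonds.items():
        a, b = tuple(bond)
        if a in index and b in index:
            sub_bonds[frozenset((index[a], index[b]))] = order
    return [atoms[i] for i in keep], sub_bonds

def is_connected(structure):
    """Depth-first search over bonds, starting from atom 0."""
    atoms, bonds = structure
    if not atoms:
        return False
    seen, stack = {0}, [0]
    while stack:
        current = stack.pop()
        for bond in bonds:
            if current in bond:
                (other,) = bond - {current}
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
    return len(seen) == len(atoms)

def max_common_substructure(a, b):
    """Largest connected fragment of `a` (by atom count) that also occurs in `b`."""
    n = len(a[0])
    for size in range(n, 0, -1):
        for keep in combinations(range(n), size):
            candidate = induced(a, keep)
            if is_connected(candidate) and is_substructure(candidate, b):
                return candidate
    return None

# Two hypothetical hitlist structures (hydrogens omitted).
acetic_acid = make("CCOO", [(0, 1, 1), (1, 2, 2), (1, 3, 1)])
acetone = make("CCCO", [(0, 1, 1), (1, 2, 1), (1, 3, 2)])

mcs_atoms, mcs_bonds = max_common_substructure(acetic_acid, acetone)
```

For this pair the sketch returns the C-C=O fragment. Applying the function to every pair of hitlist structures would give a set of candidate substructures; how such a set is filtered and ranked is not specified here and is left open.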