SUMMARY

A library search system for 13C-n.m.r. spectra, based on a statistical description of the reproducibility of chemical shifts, is presented. A similarity index in the form of a significance probability (P-value) is developed from a previously introduced general concept. The applied data base of some 6000 spectra originates from the NIH-EPA Chemical Information System (CIS). The reproducibility model and the retrieval system are developed on a CDC Cyber 175 computer, with PASCAL as the programming language. The performance of the system is evaluated by using recall/reliability and confusion/recall plots. In a tentative comparison with the Clerc search method (included in the CIS package), the Utrecht 13C-n.m.r. reproducibility-based retrieval system shows a better identification performance. The system is adaptable for use on a microcomputer.

In recent years, several systems have been proposed for computer-aided library search of 13C-n.m.r. spectra, aiming at the identification of organic compounds and based on various coding and comparison algorithms [1-13]. Collections of several thousands of spectra from different sources (i.e., recorded on different instruments, under various experimental conditions) are usually applied as data bases. Such data bases, which are often commercially available, generally suffer from poor interlaboratory reproducibility of the spectral data involved. In a previous paper [14], a new similarity index for "straightforward" library search methods was introduced, primarily for application to this type of data base. The main object of straightforward search methods is to retrieve the reference data (if available) of the unknown compound. This contrasts with "interpretative" methods, which aim principally at retrieving data of compounds similar in structure to the unknown.
The proposed index has the form of a significance probability and is developed from a statistical model of the reproducibility of the quantities used for the comparison of unknown and reference data. Defined in general terms, this index is applicable to different types of analytical data, provided that the variables used to characterize unknown and reference spectra (feature quantities) are continuous in nature. In 13C-n.m.r. spectra, the chemical shift and the peak intensity are such variables, in contrast to multiplicities. When a large multisource data base such as the CIS collection is used, how-