A large set of organic compounds extracted from the CAS Registry is analyzed to study recent changes in structural diversity. The diversity is characterized using the framework content of the compounds; the framework of a molecule is the scaffold consisting of all its ring systems and all the chain fragments connecting them. The compounds are partitioned based on their year of first report in the literature, which allows framework occurrence frequencies to be compared across a 10-year interval. The results are consistent with a process in which frameworks with the greatest frequency of use in the past are the most likely to be used again, but it is also found that the frequency ordering changes over time. These fluctuations in ordering are attributed to stochastic factors, scientific and economic, that can affect how chemical space is explored. Framework diversity is found to have increased over time despite the extensive reuse of a relatively small number of frameworks; this increase is due to the large number of new frameworks. The long tail of the framework distribution, composed of frameworks that occur in few compounds or only one compound, is found to be a large and growing part of framework space.
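The comparison described above can be illustrated with a minimal sketch. It assumes each compound has already been reduced to a framework label and tagged with its year of first report. The toy records, the two time windows, and the function names are all hypothetical; none of them come from the study or the CAS Registry.

```python
from collections import Counter

# Hypothetical toy records: (year of first report, framework label).
# These are illustrative values, not data from the CAS Registry.
RECORDS = [
    (1998, "benzene"), (1998, "benzene"), (1998, "pyridine"),
    (1999, "benzene"), (1999, "indole"),
    (2008, "benzene"), (2008, "pyridine"), (2008, "pyridine"),
    (2009, "pyridine"), (2009, "quinoline"), (2009, "biphenyl"),
]


def framework_counts(records, first_year, last_year):
    """Count compounds per framework for first-report years in [first_year, last_year]."""
    return Counter(fw for year, fw in records if first_year <= year <= last_year)


def ranking(counts):
    """Frameworks ordered by descending frequency; ties broken alphabetically."""
    return [fw for fw, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]


def singleton_fraction(counts):
    """Fraction of distinct frameworks that occur in exactly one compound (the extreme tail)."""
    if not counts:
        return 0.0
    return sum(1 for n in counts.values() if n == 1) / len(counts)


early = framework_counts(RECORDS, 1998, 1999)
late = framework_counts(RECORDS, 2008, 2009)

# Frameworks seen in the later window but not in the earlier one.
new_frameworks = set(late) - set(early)

print("early ranking:", ranking(early))
print("late ranking: ", ranking(late))
print("new frameworks:", sorted(new_frameworks))
print("singleton fraction (early, late):",
      singleton_fraction(early), singleton_fraction(late))
```

In this toy data the most frequent framework changes between the two windows, two frameworks appear only in the later window, and the share of single-compound frameworks rises. This mirrors the kinds of quantities the abstract discusses but says nothing about the actual results.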