We present ABCD, an integrated drug discovery informatics platform developed at Johnson & Johnson Pharmaceutical Research & Development, L.L.C. ABCD is an attempt to bridge multiple continents, data systems, and cultures using modern information technology and to provide scientists with tools that allow them to analyze multifactorial SAR and make informed, data-driven decisions. The system consists of three major components: (1) a data warehouse, which combines data from multiple chemical and pharmacological transactional databases and is designed for supreme query performance; (2) a state-of-the-art application suite, which facilitates data upload, retrieval, mining, and reporting; and (3) a workspace, which facilitates collaboration and data sharing by allowing users to share queries, templates, results, and reports across project teams, campuses, and other organizational units. Chemical intelligence, performance, and analytical sophistication lie at the heart of the new system, which was developed entirely in-house. ABCD is used routinely by more than 1000 scientists around the world and is rapidly expanding into other functional areas within the J&J organization.
Efficient substructure searching is a key requirement for any chemical information management system. In this paper, we describe the substructure search capabilities of ABCD, an integrated drug discovery informatics platform developed at Johnson & Johnson Pharmaceutical Research & Development, L.L.C. The solution consists of several algorithmic components: 1) a pattern mapping algorithm for solving the subgraph isomorphism problem, 2) an indexing scheme that enables very fast substructure searches on large structure files, 3) the incorporation of that indexing scheme into an Oracle cartridge to enable querying large relational databases through SQL, and 4) a cost estimation scheme that allows the Oracle cost-based optimizer to generate a good execution plan when a substructure search is combined with additional constraints in a single SQL query. The algorithm was tested on a public database comprising nearly 1 million molecules using 4,629 substructure queries, the vast majority of which were submitted by discovery scientists over the last 2.5 years of user acceptance testing of ABCD. 80.7% of these queries were completed in less than a second and 96.8% in less than ten seconds on a single CPU, while on eight processing cores these numbers increased to 93.2% and 99.7%, respectively. The slower queries involved extremely generic patterns that returned the entire database as screening hits and required extensive atom-by-atom verification.
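The abstract above notes that slow queries were those whose screening step returned the whole database, leaving everything to atom-by-atom verification. The following is a minimal Python sketch of that general screen-then-verify pattern, not the ABCD indexing scheme or pattern mapping algorithm, whose details are not given here. The `Mol` class, the feature-count screen, and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter


class Mol:
    """Toy molecular graph (hypothetical): element symbols plus (i, j, bond_order) bonds."""

    def __init__(self, atoms, bonds):
        self.atoms = atoms
        self.adj = {i: {} for i in range(len(atoms))}
        for a, b, order in bonds:
            self.adj[a][b] = order
            self.adj[b][a] = order


def screen_key(mol):
    """Count elements and (element, order, element) bond types.

    Assumption: a simple count-based screen stands in for a real index.
    """
    key = Counter(mol.atoms)
    for a, nbrs in mol.adj.items():
        for b, order in nbrs.items():
            if a < b:  # count each bond once
                x, y = sorted((mol.atoms[a], mol.atoms[b]))
                key[(x, order, y)] += 1
    return key


def passes_screen(query_key, target_key):
    """Necessary (not sufficient) condition: target has at least the query's features."""
    return all(target_key[f] >= n for f, n in query_key.items())


def is_substructure(query, target):
    """Atom-by-atom verification by backtracking over candidate atom mappings."""
    mapping, used = {}, set()

    def extend(q):
        if q == len(query.atoms):
            return True
        for t in range(len(target.atoms)):
            if t in used or target.atoms[t] != query.atoms[q]:
                continue
            # Every already-mapped neighbour of q must be bonded to t with the same order.
            if all(target.adj[t].get(mapping[n]) == order
                   for n, order in query.adj[q].items() if n in mapping):
                mapping[q] = t
                used.add(t)
                if extend(q + 1):
                    return True
                del mapping[q]
                used.discard(t)
        return False

    return extend(0)


def build_index(mols):
    """Precompute the screening key for every molecule in the collection."""
    return {name: (mol, screen_key(mol)) for name, mol in mols.items()}


def substructure_search(query, index):
    """Return (verified hits, number of molecules that survived the screen)."""
    qkey = screen_key(query)
    candidates = [name for name, (_, key) in index.items() if passes_screen(qkey, key)]
    hits = [name for name in candidates if is_substructure(query, index[name][0])]
    return hits, len(candidates)


# Heavy-atom graphs only; hydrogens are omitted for brevity.
index = build_index({
    "ethanol": Mol(["C", "C", "O"], [(0, 1, 1), (1, 2, 1)]),
    "acetic_acid": Mol(["C", "C", "O", "O"], [(0, 1, 1), (1, 2, 2), (1, 3, 1)]),
    "ethylene_glycol": Mol(["O", "C", "C", "O"], [(0, 1, 1), (1, 2, 1), (2, 3, 1)]),
})

carbonyl = Mol(["C", "O"], [(0, 1, 2)])                   # C=O
geminal_diol = Mol(["O", "C", "O"], [(0, 1, 1), (1, 2, 1)])  # O-C-O

carbonyl_hits, carbonyl_screened = substructure_search(carbonyl, index)
diol_hits, diol_screened = substructure_search(geminal_diol, index)
```

In this toy data set the carbonyl query is resolved almost entirely by the screen, while the O-C-O query lets ethylene glycol through the screen and rejects it only during verification. The more generic the pattern, the more candidates survive screening and the more work falls to the backtracking step, which is consistent with the behaviour the abstract reports for its slowest queries.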
Drug discovery is a highly complex process requiring scientists from wide-ranging disciplines to work together in a well-coordinated and streamlined fashion. While the process can be compartmentalized into well-defined functional domains, the success of the entire enterprise rests on the ability to exchange data conveniently between these domains, and integrate it in meaningful ways to support the design, execution and interpretation of experiments aimed at optimizing the efficacy and safety of new drugs. This, in turn, requires information management systems that can support many different types of scientific technologies generating data of imposing complexity, diversity and volume. Here, we describe the key components of Advanced Biological and Chemical Discovery (ABCD), a software platform designed at Johnson & Johnson to bring coherence to the way discovery data is collected, annotated, organized, integrated, mined and visualized. Unlike the Gordian knot of one-off solutions that one typically encounters in the pharmaceutical industry, each built to serve a single purpose for a single set of users, we sought to develop a framework that could be extended and leveraged across different application domains, and offer a consistent user experience marked by superior performance and usability. In this work, several major components of ABCD are highlighted, ranging from operational subsystems for managing reagents, reactions, compounds, and assays, to advanced data mining and visualization tools for SAR analysis and interpretation. All these capabilities are delivered through a common application front-end called Third Dimension Explorer (3DX), a modular, multifunctional and extensible platform designed to be the "Swiss-army knife" of the discovery scientist.