Reaction-based de novo design is the computational generation of novel molecular structures by linking building blocks using reaction vectors derived from chemistry knowledge. In this work, we first adopted a recurrent neural network (RNN) model to generate three groups of building blocks with different functional groups and then constructed an in silico target-focused combinatorial library based on chemical reaction rules. Mer tyrosine kinase (MERTK) was used as a case study. Combined with a scaffold enrichment analysis, this approach yielded 15 novel MERTK inhibitors covering four scaffolds. Among them, compound 5a showed an IC50 value of 53.4 nM against MERTK without any further optimization. The efficiency of hit identification could be significantly improved by shrinking the compound library with the fragment iterative optimization strategy and enriching the dominant scaffold in the hinge region. We hope that this strategy can provide new insights for accelerating the drug discovery process.
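The rule-based linking of three building-block groups can be pictured with a minimal sketch. Everything here is an illustrative assumption, not the authors' implementation: the block names, the tuple layout, and the two reaction rules (amide coupling and Suzuki coupling) stand in for whatever reaction vectors the study actually used, and no real chemistry toolkit is involved.

```python
from itertools import product

# Hypothetical reaction rules: an unordered pair of functional groups
# maps to the name of the reaction that couples them.
RULES = {
    frozenset({"carboxylic_acid", "amine"}): "amide coupling",
    frozenset({"aryl_halide", "boronic_acid"}): "Suzuki coupling",
}


def reaction_for(handle_a, handle_b):
    """Return the reaction name coupling two functional groups, or None."""
    return RULES.get(frozenset({handle_a, handle_b}))


def enumerate_library(heads, cores, tails):
    """Enumerate head-core-tail combinations allowed by the rules.

    heads and tails are (name, handle) tuples; cores are
    (name, head_side_handle, tail_side_handle) tuples.
    """
    library = []
    for head, core, tail in product(heads, cores, tails):
        first = reaction_for(head[1], core[1])
        second = reaction_for(core[2], tail[1])
        if first and second:
            library.append((head[0], core[0], tail[0], first, second))
    return library


# Hypothetical building blocks, one list per group.
heads = [("head_1", "carboxylic_acid"), ("head_2", "boronic_acid")]
cores = [("core_1", "amine", "aryl_halide")]
tails = [("tail_1", "boronic_acid"), ("tail_2", "amine")]

library = enumerate_library(heads, cores, tails)
```

Of the four possible head-core-tail combinations, only the one whose handles match a rule on both sides of the core survives, which is the sense in which reaction rules shrink a combinatorial library.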
Protein kinases are important drug targets for the treatment of several diseases. The interaction between kinases and ligands is vital in the process of small-molecule kinase inhibitor (SMKI) design. In this study, we propose a method to extract fragments and amino acid residues from crystal structures for kinase–ligand interactions. In addition, core fragments that interact with the important hinge region of kinases were extracted along with their decorations. Based on the superimposed structural data of kinases from the kinase–ligand interaction fingerprint and structure database, we obtained two libraries: a hinge-unfocused fragment–amino acid pair library (FAP Lib) that contains 6672 pairs of fragments and corresponding amino acids, and a hinge-focused hinge binder library (HB Lib) of 3560 pairs of hinge-binding scaffolds with their corresponding decorations. These two libraries constitute a kinase-focused interaction database (KID). An in-depth analysis was conducted on KID to explore important characteristics of fragments in the design of SMKIs. With KID, we built two kinase-focused molecule databases: Recomb_DB, which contains 172,346 molecules generated through fragment recombination based on the FAP Lib, and RsdHB_DB, which contains 93,030 molecules generated from our HB Lib using molecular generation methods. Compared with five commercial and non-commercial databases, these two databases both ranked in the top 3 for scaffold diversity and the top 4 for molecular fingerprint diversity, and they are more focused on the chemical space of kinase inhibitors. Hence, KID presents a useful addition to existing databases for the exploration of novel SMKIs.
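The abstract does not say how fingerprint diversity was measured. One common choice, shown here purely as an assumption, is the mean pairwise Tanimoto distance over a set of fingerprints. In this sketch each fingerprint is a plain Python set of on-bit indices, and the three example fingerprints are made up.

```python
from itertools import combinations


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    union = fp_a | fp_b
    if not union:
        # Convention chosen for this sketch: two empty fingerprints are identical.
        return 1.0
    return len(fp_a & fp_b) / len(union)


def mean_pairwise_diversity(fingerprints):
    """Average Tanimoto distance (1 - similarity) over all unordered pairs."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        return 0.0
    return sum(1.0 - tanimoto(a, b) for a, b in pairs) / len(pairs)


# Hypothetical fingerprints for three molecules.
fingerprints = [{1, 2, 3}, {2, 3, 4}, {7, 8}]
diversity = mean_pairwise_diversity(fingerprints)
```

A value near 1 means the molecules share few bits, and a value near 0 means the set is dominated by close analogues. For large databases the quadratic number of pairs usually makes sampling necessary.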
The cyclin-dependent protein kinases (CDKs) are protein-serine/threonine kinases with crucial effects on the regulation of the cell cycle and transcription. CDK dysregulation can be a hallmark of cancer, since excessive CDK expression can lead to abnormal cell proliferation. However, most CDK inhibitors developed so far are insufficiently selective, which has hindered their therapeutic use. In this study, we propose a multitask deep learning framework called BiLAT, based on SMILES representation, for predicting the inhibitory activity of molecules on eight CDK subtypes (CDK1, 2, 4–9). The framework is mainly composed of an improved bidirectional long short-term memory (BiLSTM) module and the encoder layer of the Transformer framework. Additionally, SMILES enumeration is applied as a data augmentation method to improve the performance of the model. Compared with baseline predictive models based on three conventional machine learning methods and two multitask deep learning algorithms, BiLAT achieves the best performance on the test set, with the highest average AUC, ACC, F1-score, and MCC values of 0.938, 0.894, 0.911, and 0.715, respectively. Moreover, we constructed a targeted external data set for the CDK family, CDK-Dec, which mainly contains decoy compounds screened by 3D similarity with active compounds. This data set was used in the subsequent evaluation of our model. It is worth mentioning that the BiLAT model is interpretable and can be used by chemists to design and synthesize compounds with improved activity. To further verify the generalization ability of the multitask BiLAT model, we also evaluated it on three public data sets (Tox21, ClinTox, and SIDER). Compared with several currently popular models, BiLAT shows the best performance on two of the three data sets. These results indicate that BiLAT is an effective tool for accelerating drug discovery.
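For readers less familiar with the reported metrics, the sketch below computes ACC, F1-score, and MCC from binary predictions using their standard definitions. The label vectors are invented for illustration and have no connection to the BiLAT results. AUC is omitted because it requires ranked scores rather than hard labels.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels, 1 = active, 0 = inactive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(tp, tn, fp, fn):
    """Harmonic mean of precision and recall; 0.0 if undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 if any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical labels and predictions for a single task.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]

counts = confusion_counts(y_true, y_pred)
acc = accuracy(*counts)
f1 = f1_score(*counts)
mcc_value = mcc(*counts)
```

In a multitask setting, these functions would be applied to each task separately and the per-task values averaged. MCC is often the most informative of the three when actives and inactives are imbalanced, because it uses all four cells of the confusion matrix.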
Improving screening efficiency is one of the most challenging tasks of virtual screening (VS). In this work, we propose an effective target-focused scoring criterion for VS and apply it to the screening of a target-specific scaffold replacement library constructed by enumerating suitable substitution fragments and R-groups of known ligands. This criterion, FSDscore, combines ligand- and structure-based scoring methods, including feature maps, 3D shape similarity, and pairwise distance information between proteins and ligands. Owing to the combined advantages of ligand- and structure-based approaches, FSDscore performs far better on the validation data set than other scoring methods. We apply FSDscore to the VS of two different kinase targets, MERTK (Mer tyrosine kinase) and ABL1 (tyrosine-protein kinase ABL1), so that the results do not rest on a single, possibly coincidental, case. Finally, a VS case study shows the potential and effectiveness of our scoring criterion in drug discovery, and molecular dynamics simulation further supports its effectiveness.