The past decade has seen a surge in the application of data science, machine learning, deep learning, and AI methods to drug discovery. The presented work combines a variety of AI methods for drug discovery with in silico techniques to provide a holistic tool for automated drug discovery. To identify drug candidates for a particular drug target of interest, the user provides the tool with a target signature in the form of an amino acid sequence or its corresponding nucleotide sequence. The tool collects the data registered on PubChem that is required to perform an automated QSAR and, with the validated QSAR model, carries out prediction and drug lead generation. We call this protocol Target2Drug. It is followed by a protocol we call Target2DeNovoDrug, in which novel molecules with likely activity against the target are generated de novo using a generative LSTM model. Drug discovery often requires that the generated molecules possess certain properties, such as drug-likeness; to optimize the de novo molecules toward the required drug-like property we use a deep learning model called DeepFMPO, and we call this protocol Target2DeNovoDrugPropMax. Next, fast automated AutoDock-Vina based in silico modeling and profiling of the interaction between the optimized drug leads and the drug target is performed. Finally, a Molecular Dynamics protocol is executed automatically for the complex with the best protein-ligand interaction identified in the AutoDock-Vina based virtual screening. The results are stored in the working folder of the user. The code is maintained, supported, and provided for use in a GitHub repository.
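The step that hands the best docked complex to Molecular Dynamics can be illustrated with a minimal sketch. It assumes that "best protein-ligand interaction" means the lowest AutoDock Vina affinity (in kcal/mol), read from the `REMARK VINA RESULT:` lines that Vina writes into its output PDBQT files; the abstract does not state the actual selection criterion. The function names and the two example outputs are hypothetical and synthetic, not taken from the tool.

```python
from typing import Dict, Optional


def best_affinity(pdbqt_text: str) -> Optional[float]:
    """Return the lowest (most favourable) Vina affinity in kcal/mol, or None."""
    scores = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # Fields after the tag: affinity, RMSD lower bound, RMSD upper bound.
            scores.append(float(line.split()[3]))
    return min(scores) if scores else None


def pick_best_complex(outputs: Dict[str, str]) -> Optional[str]:
    """Return the ligand name whose docking output has the lowest affinity."""
    scored = {name: best_affinity(text) for name, text in outputs.items()}
    scored = {name: s for name, s in scored.items() if s is not None}
    if not scored:
        return None
    return min(scored, key=scored.get)


# Synthetic docking outputs, for illustration only (assumed ligand names).
example_outputs = {
    "lead_A": (
        "MODEL 1\nREMARK VINA RESULT:      -7.2      0.000      0.000\nENDMDL\n"
        "MODEL 2\nREMARK VINA RESULT:      -6.8      1.912      2.745\nENDMDL\n"
    ),
    "lead_B": "MODEL 1\nREMARK VINA RESULT:      -8.9      0.000      0.000\nENDMDL\n",
}

print(pick_best_complex(example_outputs))  # lead_B
```

In a real run the dictionary would be filled from the PDBQT files written by each docking job, and the selected complex would then be passed to whatever Molecular Dynamics setup the tool uses.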
<p>Network data is composed of nodes and edges. Machine learning and deep learning algorithms have been applied successfully to network data for node classification and link prediction in the area of social networks, where they are used to offer highly customized suggestions to users. Similarly, one can apply machine learning and deep learning algorithms to biological network data to generate scientifically useful predictions. In the present work, a compound-drug target interaction data set from BindingDB has been used to train machine learning and deep learning algorithms, which are then used to predict the drug targets for any PubChem compound queried by the user. The user inputs the PubChem Compound ID (CID) of the compound whose biological activity is to be predicted, and the tool outputs the RCSB PDB IDs of the predicted drug targets. The tool also incorporates a feature to perform automated <i>In Silico</i> modelling of the compound and the predicted drug targets to uncover their protein-ligand interaction profiles. The program fetches the structures of the compound and the predicted drug targets, prepares them for molecular docking using the standard AutoDock scripts that are part of MGLTools, performs molecular docking and protein-ligand interaction profiling of the targets and the compound, and stores the visualized results in the working folder of the user. The program is hosted, supported and maintained at the following GitHub repository </p> <p><a href="https://github.com/bengeof/Compound2Drug">https://github.com/bengeof/Compound2Drug</a></p>
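The structure-fetching step described above can be sketched as a pair of URL builders. This is only an illustration under assumptions: it uses the public PubChem PUG REST endpoint for compound records and the RCSB file download service, which may or may not be what Compound2Drug itself calls, and the function names are hypothetical. The sketch builds the URLs without making any network request, and the URL patterns should be checked against the current PubChem and RCSB documentation before use.

```python
PUBCHEM_PUG = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"
RCSB_DOWNLOAD = "https://files.rcsb.org/download"


def pubchem_sdf_url(cid: int, three_d: bool = True) -> str:
    """Build the PUG REST URL for the SDF record of a PubChem CID."""
    if cid <= 0:
        raise ValueError("PubChem CIDs are positive integers")
    url = f"{PUBCHEM_PUG}/compound/cid/{cid}/SDF"
    # record_type=3d requests a computed 3D conformer instead of the 2D record.
    return url + "?record_type=3d" if three_d else url


def rcsb_pdb_url(pdb_id: str) -> str:
    """Build the download URL for a PDB-format entry (assumes a 4-character ID)."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError("expected a 4-character alphanumeric PDB ID")
    return f"{RCSB_DOWNLOAD}/{pdb_id}.pdb"


# CID 2244 and PDB ID 4HHB are arbitrary well-known entries used as examples.
print(pubchem_sdf_url(2244))
print(rcsb_pdb_url("4hhb"))
```

The downloaded SDF and PDB files would still need conversion to PDBQT (for example with the AutoDock preparation scripts mentioned above) before docking.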
Our work consists of a Python program that programmatically mines PubChem to collect the data needed by a machine learning based AutoQSAR algorithm, which generates drug leads for the flaviviruses Dengue and West Nile. The drug leads generated by the program are fed as programmatic inputs to the AutoDock Vina package for automated in silico modelling of the interaction between those compounds and the chosen Dengue and West Nile drug target, methyltransferase, whose inhibition leads to the control of viral replication. The machine learning based AutoQSAR algorithm involves feature selection, QSAR modelling, validation and prediction. An important dynamic feature of the program is that the drug leads generated on each run reflect the constantly growing PubChem database, which facilitates fast and dynamic drug lead generation against the West Nile and Dengue viruses. The program prints out the top drug leads after screening the PubChem library, which is over a billion compounds. The interaction between the top drug lead compounds and the drug targets of West Nile and Dengue virus was modelled in an automated way through programmatic commands. Thus our program ushers in a new age of automatic ease in virtual drug screening and drug identification through programmatic data mining of chemical data libraries, drug lead generation through a machine learning based AutoQSAR algorithm, and automated in silico modelling run through the program to study the interaction between the drug lead compounds and the drug target protein of West Nile and Dengue virus.
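The four AutoQSAR stages named above (feature selection, QSAR modelling, validation, prediction) can be illustrated with a deliberately small sketch. It is not the authors' algorithm: the abstract does not say which descriptors, model, or validation scheme are used, so this example substitutes correlation-based selection of a single descriptor, a least-squares line, and a leave-one-out q² as stand-ins. All function names, descriptor values, activities, and candidate names are hypothetical and synthetic.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return 0.0 if sxx == 0 or syy == 0 else sxy / sqrt(sxx * syy)


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """QSAR modelling stand-in: ordinary least squares, one descriptor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def select_descriptor(rows: List[List[float]], ys: Sequence[float]) -> int:
    """Feature selection: column with the largest |correlation| to activity."""
    return max(range(len(rows[0])),
               key=lambda j: abs(pearson([r[j] for r in rows], ys)))


def loo_q2(xs: List[float], ys: List[float]) -> float:
    """Validation: leave-one-out q^2 = 1 - PRESS / total sum of squares."""
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    my = sum(ys) / len(ys)
    return 1.0 - press / sum((y - my) ** 2 for y in ys)


def rank_candidates(cands: Dict[str, List[float]], col: int,
                    slope: float, intercept: float) -> List[str]:
    """Prediction: order candidates by predicted activity, highest first."""
    return sorted(cands, key=lambda c: slope * cands[c][col] + intercept,
                  reverse=True)


# Synthetic training set: two descriptors per compound, one activity value.
train_X = [[1.0, 5.0], [2.0, 3.0], [3.0, 6.0], [4.0, 2.0], [5.0, 4.0]]
train_y = [3.0, 5.0, 7.0, 9.0, 11.0]

col = select_descriptor(train_X, train_y)
xs = [row[col] for row in train_X]
q2 = loo_q2(xs, train_y)
slope, intercept = fit_line(xs, train_y)

candidates = {"cand_1": [2.5, 1.0], "cand_2": [6.0, 4.0], "cand_3": [0.5, 9.0]}
top = rank_candidates(candidates, col, slope, intercept)
print(col, round(q2, 3), top)
```

For brevity the descriptor is selected once on the full training set; a stricter validation would repeat the selection inside each leave-one-out fold to avoid information leaking from the held-out compound.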