Environmental chemicals may affect endocrine systems through multiple mechanisms, one of which is via effects on aromatase (also known as CYP19A1), an enzyme critical for maintaining the normal balance of estrogens and androgens in the body. Therefore, rapid and efficient identification of aromatase-related endocrine disrupting chemicals (EDCs) is important for toxicology and environmental risk assessment. In this study, on the basis of the Tox21 10K compound library, in silico classification models for predicting aromatase binders/nonbinders were constructed by machine learning methods. To improve the prediction ability of the models, a combined classifier (CC) strategy that combines different independent machine learning methods was adopted. Model performance was evaluated on test and external validation sets containing 1336 and 216 chemicals, respectively. The best model was obtained with the MACCS (Molecular Access System) fingerprint and CC method, which exhibited an accuracy of 0.84 for the test set and 0.91 for the external validation set. Additionally, several representative substructures for characterizing aromatase binders, such as ketone, lactone, and nitrogen-containing derivatives, were identified using information gain and substructure frequency analysis. Our study provided a systematic assessment of chemicals binding to aromatase. The built models can be helpful to rapidly identify potential EDCs targeting aromatase.
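The two ideas named in this abstract, combining independent classifiers and ranking substructures by information gain, can be sketched in a few lines. This is only an illustration: the abstract does not say how the CC strategy merges the individual models, so the majority vote below is an assumption, and the function names and toy inputs are hypothetical.

```python
import math
from typing import List, Sequence


def entropy(labels: Sequence[int]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    h = 0.0
    for cls in set(labels):
        p = sum(1 for y in labels if y == cls) / n
        h -= p * math.log2(p)
    return h


def information_gain(has_substructure: Sequence[int], labels: Sequence[int]) -> float:
    """Reduction in label entropy after splitting on one binary fingerprint bit."""
    n = len(labels)
    present = [y for f, y in zip(has_substructure, labels) if f]
    absent = [y for f, y in zip(has_substructure, labels) if not f]
    conditional = (len(present) / n) * entropy(present) + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


def majority_vote(predictions: List[Sequence[int]]) -> List[int]:
    """Combine 0/1 predictions from several models (one row per model).

    Assumption: a compound is called a binder (1) when more than half of the
    models say so. The original CC strategy may use a different rule.
    """
    n_models = len(predictions)
    return [1 if sum(votes) * 2 > n_models else 0 for votes in zip(*predictions)]
```

With an odd number of models the vote never ties; with an even number, a tie falls to the nonbinder class under this rule.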
Structure-based prediction of sites of metabolism (SOMs) mediated by cytochrome P450s (CYPs) is of great interest in drug discovery and development. However, protein flexibility and active site water molecules remain a challenge for accurate SOM prediction. CYP2C19 is one of the major drug-metabolizing enzymes and has attracted considerable attention because of its polymorphism and its ability to metabolize ∼7% of clinically used drugs. In this study, we systematically evaluated the effects of protein flexibility and active site water molecules on SOM prediction for CYP2C19 substrates. Multiple conformational sampling techniques, including GOLD flexible residues sampling, molecular dynamics (MD), and tCONCOORD side-chain sampling, were adopted for assessing the influence of protein flexibility on SOM prediction. The prediction accuracy could be significantly improved when protein flexibility was considered using the tCONCOORD sampling method, which indicated that the side-chain conformation was important for accurate prediction. However, the inclusion of the crystallographic or MD-derived water molecule(s) did not necessarily improve the prediction accuracy. Finally, a combination of docking results with SMARTCyp was found to be able to increase the SOM prediction accuracy.
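One simple way to picture combining docking results with a reactivity score is a weighted sum of normalized per-atom values. The sketch below is not the scheme used in the study, which the abstract does not describe. It assumes each candidate atom has a docking-derived distance to the heme iron and a reactivity score for which lower means more likely to be metabolized. The atom labels, the equal weighting, and the function names are hypothetical.

```python
from typing import Dict, List


def min_max(values: Dict[str, float]) -> Dict[str, float]:
    """Rescale values to [0, 1]; all-equal inputs map to 0."""
    lo, hi = min(values.values()), max(values.values())
    span = hi - lo
    return {k: 0.0 if span == 0 else (v - lo) / span for k, v in values.items()}


def rank_soms(fe_distance: Dict[str, float],
              reactivity_score: Dict[str, float],
              weight: float = 0.5) -> List[str]:
    """Rank atoms from most to least likely site of metabolism.

    Assumptions: a smaller iron distance and a smaller reactivity score both
    indicate a more likely site; `weight` balances the two terms.
    """
    d = min_max(fe_distance)
    r = min_max(reactivity_score)
    combined = {atom: weight * d[atom] + (1.0 - weight) * r[atom]
                for atom in d if atom in r}
    return sorted(combined, key=combined.get)


# Hypothetical per-atom values for a single docked substrate pose.
distances = {"C1": 3.5, "C2": 6.0, "N3": 4.0, "C4": 8.0}
scores = {"C1": 60.0, "C2": 40.0, "N3": 45.0, "C4": 80.0}
ranking = rank_soms(distances, scores)
```

In this toy case the atom that is neither closest nor most reactive, but good on both criteria, ranks first, which is the behavior a combined score is meant to capture.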
Human cytochrome P450 3A4 (CYP3A4) is a major drug-metabolizing enzyme responsible for the metabolism of ∼50% of clinically used drugs and is often involved in drug-drug interactions. It exhibits atypical binding and kinetic behavior toward many ligands. Binding of ligands to CYP3A4 is a complex process. Recent studies from both crystallography and biochemistry suggested the existence of a peripheral ligand-binding site at the enzyme surface. However, the stability of the ligand bound at this peripheral site and the possibility of discovering new CYP3A4 ligands based on this site remain unclear. In this study, we employed a combination of molecular docking, multiparalleled molecular dynamics (MD) simulations, virtual screening, and experimental bioassay to investigate these issues. Our results revealed that the binding mode of progesterone (PGS), a substrate of CYP3A4, in the crystal structure was not stable and underwent a significant conformational change. Through Glide docking and MD refinement, it was found that PGS was able to stably bind at the peripheral site via contacts with Phe215, Phe219, Phe220, and Asp214. On the basis of the refined peripheral site, virtual screening was then performed against the Enamine database. A total of three compounds were finally found to have inhibitory activity against CYP3A4 in both human liver microsome and recombinant human CYP3A4 enzyme assays, one of which showed potent inhibitory activity with an IC50 value lower than 1 μM and two of which exhibited moderate inhibitory activity with IC50 values lower than 10 μM. The findings not only presented the dynamic behavior of PGS at the peripheral site but also demonstrated the first indication of discovering CYP3A4 inhibitors based on the peripheral site.
Cytochrome P450 2C19 (CYP2C19) is one of 57 drug-metabolizing enzymes in humans and is responsible for the metabolism of ∼7-10% of drugs in clinical use. Recently, omeprazole-based analogues were reported to be potent inhibitors of CYP2C19 and have the potential to be used as tool compounds for studying the substrate selectivity of CYP2C19. However, the binding modes of these compounds with CYP2C19 remain to be elucidated. In this study, a combination of molecular docking, molecular dynamics (MD), and MM/GBSA calculations was employed to systematically investigate the interactions between these compounds and CYP2C19. The binding modes of these analogues were analyzed in detail. The results indicated that the inclusion of explicit active site water molecules could improve binding energy prediction when the water molecules formed a hydrogen bonding network between the ligand and protein. We also found that the effect of active site water molecules on binding free energy prediction was dependent on the ligand binding modes. Our results unravel the interactions of these omeprazole-based analogues with CYP2C19 and might be helpful for the future design of potent CYP2C19 inhibitors with improved metabolic properties.
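For orientation, the MM/GBSA estimate mentioned above is usually written as the following decomposition, averaged over MD snapshots. This is the textbook form, not a formula taken from the study. The abstract does not state whether the entropy term was included, or whether explicit water molecules were treated as part of the receptor, so both are left open here.

```latex
\Delta G_{\text{bind}} \approx
  \langle G_{\text{complex}} \rangle
  - \langle G_{\text{receptor}} \rangle
  - \langle G_{\text{ligand}} \rangle,
\qquad
G = E_{\text{MM}} + G_{\text{GB}} + G_{\text{SA}} - T S
```

Here $E_{\text{MM}}$ is the gas-phase molecular mechanics energy, $G_{\text{GB}}$ the generalized Born polar solvation term, $G_{\text{SA}}$ the nonpolar surface-area term, and $-TS$ the optional entropic contribution.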
Predicting protein function is a longstanding challenge that has significant scientific implications. The success of amino acid sequence-based learning methods depends on the relationship between sequence, structure, and function. However, recent advances in AlphaFold have led to highly accurate protein structure data becoming more readily available, prompting a fundamental question: given sufficient experimental and predicted structures, should we use structure-based learning methods instead of sequence-based learning methods for predicting protein function, given the intuition that a protein's structure has a closer relationship to its function than its amino acid sequence? To answer this question, we explore several key factors that affect function prediction accuracy. Firstly, we learn protein representations using state-of-the-art graph neural networks (GNNs) and compare graph construction (GC) methods at the residue and atomic levels. Secondly, we investigate whether protein structures generated by AlphaFold are as effective as experimental structures for function prediction when protein graphs are used as input. Finally, we compare the accuracy of sequence-only, structure-only, and sequence-structure fusion-based learning methods for predicting protein function. Additionally, we make several observations, provide useful tips, and share code and datasets to encourage further research and enhance reproducibility.
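As a concrete picture of residue-level graph construction, a common approach is to treat each residue as a node and connect residues whose alpha-carbon atoms lie within a distance cutoff. The sketch below assumes that approach. The abstract does not specify the construction rules it compares, and the 8 Å cutoff, the function name, and the toy coordinates are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def build_residue_graph(ca_coords: Sequence[Coord], cutoff: float = 8.0) -> List[Tuple[int, int]]:
    """Return undirected edges (i, j), i < j, between residues whose
    alpha-carbon atoms are within `cutoff` angstroms of each other.

    Assumption: one node per residue, positioned at its alpha carbon.
    """
    edges = []
    for i in range(len(ca_coords)):
        for j in range(i + 1, len(ca_coords)):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                edges.append((i, j))
    return edges


# Toy chain: three residues spaced 3.8 angstroms apart, plus one distant residue.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
edges = build_residue_graph(coords)
```

An atomic-level graph follows the same pattern with one node per atom and a shorter cutoff. The pairwise loop is quadratic in the number of nodes, so large structures would normally use a spatial index instead.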