Identification of B-cell epitopes (BCEs) is a fundamental step in epitope-based vaccine development, antibody production, and disease prevention and diagnosis. Given the avalanche of protein sequence data generated in the postgenomic age, automated computational methods are essential for fast and accurate identification of novel BCEs among the vast number of candidate proteins and peptides. Although several computational methods have been developed, their accuracy is unreliable. Thus, developing a reliable model with significant prediction improvements is highly desirable. In this study, we first constructed a non-redundant data set of 5,550 experimentally validated BCEs and 6,893 non-BCEs from the Immune Epitope Database. We then developed a novel ensemble learning framework for improved linear BCE prediction, called iBCE-EL, which fuses two independent predictors: an extremely randomized tree (ERT) classifier that uses a combination of physicochemical properties (PCP) and amino acid composition as input features, and a gradient boosting (GB) classifier that uses a combination of dipeptide composition and PCP. Cross-validation analysis on a benchmarking data set showed that iBCE-EL performed better than the individual classifiers (ERT and GB), with a Matthews correlation coefficient (MCC) of 0.454. Furthermore, we evaluated the performance of iBCE-EL on an independent data set, where iBCE-EL, with an MCC of 0.463, significantly outperformed the state-of-the-art method. To the best of our knowledge, iBCE-EL is the first ensemble method for linear BCE prediction. iBCE-EL is available as a web-based platform with two prediction modes. The first classifies peptide sequences as BCEs or non-BCEs, while the second gives users the option of mining potential BCEs from protein sequences.
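The sequence-derived features and the evaluation metric named in the abstract above can be illustrated with a minimal Python sketch. It computes amino acid composition (20 values), dipeptide composition (400 values), and the Matthews correlation coefficient from confusion-matrix counts. The function names are hypothetical, and the `fuse` helper assumes a simple weighted average of the two classifiers' probabilities; the abstract does not say how iBCE-EL combines ERT and GB, so that step is an illustration only, not the published method.

```python
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def amino_acid_composition(peptide):
    """Fraction of each standard residue in the peptide (20 values)."""
    n = len(peptide)
    if n == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [peptide.count(a) / n for a in AMINO_ACIDS]

def dipeptide_composition(peptide):
    """Fraction of each ordered pair of adjacent residues (400 values)."""
    pairs = [peptide[i:i + 2] for i in range(len(peptide) - 1)]
    total = len(pairs)
    keys = [a + b for a, b in product(AMINO_ACIDS, repeat=2)]
    if total == 0:
        return [0.0] * len(keys)
    return [pairs.count(k) / total for k in keys]

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def fuse(p_ert, p_gb, weight=0.5, threshold=0.5):
    """ASSUMPTION: weighted average of two probabilities, then a cutoff.

    The source does not specify the fusion rule used by iBCE-EL.
    """
    score = weight * p_ert + (1.0 - weight) * p_gb
    return score, score >= threshold
```

Physicochemical property features are omitted here because the abstract does not list which properties were used.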
Comprehensive characterization of ligand-binding sites is invaluable for inferring molecular functions of hypothetical proteins, tracing evolutionary relationships between proteins, engineering enzymes to achieve a desired substrate specificity, and developing drugs with improved selectivity profiles. These research efforts pose significant challenges because similar pockets are commonly observed across different folds, leading to a high degree of promiscuity of ligand-protein interactions at the system level. On that account, novel algorithms to accurately classify binding sites are needed. Deep learning is attracting significant attention due to its successful applications in a wide range of disciplines. In this communication, we present DeepDrug3D, a new approach to characterize and classify binding pockets in proteins with deep learning. It employs a state-of-the-art convolutional neural network in which biomolecular structures are represented as voxels assigned interaction energy-based attributes. The current implementation of DeepDrug3D, trained to detect and classify nucleotide- and heme-binding sites, not only achieves a high accuracy of 95%, but also generalizes to unseen data, as demonstrated for steroid-binding proteins and peptidase enzymes. Interestingly, the analysis of strongly discriminative regions of binding pockets reveals that this high classification accuracy arises from learning the patterns of specific molecular interactions, such as hydrogen bonds and aromatic and hydrophobic contacts. DeepDrug3D is available as an open-source program at https://github.com/pulimeng/DeepDrug3D with the accompanying TOUGH-C1 benchmarking dataset accessible from https://osf.io/enz69/.
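The voxel representation described in the abstract above can be illustrated with a minimal Python sketch that bins points carrying a scalar attribute into a cubic grid centered on a pocket. The function name, the 32 Å box, the 1 Å resolution, and the rule of summing values that fall in the same voxel are all assumptions made for illustration; the abstract does not give DeepDrug3D's grid parameters or how its interaction energies are computed, and the real implementation may differ.

```python
def voxelize(points, values, center, box_size=32.0, resolution=1.0):
    """Accumulate per-point scalar values into a cubic voxel grid.

    points  : iterable of (x, y, z) coordinates
    values  : one scalar attribute per point (e.g. an interaction energy)
    center  : (x, y, z) of the box center (e.g. the pocket center)
    Returns a nested list grid[i][j][k] of floats.
    All defaults are hypothetical, not taken from the source.
    """
    n = int(round(box_size / resolution))
    grid = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    half = box_size / 2.0
    for point, value in zip(points, values):
        # Shift each coordinate so the box spans [0, box_size), then bin it.
        idx = [int((p - c + half) // resolution) for p, c in zip(point, center)]
        if all(0 <= i < n for i in idx):  # skip points outside the box
            i, j, k = idx
            grid[i][j][k] += value
    return grid
```

A grid like this, with one channel per attribute type, is the kind of input a 3D convolutional network consumes.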
Toll-like receptors (TLRs) are pattern recognition receptors that recognize pathogens based on distinct molecular signatures. Human (h)TLR1, 2, 6 and 10 belong to the hTLR1 subfamily, whose members are localized in the extracellular region and activated in response to diverse ligand molecules. Because no crystal structure of hTLR10 is available, the understanding of its homo- and heterodimerization with hTLR2 and hTLR1, and of the ligand responsible for its activation, is limited. To improve our understanding of the TLR10 receptor-ligand interaction, we used homology modeling to construct a three-dimensional (3D) structure of hTLR10 and refined the model through molecular dynamics (MD) simulations. We used the optimized structures for molecular docking to identify the potential sites of interaction in the homodimer and the heterodimers (hTLR10/2 and hTLR10/1). The docked complexes were then docked with ligands (Pam3CSK4 and PamCysPamSK4) using MOE-Dock and ASEDock. Our docking studies show that the binding orientations of the hTLR10 heterodimers are similar to those of other TLR2 family members. However, the binding orientation of the hTLR10 homodimer differs from that of the heterodimers due to the presence of negatively charged surfaces at LRR11-14, which provide a specific cavity for ligand binding. Moreover, the multiple protein-ligand docking approach revealed that Pam3CSK4 might be the ligand for the hTLR10/2 complex, whereas PamCysPamSK4, a di-acylated peptide, might activate the hTLR10/1 heterodimer and the hTLR10 homodimer. Therefore, the current modeled complexes can be a useful tool for further experimental studies on TLR biology.
Background: Detecting similar ligand-binding sites in globally unrelated proteins has a wide range of applications in modern drug discovery, including drug repurposing, the prediction of side effects, and drug-target interactions. Although a number of techniques to compare binding pockets have been developed, this problem still poses significant challenges. Results: We evaluate the performance of three algorithms to calculate similarities between ligand-binding sites, APoc, SiteEngine, and G-LoSA. Our assessment considers not only the capabilities to identify similar pockets and to construct accurate local alignments, but also the dependence of these alignments on the sequence order. We point out certain drawbacks of previously compiled datasets, such as the inclusion of structurally similar proteins, leading to an overestimated performance. To address these issues, a rigorous procedure to prepare unbiased, high-quality benchmarking sets is proposed. Further, we conduct a comparative assessment of techniques directly aligning binding pockets to indirect strategies employing structure-based virtual screening with AutoDock Vina and rDock. Conclusions: Thorough benchmarks reveal that G-LoSA offers a fairly robust overall performance, whereas the accuracy of APoc and SiteEngine is satisfactory only against easy datasets. Moreover, combining various algorithms into a meta-predictor improves the performance of existing methods to detect similar binding sites in unrelated proteins by 5–10%. All data reported in this paper are freely available at https://osf.io/6ngbs/.
Toll-like receptors (TLRs) play a central role in the innate immune response by recognizing conserved structural patterns in a variety of microbes. TLRs are classified into six families. The TLR7 family comprises TLR7, 8, and 9, which are localized to endolysosomal compartments and recognize viral infection in the form of foreign nucleic acids. In our current study, we focused on TLR8, which has been shown to recognize different types of ligands, such as viral or bacterial ssRNA as well as small synthetic molecules. The primary sequences of rodent and non-rodent TLR8s are similar, but the antiviral compound (R848) that activates the TLR8 pathway is species-specific, and the factors underlying the receptor's species-specificity remain unknown. To this end, comparative homology modeling, refinement by molecular dynamics simulations, automated docking, and computational mutagenesis studies were employed to probe the intermolecular interactions between this antiviral compound and TLR8. Comparative analyses of the modeled TLR8 structures (rodent and non-rodent) showed that the variation mainly occurs at LRR14-15 (undefined region); hence, we hypothesized that this variation may be the primary reason for the observed species-specificity. Our hypothesis was further supported by our docking studies, which showed that this undefined region is in close proximity to the ligand-binding site and thus may play a key role in ligand recognition. In addition, the interface between the ligand and TLR8s varied depending upon the amino acid charges, free energy of binding, and interaction surface. Therefore, our current work provides a hypothesis for previous in vivo studies in the context of TLR signaling.