Although deep learning approaches have had tremendous success in image, video and audio processing, computer vision, and speech recognition, their applications to three-dimensional (3D) biomolecular structural data sets have been hindered by the geometric and biological complexity. To address this problem we introduce the element-specific persistent homology (ESPH) method. ESPH represents 3D complex geometry by one-dimensional (1D) topological invariants and retains important biological information via a multichannel image-like representation. This representation reveals hidden structure-function relationships in biomolecules. We further integrate ESPH and deep convolutional neural networks to construct a multichannel topological neural network (TopologyNet) for the predictions of protein-ligand binding affinities and protein stability changes upon mutation. To overcome the deep learning limitations from small and noisy training sets, we propose a multi-task multichannel topological convolutional neural network (MM-TCNN). We demonstrate that TopologyNet outperforms the latest methods in the prediction of protein-ligand binding affinities, mutation induced globular protein folding free energy changes, and mutation induced membrane protein folding free energy changes. Availability: weilab.math.msu.edu/TDL/
This work introduces a number of algebraic topology approaches, including multi-component persistent homology, multi-level persistent homology, and electrostatic persistence for the representation, characterization, and description of small molecules and biomolecular complexes. In contrast to the conventional persistent homology, multi-component persistent homology retains critical chemical and biological information during the topological simplification of biomolecular geometric complexity. Multi-level persistent homology enables a tailored topological description of inter- and/or intra-molecular interactions of interest. Electrostatic persistence incorporates partial charge information into topological invariants. These topological methods are paired with Wasserstein distance to characterize similarities between molecules and are further integrated with a variety of machine learning algorithms, including k-nearest neighbors, ensemble of trees, and deep convolutional neural networks, to manifest their descriptive and predictive powers for protein-ligand binding analysis and virtual screening of small molecules. Extensive numerical experiments involving 4,414 protein-ligand complexes from the PDBBind database and 128,374 ligand-target and decoy-target pairs in the DUD database are performed to test respectively the scoring power and the discriminatory power of the proposed topological learning strategies. It is demonstrated that the present topological learning outperforms other existing methods in protein-ligand binding affinity prediction and ligand-decoy discrimination.
This paper presents a novel method for solving the Poisson-Boltzmann (PB) equation based on a rigorous treatment of geometric singularities of the dielectric interface and a Green's function formulation of charge singularities. Geometric singularities, such as cusps and self-intersecting surfaces, in the dielectric interfaces are bottleneck in developing highly accurate PB solvers. Based on an advanced mathematical technique, the matched interface and boundary (MIB) method, we have recently developed a PB solver by rigorously enforcing the flux continuity conditions at the solvent-molecule interface where geometric singularities may occur. The resulting PB solver, denoted as MIBPB-II, is able to deliver second order accuracy for the molecular surfaces of proteins. However, when the mesh size approaches half of the van der Waals radius, the MIBPB-II cannot maintain its accuracy because the grid points that carry the interface information overlap with those that carry distributed singular charges. In the present Green's function formalism, the charge singularities are transformed into interface flux jump conditions, which are treated on an equal footing as the geometric singularities in our MIB framework. The resulting method, denoted as MIBPB-III, is able to provide highly accurate electrostatic potentials at a mesh as coarse as 1.2 A for proteins. Consequently, at a given level of accuracy, the MIBPB-III is about three times faster than the APBS, a recent multigrid PB solver. The MIBPB-III has been extensively validated by using analytically solvable problems, molecular surfaces of polyatomic systems, and 24 proteins. It provides reliable benchmark numerical solutions for the PB equation.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.