Accurately predicting protein-ligand binding affinities is an important problem in computational chemistry, since it can substantially accelerate virtual screening and lead optimization in drug discovery. We propose here a fast machine-learning approach for predicting binding affinities using state-of-the-art 3D-convolutional neural networks and compare this approach to other machine-learning and scoring methods using several diverse data sets. The results for the standard PDBbind (v.2016) core test-set are state-of-the-art, with a Pearson's correlation coefficient of 0.82 and an RMSE of 1.27 in pK units between experimental and predicted affinity, but accuracy is still very sensitive to the specific protein used. KDEEP is made available via PlayMolecule.org for users to easily test their own protein-ligand complexes, with each prediction taking a fraction of a second. We believe that the speed, performance, and ease of use of KDEEP already make it an attractive scoring function for modern computational chemistry pipelines.
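The two figures of merit quoted above, Pearson's correlation coefficient and RMSE in pK units, can be computed as in the following minimal sketch. It is illustrative only: the function names and the affinity values are made up for this example and are not taken from the paper or from PDBbind.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rmse(xs, ys):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Hypothetical pK values, invented purely for illustration.
experimental = [4.0, 5.5, 6.2, 7.1, 8.3]
predicted = [4.6, 5.1, 6.9, 6.8, 7.7]

print(f"Pearson r = {pearson_r(experimental, predicted):.2f}")
print(f"RMSE      = {rmse(experimental, predicted):.2f} pK units")
```

The two metrics answer different questions: Pearson's r measures how well the predictions rank and track the experimental values, while RMSE measures the typical absolute error, so a model can score well on one and poorly on the other.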
Despite the many approaches to study differential splicing from RNA-seq, many challenges remain unsolved, including computing capacity and sequencing depth requirements. Here we present SUPPA2, a new method that addresses these challenges and enables streamlined analysis across multiple conditions, taking into account biological variability. Using experimental and simulated data, we show that SUPPA2 achieves higher accuracy compared to other methods, especially at low sequencing depth and short read length. We use SUPPA2 to identify novel Transformer2-regulated exons, novel microexons induced during differentiation of bipolar neurons, and novel intron retention events during erythroblast differentiation. Electronic supplementary material: The online version of this article (10.1186/s13059-018-1417-1) contains supplementary material, which is available to authorized users.
In this work, we propose a machine learning approach to generate novel molecules starting from a seed compound, its three-dimensional (3D) shape, and its pharmacophoric features. The pipeline draws inspiration from generative models used in image analysis and represents a first example of the de novo design of lead-like molecules guided by shape-based features. A variational autoencoder is used to perturb the 3D representation of a compound, followed by a system of convolutional and recurrent neural networks that generate a sequence of SMILES tokens. The generative design of novel scaffolds and functional groups can cover unexplored regions of chemical space that still possess lead-like properties.
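To make "a sequence of SMILES tokens" concrete, the sketch below splits a SMILES string into tokens of the kind a recurrent decoder could emit one at a time. The abstract does not describe the tokenization scheme actually used, so the regular expression and the `tokenize` helper here are assumptions chosen for illustration: bracket atoms, the two-letter halogens Cl and Br, and two-digit ring closures are kept whole, and everything else is a single character.

```python
import re

# Assumed token pattern (not taken from the paper): bracket atoms such as
# [NH3+], two-letter halogens, %NN ring closures, then any single character.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles):
    """Split a SMILES string into a list of tokens."""
    return SMILES_TOKEN.findall(smiles)


# Aspirin, written as SMILES.
tokens = tokenize("CC(=O)Oc1ccccc1C(=O)O")
print(tokens)

# A simple vocabulary mapping each distinct token to an integer index,
# which is the form a neural decoder would typically work with.
vocab = {tok: i for i, tok in enumerate(sorted(set(tokens)))}
encoded = [vocab[tok] for tok in tokens]
print(encoded)
```

Because the tokens cover the whole string, joining them back together reproduces the original SMILES, which is a convenient sanity check for any tokenizer of this kind.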