2018
DOI: 10.1093/bioinformatics/bty757

Development of a protein–ligand extended connectivity (PLEC) fingerprint and its application for binding affinity predictions

Abstract: Supplementary data are available at Bioinformatics online.

Cited by 122 publications (96 citation statements)
References: 24 publications
“…Machine learning methods at varying levels of sophistication have long been considered in the context of structure-based virtual screening (29, 31, 32, 39–54). The vast majority of such studies sought to train a regression model that would recapitulate the binding affinities of known complexes, and thus provide a natural and intuitive replacement for traditional scoring functions (29, 31, 32, 39–47, 50, 51, 53). The downside of such a strategy, however, is that the resulting models are not ever exposed to any inactive complexes in the course of training: This is especially important in the context of docked complexes arising from virtual screening, where most compounds in the library are presumably inactive.…”
Section: Results
Citation type: mentioning
confidence: 99%
“…To confirm that this decoy-generation strategy indeed led to a challenging classification problem, we applied some of the top reported scoring functions in the literature to distinguish between active and decoy complexes in the D-COID set. For all eight methods tested [nnscore (32), RF-Score v1 (31), RF-Score v2 (40), RF-Score v3 (29), PLEClinear (42), PLECnn (42), PLECrf (42), and RF-Score-VS (41)], we found that the distribution of scores assigned to active complexes was strongly overlapping with those of the decoy complexes (Fig. 1B), indicating that these models showed very little discriminatory power when applied to this set.…”
Section: Results
Citation type: mentioning
confidence: 99%
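One common way to put a number on "strongly overlapping" score distributions is the ROC AUC, computed here as the fraction of active/decoy pairs in which the active scores higher. This is only an illustrative sketch: the citing paper does not state that it used this metric, and the function name `roc_auc` and the toy scores are hypothetical.

```python
def roc_auc(active_scores, decoy_scores):
    """Probability that a random active outscores a random decoy.

    Ties count as half a win. A value near 0.5 means the two score
    distributions overlap almost completely (no discriminatory power);
    a value near 1.0 means actives are ranked cleanly above decoys.
    """
    if not active_scores or not decoy_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


# Hypothetical predicted affinities (not data from any paper).
toy_actives = [6.1, 5.8, 6.4]
toy_decoys = [6.0, 5.9, 6.3]
print(roc_auc(toy_actives, toy_decoys))
```

The pairwise loop is quadratic in the number of complexes, which is fine for a sanity check; a rank-based implementation would be preferable for large screening libraries.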
“…For each contacting protein-ligand atom pair (i.e., distance less than 4.5 Å), the respective protein and ligand atoms are each expanded to circular fragments using ECFP2 and hashed together into the fingerprint. Similarly, ECFP is integrated in the protein-ligand extended connectivity (PLEC) fingerprint [87], where n different bond diameters (called "depth") for atoms from protein and ligand are used.…”
Section: IFPs Including Distance Bits
Citation type: mentioning
confidence: 99%
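The contact-pair hashing idea described in the statement above can be sketched in a few lines of plain Python. This is a simplified illustration, not the published PLEC implementation: the atom environments are assumed to be precomputed identifiers (one per depth), and the function names, the fingerprint size, and the rule for pairing depths are all assumptions made for the example. Only the 4.5 Å contact cutoff comes from the quoted text.

```python
import hashlib
import math

DISTANCE_CUTOFF = 4.5  # Å, contact threshold quoted in the statement
FP_SIZE = 1024         # hypothetical folded fingerprint length


def stable_hash(*parts):
    """Deterministic integer hash (built-in hash() is salted for strings)."""
    digest = hashlib.sha1("|".join(map(str, parts)).encode()).digest()
    return int.from_bytes(digest[:8], "big")


def contact_fingerprint(ligand_atoms, protein_atoms,
                        cutoff=DISTANCE_CUTOFF, size=FP_SIZE):
    """Return the sorted positions of the bits that are switched on.

    Each atom is a (xyz, env_ids) tuple. env_ids lists the atom's
    circular-environment identifiers, depth 0 first; in a real pipeline
    these would come from an ECFP-style atom expansion.
    """
    on_bits = set()
    for lig_xyz, lig_envs in ligand_atoms:
        for prot_xyz, prot_envs in protein_atoms:
            if math.dist(lig_xyz, prot_xyz) >= cutoff:
                continue  # not a contacting pair
            # Assumption: pair environments depth by depth, reusing the
            # deepest available environment on the shallower side.
            for depth in range(max(len(lig_envs), len(prot_envs))):
                lig_env = lig_envs[min(depth, len(lig_envs) - 1)]
                prot_env = prot_envs[min(depth, len(prot_envs) - 1)]
                on_bits.add(stable_hash(lig_env, prot_env) % size)
    return sorted(on_bits)


# Hypothetical toy complex: one ligand atom, two protein atoms.
ligand = [((0.0, 0.0, 0.0), ["C", "C-aromatic"])]
protein = [
    ((3.0, 0.0, 0.0), ["O", "O-carboxyl", "O-Asp-sidechain"]),
    ((10.0, 0.0, 0.0), ["N"]),  # too far away to contribute
]
print(contact_fingerprint(ligand, protein))
```

Because the bit position depends only on the paired environments, no interaction types or residue names need to be predefined, which is the property the citing authors highlight.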
“…Da et al [12] developed an IFP that relies on the atomic environment of both the protein and ligand interacting atoms to set the positions of a bit in the fingerprint, rather than relying on protein residues and predefined interactions, which has the advantage of implicitly encoding every possible type of interaction. This protocol was later reimplemented in Python by Wójcikowski et al [13], but other more classical Python-based IFP implementations exist [14–19]. In this paper, we introduce a new Python library, ProLIF, that overcomes several limitations encountered by these programs, namely working exclusively with the output of specific docking programs, not being compatible with the analysis of MD trajectories, being restricted to a specific kind of complex (usually protein–ligand complexes), depending on residue or atom type naming conventions, or not being extensible or configurable regarding interactions (Table 1).…”
Section: Introduction
Citation type: mentioning
confidence: 99%