2010
DOI: 10.1002/cmdc.201000089

Rendering Conventional Molecular Fingerprints for Virtual Screening Independent of Molecular Complexity and Size Effects

Abstract: Molecular complexity and size effects represent a known complication of fingerprint similarity searching and virtual screening that often leads to an increase in false-positive rates and a decrease in hit rates. In standard fingerprints, differences in the complexity of reference and database molecules lead to different fingerprint bit densities, which negatively affects similarity search calculations, in particular, when fingerprints of reference molecules have higher bit density than corresponding fingerprin…

Cited by 12 publications (10 citation statements)
References 27 publications
“…36 Molecular complexity effects can be balanced or eliminated in different ways, for example, by equally taking into account bits that are set on or off in similarity calculations 37,38 or by combining binary fingerprint representations with their complements, i.e., adding the complement to the original bit string, thereby producing a constant fingerprint bit density for compounds of any size. 39 Calculating Tanimoto, Tversky, or Dice similarity has an assumed advantage that numerical values can now be used to distinguish similarity relationships in a consistent manner. How does this numerical approach from chemical informatics relate to, and perhaps influence, the more subjective assessment of similarity in medicinal chemistry?…”
Section: Journal Of Medicinal Chemistry
Citation type: mentioning
confidence: 99%
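The three coefficients named in the statement above can be illustrated with a minimal sketch. It assumes each fingerprint is represented as a Python set holding the indices of its on-bits; the function names and the example fingerprints are hypothetical and are not taken from the cited papers.

```python
def tanimoto(a: set, b: set) -> float:
    """Shared on-bits divided by the union of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def dice(a: set, b: set) -> float:
    """Twice the shared on-bits divided by the total on-bit count."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 1.0


def tversky(a: set, b: set, alpha: float, beta: float) -> float:
    """Asymmetric coefficient: alpha weights bits unique to a, beta bits unique to b."""
    common = len(a & b)
    denom = common + alpha * len(a - b) + beta * len(b - a)
    return common / denom if denom else 1.0


# Hypothetical example: a complex reference molecule (six on-bits)
# and a smaller database molecule whose three on-bits are all shared.
reference = {0, 1, 2, 3, 4, 5}
database = {0, 1, 2}

print(tanimoto(reference, database))           # 0.5
print(tversky(reference, database, 1.0, 1.0))  # 0.5, same as Tanimoto
print(tversky(reference, database, 0.0, 1.0))  # 1.0, reference-only bits ignored
```

With alpha = beta = 1 the Tversky coefficient reduces to Tanimoto, and with alpha = beta = 0.5 it reduces to Dice. The last call shows how the weighting changes the score when the two fingerprints differ in bit density.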
“…Its particular importance commonly comes in circumstances where one has a "hit" in a bioassay and wishes to select from a library of available molecules of known structure which ones to prioritize for further assays that might detect a more potent hit. The usual means of assessing molecular similarity are based on encoding the molecules as vectors of numbers based either on a list of measured or calculated biophysical or structural properties, or via the use of so-called molecular fingerprinting methods (e.g., [135–142]). We ourselves have used a variety of these methods in comparing the "similarity" between marketed drugs, endogenous metabolites and vitamins, natural products, and certain fluorophores [91,99,113,143–148].…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…Thus, similarity search calculations with PDR-FP are not biased by differences in molecular complexity [78] and have been shown to be particularly effective on compound classes having high structural diversity where other types of fingerprints produce only low compound recall [32,80]. Furthermore, it has recently also been demonstrated that conventional keyed fingerprints can be rendered complexity-independent through 'balanced codes' transformation, that is, by merging a fingerprint with the complement of its bit setting, which generates a constant bit density of 50% (but doubles the size of the fingerprint) [81]. Fligner et al [82] introduced a modified version of the Tc that was able to reduce the compound size bias in the selection of diverse subsets from source libraries.…”
Section: Circumventing Intrinsic Limitations: Complexity Effects
Citation type: mentioning
confidence: 99%
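The 'balanced codes' transformation described in the statement above can be sketched as follows. This is an illustration only, assuming a fingerprint is a Python list of 0/1 values; the function names and example bit strings are hypothetical and do not reproduce the implementation in the cited work.

```python
def balanced_code(bits: list) -> list:
    """Append the complement of a bit string to the bit string itself."""
    complement = [1 - b for b in bits]
    return bits + complement


def bit_density(bits: list) -> float:
    """Fraction of positions that are set on."""
    return sum(bits) / len(bits)


def tanimoto_bits(a: list, b: list) -> float:
    """Tanimoto coefficient on two equal-length 0/1 lists."""
    both = sum(x & y for x, y in zip(a, b))
    either = sum(x | y for x, y in zip(a, b))
    return both / either if either else 1.0


# Hypothetical 8-bit fingerprints with very different bit densities.
sparse = [1, 0, 0, 0, 0, 0, 0, 0]
dense = [1, 1, 1, 1, 1, 1, 1, 0]

print(bit_density(sparse), bit_density(dense))  # 0.125 0.875
print(bit_density(balanced_code(sparse)))       # 0.5
print(bit_density(balanced_code(dense)))        # 0.5
print(len(balanced_code(sparse)))               # 16, twice the original length
```

Every position contributes exactly one on-bit, either in the original half or in the complement half, so an n-bit fingerprint always yields n on-bits out of 2n. In this sketch, applying `tanimoto_bits` to two balanced codes counts positions that are off in both original fingerprints as matches alongside positions that are on in both.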