2022
DOI: 10.3390/molecules27072331

Introducing a Chemically Intuitive Core-Substituent Fingerprint Designed to Explore Structural Requirements for Effective Similarity Searching and Machine Learning

Abstract: Fingerprint (FP) representations of chemical structure continue to be one of the most widely used types of molecular descriptors in chemoinformatics and computational medicinal chemistry. One often distinguishes between two- and three-dimensional (2D and 3D) FPs depending on whether they are derived from molecular graphs or conformations, respectively. Primary application areas for FPs include similarity searching and compound classification via machine learning, especially for hit identification. For these ap…

Cited by 8 publications (10 citation statements)
References 42 publications
“…Each structural fragment was assigned to a single bit position. For further details, the interested reader is referred to the original work …”
Section: Methods (mentioning)
confidence: 99%
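
The statement above says each structural fragment was assigned to a single bit position. A minimal sketch of that general idea follows, assuming fragments are already available as strings. The fragment strings, the sorted-order assignment, and the helper names `build_bit_index` and `encode` are illustrative assumptions, not the fragmentation scheme or code of the cited work.

```python
# Illustrative sketch only: one distinct fragment -> one bit position.
# The fragment strings below are hypothetical placeholders.

def build_bit_index(fragment_sets):
    """Assign every distinct fragment its own bit position (sorted order)."""
    vocabulary = sorted(set().union(*fragment_sets))
    return {fragment: position for position, fragment in enumerate(vocabulary)}


def encode(fragments, bit_index):
    """Return a binary fingerprint with one bit per known fragment."""
    bits = [0] * len(bit_index)
    for fragment in fragments:
        position = bit_index.get(fragment)
        if position is not None:  # fragments outside the vocabulary are ignored
            bits[position] = 1
    return bits


# Two toy "molecules", each described by a set of fragment strings.
molecules = [
    {"c1ccccc1", "C(=O)O"},
    {"c1ccccc1", "N"},
]
bit_index = build_bit_index(molecules)
fingerprints = [encode(m, bit_index) for m in molecules]
```

Because each fragment owns exactly one position, no two fragments collide on a bit, which is the property the quoted sentence describes.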
“…For further details, the interested reader is referred to the original work. 21 Machine Learning Models. Classification models were built by using the RF algorithm, with a default number of trees (n_estimator) equal to 400, 53 implemented using the scikitlearn python package.…”
Section: Experimental Section (mentioning)
confidence: 99%
“…47 Here, a recently proposed selective FP composed of 1000 bits, better known as CSFP, was used. 50 In particular, CSFP 63 represents only 2D structures, thus limiting the representation redundancy. 50 To further reduce dimension, 420 identical null bits were removed when generating the CSFP from our pool of Dev Tox data.…”
Section: Materials (mentioning)
confidence: 99%
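
The statement above reports removing 420 null bits from a 1000-bit fingerprint, which would leave 580 positions. A minimal sketch of such a filter follows, assuming "null" means a bit that is 0 for every compound in the data set. The function name `drop_null_bits` and the toy data are hypothetical; the citing authors' actual procedure is not shown in the quoted text.

```python
# Illustrative sketch only: drop bit positions that are 0 in every fingerprint.

def drop_null_bits(fingerprints):
    """Return (reduced_fingerprints, kept_positions).

    A position is kept if at least one fingerprint has it set to 1.
    """
    if not fingerprints:
        return [], []
    n_bits = len(fingerprints[0])
    kept = [i for i in range(n_bits) if any(fp[i] for fp in fingerprints)]
    reduced = [[fp[i] for i in kept] for fp in fingerprints]
    return reduced, kept


# Toy data: position 1 is never set, so it is removed.
data = [
    [1, 0, 0, 1],
    [0, 0, 1, 1],
]
reduced, kept = drop_null_bits(data)
```

Returning the kept positions alongside the reduced fingerprints lets the same mapping be applied to new compounds later, so training and prediction use identical columns.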
“…The latter results from special pruning methods or removal processes such as FRED / S-KEYS (fast random removal of descriptor/substructure keys). The newly announced hologram-QSAR (H-QSAR) technology relies on calculating the occurrence of substructured pathways for specific functional groups 57,58 . It no longer relies on a predefined list of substructure motifs 59,60 .…”
Section: Fragment-based Descriptors and Molecule Fingerprints (mentioning)
confidence: 99%