2010
DOI: 10.1007/s10822-010-9346-4

Tautomerism in large databases

Abstract: We have used the Chemical Structure DataBase (CSDB) of the NCI CADD Group, an aggregated collection of over 150 small-molecule databases totaling 103.5 million structure records, to conduct tautomerism analyses on one of the largest currently existing sets of real (i.e. not computer-generated) compounds. This analysis was carried out using calculable chemical structure identifiers developed by the NCI CADD Group, based on hash codes available in the chemoinformatics toolkit CACTVS and a newly developed scoring…

Cited by 70 publications (117 citation statements)
References: 20 publications

“…These substructure indicators are henceforward referred to as Fingerprint Features (FFs). To address tautomeric conversion, a phenomenon which affects the encoding, compounds were normalized prior to the encoding procedure (Sitzmann et al, 2010). The on-target bioactivity is measured through a dose-response bioassay, resulting in an IC50 value for each compound.…”
Citation type: mentioning
confidence: 99%
“…Structure normalization is performed for any incoming structure set to be registered, or searched by, in CSLS. Each parent structure is then subjected to a hashcode calculation to generate the NCI/CADD identifier [28].…”
Section: Chemical Structure Representation | Citation type: mentioning
confidence: 99%
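
The normalize-then-hash step described in this statement can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the lookup table stands in for real tautomer normalization, SMILES strings stand in for structure records, and `hashlib.blake2b` stands in for the CACTVS hash codes, which are not reproduced here. The function names `normalize` and `toy_identifier` are hypothetical.

```python
import hashlib

# Hypothetical lookup standing in for real structure normalization:
# 2-hydroxypyridine is mapped onto its tautomer 2-pyridone as the "parent".
CANONICAL_PARENT = {
    "Oc1ccccn1": "O=c1cccc[nH]1",
}


def normalize(smiles: str) -> str:
    """Return the parent form of a structure (toy table lookup, not real chemistry)."""
    cleaned = smiles.strip()
    return CANONICAL_PARENT.get(cleaned, cleaned)


def toy_identifier(smiles: str, tautomer_sensitive: bool) -> str:
    """Hash a structure string into a short identifier.

    When tautomer_sensitive is False, the structure is normalized first,
    so tautomers in the lookup table collapse onto one identifier.
    """
    structure = smiles.strip() if tautomer_sensitive else normalize(smiles)
    # 8-byte digest gives 16 hex characters; this is NOT the CACTVS hash.
    digest = hashlib.blake2b(structure.encode("utf-8"), digest_size=8).hexdigest()
    label = "T" if tautomer_sensitive else "u"
    return f"{digest}-{label}"


if __name__ == "__main__":
    for smi in ("Oc1ccccn1", "O=c1cccc[nH]1"):
        print(smi, toy_identifier(smi, True), toy_identifier(smi, False))
```

In this sketch the two tautomers receive different identifiers when the hash is taken over the input as given, and the same identifier once normalization is applied first, which is the property that makes such identifiers usable for tautomer-aware lookup.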
“…Currently there are eight identifier variants defined for a structure: FICTS, FICTu, FICuS, FICuu, uuuTS, uuuTu, uuuuS, and uuuuu. Three of them, FICTS, FICuS and uuuuu are searchable for all the structure records in CSLS [28].…”
Section: Chemical Structure Representation | Citation type: mentioning
confidence: 99%
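
The eight variant names listed in this statement follow a positional pattern that a short sketch can decode. The reading used here is an assumption not stated in the quoted text: the five positions are taken to stand for fragments, isotopes, charges, tautomers and stereochemistry, with a letter meaning the identifier is sensitive to that feature and `u` meaning it is not. The helper name `sensitive_features` is hypothetical.

```python
# Assumed meaning of each position in an identifier variant name.
FEATURES = ("F", "I", "C", "T", "S")
FEATURE_NAMES = {
    "F": "fragments",
    "I": "isotopes",
    "C": "charges",
    "T": "tautomers",
    "S": "stereochemistry",
}

# The eight variants exactly as listed in the citation statement.
VARIANTS = ["FICTS", "FICTu", "FICuS", "FICuu", "uuuTS", "uuuTu", "uuuuS", "uuuuu"]


def sensitive_features(variant: str) -> list[str]:
    """List the features a variant distinguishes, under the assumed reading."""
    if len(variant) != len(FEATURES):
        raise ValueError(f"expected {len(FEATURES)} characters, got {variant!r}")
    result = []
    for char, letter in zip(variant, FEATURES):
        if char == letter:
            result.append(FEATURE_NAMES[letter])
        elif char != "u":
            raise ValueError(f"unexpected character {char!r} in {variant!r}")
    return result


if __name__ == "__main__":
    for name in VARIANTS:
        print(name, sensitive_features(name))
```

Under this reading, FICuS differs from FICTS only in ignoring tautomers, and uuuuu ignores all five features.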
“…However, it is possible that such rules for the enumeration of tautomers may be too aggressive and not realistic from an organic chemists’ viewpoint [17], i.e. they may declare structures to be tautomers which in reality have a high energy barrier for interconversion and can be isolated as different, stable compounds.…”
Section: Introduction | Citation type: mentioning
confidence: 99%