Tautomerism in large databases

Sitzmann, Markus; Ihlenfeldt, Wolf‐Dietrich; Nicklaus, Marc C.

doi:10.1007/s10822-010-9346-4

Cited by 70 publications

(117 citation statements)

References 20 publications

Supporting / Mentioning (110) / Contrasting

“…These substructure indicators are henceforward referred to as Fingerprint Features (FFs). To address tautomeric conversion, a phenomenon which affects the encoding, compounds were normalized prior to the encoding procedure (Sitzmann et al., 2010). The on-target bioactivity is measured through a dose-response bioassay, resulting in an IC50 value for each compound.…”

Citation type: mentioning

confidence: 99%

Principal bicorrelation analysis: Unraveling associations between three data sources

Mattiello, Thas, Verbist. 2015. Journal of Biopharmaceutical Statistics

In this article, we propose a statistical explorative method for data integration. It is developed in the context of early drug development for which it enables the detection of chemical substructures and the identification of genes that mediate their association with the bioactivity (BA). The core of the method is a sparse singular value decomposition for the identification of the gene set and a permutation-based method for the control of the false discovery rate. The method is illustrated using a real dataset, and its properties are empirically evaluated by means of a simulation study. Quantitative Structure Transcriptional Activity Relationship (QSTAR, www.qstar-consortium.org) is a new paradigm in early drug development that extends QSAR by not only considering data on the chemical structure of the compounds and on the compound-induced BA, but by simultaneously using transcriptomics data (gene expression). This approach enables, for example, the detection of chemical substructures that are associated with BA, while at the same time a gene set is correlated with both these substructures and the BA. Although causal associations cannot be formally concluded, these associations may suggest that the compounds act on the BA through a particular genomic pathway.

“…Structure normalization is performed for any incoming structure set to be registered, or searched by, in CSLS. Each parent structure is then subjected to a hashcode calculation to generate the NCI/CADD identifier [28].…”

Section: Chemical Structure Representation; citation type: mentioning

confidence: 99%

“…Currently there are eight identifier variants defined for a structure: FICTS, FICTu, FICuS, FICuu, uuuTS, uuuTu, uuuuS, and uuuuu. Three of them, FICTS, FICuS and uuuuu are searchable for all the structure records in CSLS [28].…”

Section: Chemical Structure Representation; citation type: mentioning

confidence: 99%

Tautomerism in chemical information management systems

Warr. 2010. J Comput Aided Mol Des

Tautomerism has an impact on many of the processes in chemical information management systems including novelty checking during registration into chemical structure databases; storage of structures; exact and substructure searching in chemical structure databases; and depiction of structures retrieved by a search. The approaches taken by 27 different software vendors and database producers are compared. It is hoped that this comparison will act as a discussion document that could ultimately improve databases and software for researchers in the future.

“…However, it is possible that such rules for the enumeration of tautomers may be too aggressive and not realistic from an organic chemists’ viewpoint [17], i.e. they may declare structures to be tautomers which in reality have a high energy barrier for interconversion and can be isolated as different, stable compounds.…”

Section: Introduction; citation type: mentioning

confidence: 99%

Experimental and Chemoinformatics Study of Tautomerism in a Database of Commercially Available Screening Samples

Guasch, Yapamudiyansel, Peach, et al. 2016. J. Chem. Inf. Model. (self-citation)

We investigated how many cases of the same chemical sold as different products (at possibly different prices) occurred in a prototypical large aggregated database and simultaneously tested the tautomerism definitions in the chemoinformatics toolkit CACTVS. We applied the standard CACTVS tautomeric transforms plus a set of recently developed ring–chain transforms to the Aldrich Market Select (AMS) database of 6 million screening samples and building blocks. In 30 000 cases, two or more AMS products were found to be just different tautomeric forms of the same compound. We purchased and analyzed 166 such tautomer pairs and triplets by 1H and 13C NMR to determine whether the CACTVS transforms accurately predicted what is the same “stuff in the bottle”. Essentially all prototropic transforms with examples in the AMS were confirmed. Some of the ring–chain transforms were found to be too “aggressive”, i.e. to equate structures with one another that were different compounds.
