2011
DOI: 10.1007/s10822-011-9429-x

Drug discovery using very large numbers of patents. General strategy with extensive use of match and edit operations

Abstract: A patent data base of 6.7 million compounds generated by a very high performance computer (Blue Gene) requires new techniques for exploitation when extensive use of chemical similarity is involved. Such exploitation includes the taxonomic classification of chemical themes, and data mining to assess mutual information between themes and companies. Importantly, we also launch candidates that evolve by "natural selection" as failure of partial match against the patent data base and their ability to bind to the pr…

Cited by 21 publications (49 citation statements)
References 31 publications
“…In a 2011 study [17], the author and colleagues applied somewhat unusual molecule mining techniques and drew the conclusion that "chemical similarity and novelty are human concepts that largely have meaning by utility in specific contexts. For some purposes, mutual information involving chemical themes might be a better concept".…”
Section: Scope and Utility (mentioning)
confidence: 99%
“…When undertaking the previous study [17], applying rigorous substructure similarity tests to very large collections of SMILES strings, and especially for similarity between many members of the collection rather than a single query, can be a rate limiting step for larger molecules in a workflow. We originally sought to speed this by first using a highly optimized industry standard for pattern matching, the regular expression ("regex") [20] in the context of the Perl language [21].…”
Section: Experiences Of Utility Of Similarity Testing Algorithms (mentioning)
confidence: 99%
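
The regex step described in the statement above can be illustrated with a minimal sketch. It is written in Python rather than the Perl the authors mention, and it is not their implementation: the SMILES strings, the fragment, and the `prefilter` helper are hypothetical examples chosen for illustration. It assumes a plain textual match is used only as a cheap first pass, since one molecule can be written as many different SMILES strings and a textual hit or miss is not a rigorous substructure test.

```python
import re

# Hypothetical mini-collection of SMILES strings (illustrative only,
# not taken from the patent data base discussed above).
smiles_collection = [
    "CC(=O)Oc1ccccc1C(=O)O",  # aspirin
    "CCO",                    # ethanol
    "CC(=O)O",                # acetic acid
    "c1ccccc1",               # benzene
]

# Fragment to look for, written as SMILES text. re.escape is needed because
# SMILES characters such as ( ) = [ ] are regex metacharacters.
fragment = "C(=O)O"
pattern = re.compile(re.escape(fragment))


def prefilter(collection, compiled):
    """Return the SMILES strings whose text contains the compiled pattern.

    This is a textual filter only; survivors would still need a proper
    substructure check in a real workflow.
    """
    return [s for s in collection if compiled.search(s)]


candidates = prefilter(smiles_collection, pattern)
print(candidates)
```

Compiling the pattern once and reusing it across the collection is what keeps the pass cheap; any slower, rigorous similarity test would then run only on the surviving candidates.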