2015
DOI: 10.1093/nar/gkv1253

SureChEMBL: a large-scale, chemically annotated patent document database

Abstract: SureChEMBL is a publicly available large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature according to an automated text and image-mining pipeline on a daily basis. SureChEMBL provides access to a previously unavailable, open and timely set of annotated compound-patent associations, complemented with sophisticated combined structure and keyword-based search capabilities against the compound repository…

Citations: cited by 185 publications (176 citation statements)
References: 26 publications
“…PubChem currently offers links between about 6 million patent documents and more than 16 million unique chemical structures, with over 336 million chemical substance-patent links covering U.S., European, and World Intellectual Property Organization (WIPO) patent documents published since 1800. This information is contributed by various organizations, including IBM,[67] SureChEMBL (formerly known as SureChem),[68,69] NextMove Software,[70] SCRIPDB,[71] and BindingDB.[57] …”
Section: An Overview Of Pubchem As a Resource For Virtual Screening (mentioning)
confidence: 99%
“…Their characteristics, including number of rows (n), columns (m), nonzeros (nnz), and mean row/column length (μ_r/μ_c), are detailed in [40] database, which includes a large set of chemical compounds automatically extracted from text, images, and attachments of patent documents. SC-5M, SC-1M, SC-500K, and SC-100K are random subsets of 5E+6, 1E+6, 5E+5, and 1E+5 compounds, respectively, from the SC-11.5M dataset.…”
Section: Datasets (mentioning)
confidence: 99%
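
As a rough illustration of the quantities named in the statement above (n, m, nnz, μ_r, μ_c) and of drawing a random subset of compounds, here is a minimal Python sketch. It is not taken from the citing paper: the sparse-row representation (one set of column indices per compound), the function names `matrix_stats` and `random_subset`, and the reading of μ_r and μ_c as nnz/n and nnz/m are all assumptions made for the example.

```python
import random


def matrix_stats(rows, m):
    """Summarise a sparse binary matrix.

    rows: list of sets, each holding the column indices of one row's nonzeros
          (hypothetical representation, e.g. one fingerprint per compound).
    m:    total number of columns, supplied by the caller.
    """
    n = len(rows)
    nnz = sum(len(r) for r in rows)
    return {
        "n": n,
        "m": m,
        "nnz": nnz,
        # Assumed definitions: mean nonzeros per row and per column.
        "mu_r": nnz / n if n else 0.0,
        "mu_c": nnz / m if m else 0.0,
    }


def random_subset(rows, k, seed=0):
    """Draw k rows uniformly without replacement (seeded for repeatability)."""
    rng = random.Random(seed)
    return rng.sample(rows, k)


if __name__ == "__main__":
    toy = [{0, 2}, {1}, {0, 1, 3}]  # 3 rows over 4 columns
    print(matrix_stats(toy, m=4))
    print(matrix_stats(random_subset(toy, 2), m=4))
```

Passing `m` explicitly keeps the column count fixed across subsets, so statistics for a small sample stay comparable with those of the full matrix.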
“…[1][2][3][4][5][6] application providing an adequate user interface. In previous work, we have shown the viability of this approach by making the generated mappings of a variety of databases available through desktop and web applications.…”
Section: Introduction (mentioning)
confidence: 99%