2021
DOI: 10.3389/frma.2021.654438

ChEMU 2020: Natural Language Processing Methods Are Effective for Information Extraction From Chemical Patents

Abstract: Chemical patents represent a valuable source of information about new chemical compounds, which is critical to the drug discovery process. Automated information extraction over chemical patents is, however, a challenging task due to the large volume of existing patents and the complex linguistic properties of chemical patents. The Cheminformatics Elsevier Melbourne University (ChEMU) evaluation lab 2020, part of the Conference and Labs of the Evaluation Forum 2020 (CLEF2020), was introduced to support the deve…

Cited by 23 publications (16 citation statements)
References 48 publications
“…The ChEMU 2020 benchmark dataset [He et al, 2021] contains snippets sampled from 170 English patents from the European Patent Office and the United States Patent and Trademark Office [He et al, 2020a,b, 2021]. As shown in Fig.…”
Section: Benchmark Data For Chemical Entity Recognition - ChEMU 2020 Dataset
Citation type: mentioning
confidence: 99%
“…Here, we discuss the previous literature within this domain. He et al (2021) used a CRF-based model for NER and a rule-based system for EE. For NER, they developed the BANNER NER system (Leaman and Gonzalez, 2008) which uses lexical, syntactic, and contextual features in a CRF model.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Chemical patents are a significant source of information about novel chemicals and chemical reactions. New chemical compound discovery plays a vital role in the chemical and pharmaceutical industry, and chemical patents are the first venue this information is disclosed (He et al, 2021). Unfortunately, there has been a rapid growth of chemical patents in recent years, and with the increasing volume, the manual cataloging of these chemicals and chemical reactions is becoming laborious and time-intensive, making it difficult for researchers to keep up with the current state of the art.…”
Section: Introduction
Citation type: mentioning
confidence: 99%