2014
DOI: 10.1021/ci5002197

Markov Logic Networks for Optical Chemical Structure Recognition

Abstract: Optical chemical structure recognition is the problem of converting a bitmap image containing a chemical structure formula into a standard structured representation of the molecule. We introduce a novel approach to this problem based on the pipelined integration of pattern recognition techniques with probabilistic knowledge representation and reasoning. Basic entities and relations (such as textual elements, points, lines, etc.) are first extracted by a low-level processing module. A probabilistic reasoning en…
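The abstract is cut off before it describes the reasoning stage, so the paper's actual predicates, rules, and weights are not available here. As a rough illustration of the general Markov logic idea only (weighted logical formulas define a log-linear distribution over possible worlds, with P(world) proportional to exp of the summed weights of satisfied formulas), the following minimal Python sketch scores competing readings of two nearby parallel line segments. The atom names, formulas, and weights are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
import itertools
import math

# Hypothetical ground atoms describing how two nearby parallel
# line segments in an image might be interpreted.
ATOMS = ("double_bond", "two_single_bonds")

# Hypothetical weighted formulas (weights are made up for illustration).
# Each formula returns 1 if it is satisfied in the given world, else 0.
FORMULAS = [
    # Parallel, close segments favour the double-bond reading.
    (2.0, lambda w: int(w["double_bond"])),
    # A weaker alternative reading: two separate single bonds.
    (0.5, lambda w: int(w["two_single_bonds"])),
    # The two readings should not both hold at once (soft constraint).
    (5.0, lambda w: int(not (w["double_bond"] and w["two_single_bonds"]))),
]


def score(world):
    """Sum of the weights of the formulas satisfied in this world."""
    return sum(weight * formula(world) for weight, formula in FORMULAS)


def world_probabilities():
    """Enumerate all truth assignments and normalise exp(score)."""
    worlds = [
        dict(zip(ATOMS, values))
        for values in itertools.product([False, True], repeat=len(ATOMS))
    ]
    unnormalised = [math.exp(score(w)) for w in worlds]
    z = sum(unnormalised)  # partition function
    return [(w, u / z) for w, u in zip(worlds, unnormalised)]


def most_probable_world():
    """Return the truth assignment with the highest probability."""
    return max(world_probabilities(), key=lambda pair: pair[1])[0]


if __name__ == "__main__":
    for world, prob in world_probabilities():
        print(world, round(prob, 3))
    print("Most probable reading:", most_probable_world())
```

Exhaustive enumeration is only workable for a toy with a handful of atoms; real Markov logic systems rely on approximate inference, and nothing in this sketch should be read as the authors' implementation.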

Citations: cited by 27 publications (19 citation statements)
References: 31 publications
“…Moreover, as the need for harvesting the large amounts of published data grows, the demand for methods for easily mining structures from papers and patent data is also growing. Optical Character Recognition (OCR) systems, relying on a variety of ML and probabilistic pattern recognition techniques, were created to translate 2D depictions of chemical structures to standard chemical representations [146][147][148]. Nonetheless, the development of OCR systems can be hindered by the images' resolutions, the computational interpretations of chemical abbreviations, and the nature of the image representation, which can be embedded in text, in figures containing multiple structures, or in reaction pathways, and can be represented as either a skeletal formula or a Markush structure.…”
Section: Graphical Representations For Molecules and Macromolecules (mentioning)
confidence: 99%
“…MLOCSR [33] is an OCSR method that follows a pipelined design strategy which is a combination of low level and high-level processing. The workflow is divided into three modules.…”
Section: Markov Logic Network For OCSR (mentioning)
confidence: 99%
“…Over the following decades, more complex systems were developed, often based on the principles of their predecessors. [18][19][20][21][22][23][24][25][26][27][28][29] OSRA was the first chemical structure recognition open-source software, allowing new programs to be developed by direct extension.…”
Section: Introduction (mentioning)
confidence: 99%
“…The majority of optical chemical structure recognition packages, including Kekulé [15], IBM's OROCS [16], CLiDE [17] and CLiDEPro [22], ChemOCR [21], OSRA [23], ChemReader [24], MolRec [26], ChemEx [27], MLOCSR [28], and ChemSchematicResolver [29] rely on a rule-based workflow rather than a data-driven approach. These systems achieve various degrees of accuracy, with the recently developed ChemSchematicResolver reaching 83-100% precision on a range of datasets.…”
Section: Introduction (mentioning)
confidence: 99%