2021
DOI: 10.1186/s13321-021-00517-z

InChI version 1.06: now more than 99.99% reliable

Abstract: The software for the IUPAC Chemical Identifier, InChI, is extraordinarily reliable. It has been tested on large databases around the world, and has proved itself to be an essential tool in the handling and integration of large chemical databases. InChI version 1.05 was released in January 2017 and version 1.06 in December 2020. In this paper, we report on the current state of the InChI Software, the details of the improvements in the v1.06 release, and the results of a test of the InChI run on PubChem, a datab…

Cited by 42 publications (32 citation statements)
References 15 publications
“…The first section presents general compound information (common name, synonyms, molecular formula, accurate and average masses). The second section is related to structural data with common numerical molecule representations in MDL Molfile (Dalby et al, 1992), Canonical SMILES (O’Boyle, 2012), InChI and InChIKey formats (Goodman et al, 2021; Southan, 2013) and 2D and 3D molecule images. A “cross-reference identifiers” section includes four selected external chemical compounds bank references, with ChEBI (Hastings et al, 2016), PubChem (Kim et al, 2021), KEGG (Kanehisa et al, 2016), and HMDB (Wishart et al, 2018) and a modular system to add any Web hyperlinks from specific biological knowledge banks or metabolic networks databases.…”
Section: Results (mentioning)
confidence: 99%
“…However, this behavior might change in future versions. [60] In practice, it has been found that InChI performs worse than SMILES in ML-based applications, likely due to the above-mentioned reasons. [54]…”
Section: Modern Molecular String Representations (mentioning)
confidence: 99%
“…In this section, we discuss the challenges and prospects of extending SELFIES beyond organic chemistry. In contrast to organic molecules, [60] transition metal, lanthanide, actinide, and main-group metal compounds are difficult to handle with current digital molecular representations [28] due to special bonding situations and intricate 3D structures, combined with technical limitations that have evolved for historical reasons. Most problems trace back to (1) the assumption that bonding is localized and thus can be described with valence bond (VB) theory, (2) the non-explicit representation of terminal hydrogen atoms, which are added to the heavy (non-H) atoms based on rules derived from VB models in an approach called “implicit hydrogens,” and (3) the inability to describe stereochemistry that goes beyond the usual restrictions of organic chemistry, i.e., stereogenic carbon centers plus some cases of cis/trans isomerism in C=C double bonds and cumulenes.…”
Section: Beyond Organic Chemistry: Complicated Bonds (mentioning)
confidence: 99%
“…SciWalker integrates structure-based chemical ontologies with other ontologies resulting in improved associations and classifications of scientific content. Once identified, chemical entities are converted into SMILES and InChIKeys, which render the documents searchable by structure, substructure, or structural similarity.…”
Section: Searching Huge Libraries (mentioning)
confidence: 99%