2022
DOI: 10.1002/wcms.1603

A review of molecular representation in the age of machine learning

Abstract: Research in chemistry increasingly requires interdisciplinary work prompted by, among other things, advances in computing, machine learning, and artificial intelligence. Everyone working with molecules, whether chemist or not, needs an understanding of the representation of molecules in a machine-readable format, as this is central to computational chemistry. Four classes of representations are introduced: string, connection table, feature-based, and computer-learned representations. Three of the most significa…

Cited by 152 publications (93 citation statements)
References 93 publications
“…In the field of molecular discovery based on machine learning, various methods of molecular fingerprint appropriately exhibiting structural and chemical properties of the given molecular structure have been recently proposed. [47, 54–57] As a fingerprint corresponding to the charge dynamics of OLEDs, we selected the modulus spectra of OLEDs depending on the DC voltage and AC frequency. In the case of conventional current density-voltage property, the information of the transit time of each organic layer, charge retardation, and charge accumulation were not separately extracted since these mechanisms commonly result in the change of resistance in devices.…”
Section: Results (mentioning)
confidence: 99%
“…Representing chemical compounds in a machine-readable format is considered a challenge in chemoinformatics due to its effect on different surrogate models and, thus, their predictive performance. 39 Previous literature has led to ambiguous outcomes on whether the addition of chemical information within low data regimes, such as the initialization of active-ML search strategies, is beneficial. 31,40 To learn more about featurization effects on this specific application, we compared two input feature sets.…”
Section: Featurization Effects (mentioning)
confidence: 99%
“…[70] Structural representation is the most relevant feature in basically any chemoinformatics application, and computational study [73] and it is an area under constant research. [74] In virtual screening, defining the chemical space to be explored is crucial, as it defines the applicability domain that will be searched. In practice, it is common to conduct virtual screening campaigns focused on regions of the medicinally relevant chemical space.…”
Section: Chemical Space and Chemical Multiverse (mentioning)
confidence: 99%