2023
DOI: 10.1038/s41524-023-01055-y

Validating neural networks for spectroscopic classification on a universal synthetic dataset

Abstract: To aid the development of machine learning models for automated spectroscopic data classification, we created a universal synthetic dataset for the validation of their performance. The dataset mimics the characteristic appearance of experimental measurements from techniques such as X-ray diffraction, nuclear magnetic resonance, and Raman spectroscopy, among others. We applied eight neural network architectures to classify artificial spectra, evaluating their ability to handle common experimental artifacts. Whil…

Cited by 12 publications (5 citation statements)
References 28 publications
“…Furthermore, they are often restricted to the classification of existing compounds and can hardly be generalized to compounds outside their databases. Although some recent studies succeeded in symmetry classification in a large chemical space, it is hardly feasible to utilize these prior models for phase identification because the phases/structures of all of the inorganic compounds are much more numerous than their space groups. To solve these issues, it is required that the model be extensible according to the need of the task. The ResNet-based structure-type prediction protocol introduced in this work is developed based on the above idea, using a carefully designed parameter, reliability value, to tackle the challenge.…”
Section: Discussion
Type: mentioning
confidence: 99%
“…However, we conducted a recent study that revealed the lack of sensitivity with respect to identifying minor peaks in patterns for established network structures. [20] The detection of multi-phase peaks is of great importance for this work, necessitating modifications to the network architecture to improve performance in minor peak identification. Although single-phase structure peaks regularly occur in the training data, multiphase peaks are inserted at random positions, making them outliers from the expected results.…”
Section: Methods
Type: mentioning
confidence: 99%
“…Second, an appropriate network structure is required to handle the peculiarities of the diffraction patterns. For instance, we evaluated commonly used neural network structures for the analysis of XRD in a recent study and identified deficiencies in detecting minor peaks in the diffraction patterns, [20] which extend to the recognition of multi-phase samples in the material discovery data. Additionally, amorphous phases have not been considered in prior works, so modifications to the architecture of established networks are necessary to handle such components.…”
Section: Introduction
Type: mentioning
confidence: 99%
“…Synthetic data are also beginning to be used in the physical sciences: Aty et al [14] studied synthetic data as a pre-training task for determining experimental lipid phase behaviour from small-angle x-ray scattering patterns [14]; Anker et al [15] explored the use of synthetic data in interpreting inelastic neutron-scattering data [15]; Schuetzke et al [16] used synthetic data to mimic the characteristic appearance of experimental measurements for several spectroscopy methods [16].…”
Section: Related Work
Type: mentioning
confidence: 99%