As semiconductor devices are miniaturized, the importance of atomic
layer deposition (ALD) technology is growing. When designing ALD precursors,
it is important to consider the melting point, because the precursors
should have melting points lower than the process temperature. However,
obtaining melting point data is challenging due to experimental sensitivity
and high computational costs. As a result, a comprehensive and well-organized
database of melting points for organometallic compounds (OMCs) has not yet
been reported. Therefore, in this study, we constructed a database of melting
points for 1,845 OMCs, including 58 metal and 6 metalloid elements.
The database contains CAS numbers, molecular formulas, and structural
information and was constructed through automatic extraction and systematic
curation. The melting point information was obtained from two sources:
1) 11 chemical vendor databases, which provided 1,434 materials, and
2) 2,096 scientific papers published over the past 29 years, from which
411 materials were identified using natural language processing (NLP)
techniques with an accuracy of 86.3%. In our database, the OMCs contain up to around
250 atoms and have melting points that range from −170 to
1610 °C. The main source is the Chemsrc database, accounting
for 607 materials (32.9%), and Fe is the most common central metal
or metalloid element (15.0%), followed by Si (11.6%) and B (6.7%).
To demonstrate the utility of the constructed database, we developed a multimodal
neural network model that integrates graph-based and feature-based
information as descriptors to predict the melting points of the OMCs;
the model showed moderate performance. We believe the current approach reduces
the time and cost associated with manual data collection and
processing, contributing to effective screening of potentially promising
ALD precursors and providing crucial information for the advancement
of the semiconductor industry.