In materials informatics, the representation of material structures is fundamental to obtaining good prediction results. Molecular crystals can be described by either molecular or crystal representations, but no study has examined which representation is most effective for the materials informatics of molecular crystals. In this work, different representations of molecular crystals were compared on an example task, band gap prediction. We demonstrated that the predictive ability of the molecular graph exceeded that of molecular fingerprints and crystal graphs. This result motivated a screening of large-scale molecular data from PubChem, and the inference suggested candidate molecules for organic semiconductors in photovoltaics and luminescence. The novelty of this work lies in the comparison of representations for molecular crystals and in the finding that the molecular graph works better even for property prediction of crystalline materials. This finding will enable machine-learning-aided screening and design of functional molecular crystals.
In materials informatics, the representation of the material structure is essential to obtaining good prediction results, and graph representations have attracted much attention in recent years. Molecular crystals can be represented as graphs in either a molecular or a crystal representation, but which representation is more effective has not been examined. In this study, we compared the prediction accuracy of molecular and crystal graphs for band gap prediction. The results showed that prediction accuracy was better with crystal graphs than with molecular graphs. While this result is not surprising, error analysis showed quantitatively that the crystal graph reduces the error by a factor of 0.4, with medium correlation, compared with the molecular graph. The novelty of this study lies in the comparison of molecular crystal representations and in the quantitative evaluation of the contribution of intermolecular interactions to the band gap.