This is the author manuscript accepted for publication and has undergone full peer review but has not been through the copyediting, typesetting, pagination and proofreading process, which may lead to differences between this version and the Version of Record. Please cite this article as
There is intense interest in uncovering design rules that govern the formation of various structural phases as a function of chemical composition in multi-principal element alloys (MPEAs). In this paper, we develop a machine learning (ML) approach built on the foundations of ensemble learning, post hoc interpretability of black-box models, and clustering analysis to establish a quantitative relationship between the chemical composition and experimentally observed phases of MPEAs. The originality of our work stems from performing instance-level (or local) variable attribution analysis of ML predictions based on the breakdown method, and then identifying similar instances based on k-means clustering analysis of the breakdown results. We also complement the breakdown analysis with Ceteris Paribus profiles that show how the model response changes as a function of a single variable when the values of all other variables are held fixed. Results from local model interpretability analysis uncover key insights into the variables that govern the formation of each phase. Our approach is generic, model-agnostic, and valuable for explaining the insights learned by black-box models. An interactive web application is developed to facilitate model sharing and accelerate the design of MPEAs with targeted properties.
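As a minimal sketch of the Ceteris Paribus idea described above, the following Python snippet varies one variable over a grid while holding the remaining variables of a single instance fixed. The function name `ceteris_paribus` and the stand-in `toy_model` are hypothetical and chosen only for illustration; they are not the authors' model or code.

```python
def ceteris_paribus(predict, instance, feature, grid):
    """Return (value, prediction) pairs for one feature varied over `grid`.

    All other entries of `instance` are kept at their original values.
    """
    profile = []
    for value in grid:
        x = list(instance)      # copy so the original instance is untouched
        x[feature] = value      # change only the feature of interest
        profile.append((value, predict(x)))
    return profile


def toy_model(x):
    # Hypothetical stand-in for a trained black-box model.
    return 2.0 * x[0] + x[1] ** 2


# Profile of feature 0 for the instance [1.0, 3.0]; feature 1 stays at 3.0.
profile = ceteris_paribus(toy_model, [1.0, 3.0], feature=0, grid=[0.0, 1.0, 2.0])
```

Plotting the prediction against the grid values for several instances of the same cluster would give profile curves of the kind the abstract refers to.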
Composition dependence of second harmonic generation, refractive index, extinction coefficient, and optical bandgap in 20 nm thick crystalline Hf1−xZrxO2 (0 ≤ x ≤ 1) thin films is reported. The refractive index generally increases with increasing ZrO2 content, with all values within the range of 1.98–2.14 at wavelengths from 880 nm to 400 nm. A composition dependence of the indirect optical bandgap is observed, decreasing from 5.81 eV for HfO2 to 5.17 eV for Hf0.4Zr0.6O2. The bandgap increases for compositions with x > 0.6, reaching 5.31 eV for Hf0.1Zr0.9O2. Second harmonic signals are measured for 880 nm incident light. The magnitude of the second harmonic signal scales with the magnitude of the remanent polarization in the composition series. Film compositions that display near-zero remanent polarization exhibit minimal second harmonic generation, while those with maximum remanent polarization also display the largest second harmonic signal. The results are discussed in the context of ferroelectric phase assemblage in the hafnium zirconium oxide films and demonstrate a path toward a silicon-compatible integrated nonlinear optical material.
We demonstrate the capabilities of two model-agnostic local post hoc model interpretability methods, namely breakDown (BD) and Shapley (SHAP), to explain the predictions of a black-box classification learning model that establishes a quantitative relationship between chemical composition and multi-principal element alloy (MPEA) phase formation. We trained an ensemble of support vector machines using a dataset with 1,821 instances, 12 features with low pair-wise correlation, and seven phase labels. Feature contributions to the model prediction are computed by BD and SHAP for each composition. The resulting BD- and SHAP-transformed data are then used as inputs to identify similar composition groups using k-means clustering. Explanation-of-clusters by features reveals that the results from SHAP agree more closely with the literature. Visualization of compositions within a cluster using Ceteris-Paribus (CP) profile plots shows the functional dependencies between the feature values and the predicted response. Despite the differences between BD and SHAP in variable attribution, only minor changes were observed in the CP profile plots. Explanation-of-clusters by examples shows that the clusters that share a common phase label contain similar compositions, which explains the similar-looking CP profile trends. Two plausible reasons are identified to describe this observation: (1) In the limit of a dataset with independent and non-interacting features, BD and SHAP show promise in recognizing MPEA composition clusters with similar phase labels. (2) There is more than one explanation for the MPEA phase formation rules with respect to the set of features considered in this work.
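The attribution-then-clustering workflow can be illustrated with a small, self-contained Python sketch. It is a simplified illustration under stated assumptions, not the authors' implementation: `break_down` uses a caller-supplied feature ordering (the published break-down method chooses the ordering with a heuristic), `shapley` averages the break-down contributions over all orderings (feasible only for a handful of features), `kmeans` uses a naive initialization from the first k points, and `toy_model` and the background data are hypothetical.

```python
from itertools import permutations
from statistics import mean


def break_down(predict, background, instance, order):
    """Sequential attribution: fix features at the instance values, one by one."""
    rows = [list(r) for r in background]
    prev = mean(predict(r) for r in rows)    # baseline: average prediction
    contrib = [0.0] * len(instance)
    for j in order:
        for r in rows:
            r[j] = instance[j]               # condition on feature j
        cur = mean(predict(r) for r in rows)
        contrib[j] = cur - prev              # change attributed to feature j
        prev = cur
    return contrib


def shapley(predict, background, instance):
    """Average the break-down contributions over every feature ordering."""
    orders = list(permutations(range(len(instance))))
    totals = [0.0] * len(instance)
    for order in orders:
        for j, c in enumerate(break_down(predict, background, instance, order)):
            totals[j] += c
    return [t / len(orders) for t in totals]


def kmeans(points, k, n_iter=20):
    """Plain Lloyd iterations; centers start at the first k points (naive)."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:                      # keep old center if cluster is empty
                centers[c] = [mean(col) for col in zip(*members)]
    return labels


def toy_model(x):
    # Hypothetical model with an interaction between features 0 and 1.
    return x[0] * x[1] + x[2]


background = [[0, 0, 0], [1, 1, 1], [2, 0, 1], [0, 2, 1]]
instances = [[2, 3, 1], [2, 3, 0], [0, 0, 1], [0, 1, 1]]

# Attribution vectors per instance, then clusters of similar explanations.
attributions = [shapley(toy_model, background, x) for x in instances]
cluster_labels = kmeans(attributions, k=2)
```

Because the toy model contains an interaction term, `break_down` returns different contributions for different orderings, whereas the averaged values do not depend on the ordering. In both cases the contributions sum to the difference between the instance prediction and the baseline.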