With the goal of improving data-based materials design, it is shown that a sequential design-of-experiment scheme can combine the generation of data with learning from it, in order to discover the relevant sections of the parameter space. The application is the energy of grain boundaries as a function of their geometric degrees of freedom, calculated either from a simple model or via atomistic simulations. The challenge is to predict the deep cusps of the energy, which are located at irregular intervals of the geometric parameters. Existing sampling approaches rely either on large sets of datapoints or on a priori knowledge of the cusps' positions. By contrast, the authors' technique can find unknown cusps automatically with a minimal number of datapoints. The key element is a Kriging interpolator with a Matérn kernel to estimate the energy function. Using the jackknife variance, the next point in the sequential design is chosen as a compromise between sampling the region of largest fluctuations and avoiding a clustering of datapoints. In this way, the cusps of the energy can be found within only a few iterations and refined as desired. This approach will be advantageous for any application with strong, localized fluctuations in the values of the unknown function.
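The sampling loop described above can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: the toy energy function `toy_energy`, the Matérn smoothness of 5/2, the length scale, the nugget, and the selection score (normalised jackknife variance multiplied by normalised distance to the nearest existing datapoint) are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

def toy_energy(x):
    """Hypothetical stand-in for a grain boundary energy with one cusp at x = 0.37."""
    return 1.0 - 0.8 * np.exp(-np.abs(x - 0.37) / 0.02)

def matern52(a, b, length=0.1):
    """Matérn kernel; smoothness 5/2 and the length scale are assumed values."""
    s = np.sqrt(5.0) * np.abs(a[:, None] - b[None, :]) / length
    return (1.0 + s + s**2 / 3.0) * np.exp(-s)

def krige(x_train, y_train, x_query, nugget=1e-8):
    """Simple Kriging prediction around the sample mean of the training data."""
    mean = y_train.mean()
    K = matern52(x_train, x_train) + nugget * np.eye(len(x_train))
    weights = np.linalg.solve(K, y_train - mean)
    return mean + matern52(x_query, x_train) @ weights

def jackknife_variance(x_train, y_train, x_query):
    """Jackknife variance of the prediction, built from leave-one-out refits."""
    n = len(x_train)
    full = krige(x_train, y_train, x_query)
    pseudo = np.empty((n, len(x_query)))
    for i in range(n):
        keep = np.arange(n) != i
        loo = krige(x_train[keep], y_train[keep], x_query)
        pseudo[i] = n * full - (n - 1) * loo  # jackknife pseudo-values
    return pseudo.var(axis=0, ddof=1) / n

def next_point(x_train, y_train, candidates):
    """Pick the candidate balancing large variance against clustering (assumed score)."""
    var = jackknife_variance(x_train, y_train, candidates)
    dist = np.abs(candidates[:, None] - x_train[None, :]).min(axis=1)
    eps = 1e-300  # guards against division by zero for a perfectly flat model
    score = (var / (var.max() + eps)) * (dist / (dist.max() + eps))
    return candidates[np.argmax(score)]

def run_sequential_design(n_init=6, n_iter=10, n_candidates=201):
    """Start from a coarse regular design and add one datapoint per iteration."""
    x = np.linspace(0.0, 1.0, n_init)
    y = toy_energy(x)
    candidates = np.linspace(0.0, 1.0, n_candidates)
    for _ in range(n_iter):
        x_new = next_point(x, y, candidates)
        x = np.append(x, x_new)
        y = np.append(y, toy_energy(x_new))
    return x, y
```

Calling `run_sequential_design()` returns the initial design plus the sequentially added datapoints. Multiplying the variance by the nearest-neighbour distance is only one simple way to express the compromise between exploring large fluctuations and avoiding clustered datapoints; other weightings would fit the description equally well.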
Many materials processes and properties depend on the anisotropy of the energy of grain boundaries, i.e. on the fact that this energy is a function of the five geometric degrees of freedom (DOF) of the grain boundaries. To access this parameter space efficiently and discover energy cusps in unexplored regions, a method that combines atomistic simulations with statistical methods was recently established [1]. This sequential sampling technique is now extended in the spirit of an active learning algorithm by adding a criterion to decide when the sampling is advanced enough to stop. To this end, two parameters are introduced to analyse the sampling results on the fly: the number of cusps, which correspond to the most interesting and important regions of the energy landscape, and the maximum change of energy between two sequential iterations. Monitoring these two quantities provides valuable insight into how the subspaces are energetically structured. The combination of both parameters provides the information necessary to evaluate the sampling of the 2D subspaces of grain boundary plane inclinations, even for non-periodic, low-angle grain boundaries. With a reasonable number of datapoints in the initial design, only a few sequential iterations already have a substantial effect on the accuracy of the sampling, and the new algorithm outperforms regular high-throughput sampling.
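A stopping rule built on the two monitored quantities could look like the following sketch. It is an illustration under stated assumptions rather than the published criterion: the energy is evaluated on a fixed one-dimensional grid instead of a 2D subspace, a cusp is counted as a strict local minimum lying at least `min_depth` below the maximum of the profile, and the threshold `tol` and the window `patience` are hypothetical parameters.

```python
import numpy as np

def count_cusps(energy, min_depth=0.05):
    """Count strict local minima at least min_depth below the profile maximum."""
    inner = energy[1:-1]
    is_min = (inner < energy[:-2]) & (inner < energy[2:])
    is_deep = (energy.max() - inner) >= min_depth
    return int(np.count_nonzero(is_min & is_deep))

def max_energy_change(previous, current):
    """Largest absolute change of the predicted energy between two iterations."""
    return float(np.max(np.abs(current - previous)))

def should_stop(predictions, tol=1e-2, patience=2):
    """Decide whether to stop, given one predicted profile per iteration.

    Stops when, over the last `patience` iteration steps, the number of
    cusps has not changed and every maximum energy change is below `tol`.
    """
    if len(predictions) < patience + 1:
        return False
    recent = predictions[-(patience + 1):]
    cusp_counts = [count_cusps(p) for p in recent]
    changes = [max_energy_change(a, b) for a, b in zip(recent[:-1], recent[1:])]
    return len(set(cusp_counts)) == 1 and max(changes) < tol
```

In use, the predicted energy on the grid would be appended to `predictions` after every sequential iteration, and the loop would end once `should_stop(predictions)` returns `True`. Requiring both conditions reflects the idea that neither a stable cusp count nor a small energy change alone is taken as sufficient evidence of convergence.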