Accelerated searches, made possible by machine learning techniques, are of growing interest in materials discovery. A suitable case involves the solution processing of components that ultimately form thin films of solar cell materials known as hybrid organic-inorganic perovskites (HOIPs). The number of molecular species that combine in solution to form these films constitutes an overwhelmingly large "compositional" space (at times, exceeding 500,000 possible combinations). Selecting a HOIP with desirable characteristics involves choosing different cations, halides, and solvent blends from a diverse palette of options. An unguided search by experimental investigations or molecular simulations is prohibitively expensive. In this work, we propose a Bayesian optimization method that uses an application-specific kernel to overcome challenges where data is scarce, and in which the search space is given by binary variables indicating whether a constituent is present or not. We demonstrate that the proposed approach identifies HOIPs with the targeted maximum intermolecular binding energy between HOIP salt and solvent at considerably lower cost than previous state-of-the-art Bayesian optimization methodology and at a fraction of the time (less than 10%) needed to complete an exhaustive search. We find an optimal composition within 15 ± 10 iterations in a HOIP compositional space containing 72 combinations, and within 31 ± 9 iterations when considering mixed halides (240 combinations). Exhaustive quantum mechanical simulations of all possible combinations were used to validate the optimal prediction from a Bayesian optimization approach. This paper demonstrates the potential of the Bayesian optimization methodology reported here for new materials discovery.
It is a long-standing problem to lower bound the performance of randomized greedy algorithms for maximum matching. Aronson, Dyer, Frieze and Suen [1] studied the modified randomized greedy (MRG) algorithm and proved that it approximates the maximum matching within a factor of at least 1/2 + 1/400,000. They use heavy combinatorial methods in their analysis. We introduce a new technique we call Contrast Analysis, and show a 1/2 + 1/256 performance lower bound for the MRG algorithm. The technique seems to be useful not only for the MRG, but also for other related algorithms.
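For readers unfamiliar with the algorithm, the following is a minimal Python sketch of MRG as it is usually described: repeatedly pick a uniformly random non-isolated vertex, match it to a uniformly random remaining neighbor, and delete both endpoints. The function name `modified_randomized_greedy`, the adjacency-dictionary input, and the assumption that vertex labels are sortable are illustrative choices, not taken from the abstract.

```python
import random


def modified_randomized_greedy(adj, rng=None):
    """Return a maximal matching of an undirected graph.

    adj: dict mapping each vertex to the set of its neighbors
         (hypothetical input format; vertex labels must be sortable).
    rng: optional random.Random instance, for reproducibility.
    """
    rng = rng or random.Random()
    # Work on a copy so the caller's graph is left untouched.
    live = {v: set(ns) for v, ns in adj.items()}
    matching = []
    while True:
        # Vertices that still have at least one unmatched neighbor.
        candidates = sorted(v for v, ns in live.items() if ns)
        if not candidates:
            break
        u = rng.choice(candidates)        # random non-isolated vertex
        v = rng.choice(sorted(live[u]))   # random neighbor of u
        matching.append((u, v))
        # Remove both endpoints and every edge incident to them.
        for x in (u, v):
            for y in live.pop(x):
                if y in live:
                    live[y].discard(x)
    return matching


if __name__ == "__main__":
    path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
    print(modified_randomized_greedy(path, random.Random(0)))
```

On the four-vertex path in the usage example, the result contains either one edge (the middle one) or two (the maximum matching), which is the kind of gap the approximation ratio quantifies.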
We study the activation process in undirected graphs known as bootstrap percolation: a vertex is active either if it belongs to a set of initially activated vertices or if at some point it had at least r active neighbors, for a threshold r that is identical for all vertices. A contagious set is a vertex set whose activation results in the entire graph being active. Let m(G, r) be the size of a smallest contagious set in a graph G on n vertices. We examine density conditions that ensure m(G, r) = r for all r ≥ 2. With respect to the minimum degree, we prove that such a smallest possible contagious set is guaranteed to exist if and only if G has minimum degree at least (k−1)/k · n. Moreover, we study the speed with which the activation spreads and provide tight upper bounds on the number of rounds it takes until all nodes are activated in such a graph. We also investigate what average degree guarantees the existence of small contagious sets. For n ≥ k ≥ r, we denote by M(n, k, r) the maximum number of edges in an n-vertex graph G satisfying m(G, r) > k. We determine the precise value of M(n, k, 2) and M(n, k, k), assuming that n is sufficiently large compared to k.
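The activation process defined above can be simulated directly. The sketch below runs it in synchronous rounds and reports both the final active set and the number of rounds; the names `activate` and `is_contagious` and the adjacency-dictionary input are assumptions made for illustration, not part of the abstract.

```python
def activate(adj, seeds, r):
    """Run bootstrap percolation with threshold r.

    adj:   dict mapping each vertex to the set of its neighbors
           (hypothetical input format).
    seeds: iterable of initially active vertices.
    Returns (final active set, number of rounds until no change).
    """
    active = set(seeds)
    rounds = 0
    while True:
        # Inactive vertices that currently see at least r active neighbors.
        newly = {
            v for v in adj
            if v not in active and len(adj[v] & active) >= r
        }
        if not newly:
            return active, rounds
        active |= newly
        rounds += 1


def is_contagious(adj, seeds, r):
    """True if activating `seeds` eventually activates every vertex."""
    active, _ = activate(adj, seeds, r)
    return set(adj) <= active


if __name__ == "__main__":
    # Complete graph on four vertices: any two seeds are contagious for r = 2.
    k4 = {v: {u for u in range(4) if u != v} for v in range(4)}
    print(activate(k4, {0, 1}, 2))
    print(is_contagious(k4, {0}, 2))
```

In the complete graph on four vertices, two seeds activate the remaining vertices in a single round, while a single seed activates nothing when r = 2, so m(G, 2) = 2 for this graph.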
Peptide sequence engineering can potentially deliver materials-selective binding capabilities, which would be highly attractive in numerous biotic and abiotic nanomaterials applications. However, the number of known materials-selective peptide sequences is small, and identification of new sequences is laborious and haphazard. Previous attempts have sought to use machine learning and other informatics approaches that rely on existing data sets to accelerate the discovery of materials-selective peptides, but too few materials-selective sequences are known to enable reliable prediction. Moreover, this knowledge base is expensive to expand. Here, we combine a comprehensive and integrated experimental and modeling effort and introduce a Bayesian Effective Search for Optimal Sequences (BESOS) approach to address this challenge. Through this combined approach, we significantly expand the data set of Au-selective peptide sequences and identify an additional Ag-selective peptide sequence. Analysis of the binding motifs for the Ag-binders offers a roadmap for future prediction with machine learning, which should guide identification of further Ag-selective sequences. These discoveries will enable wider and more versatile integration of Ag nanoparticles in biological platforms.
We develop a framework for warm-starting Bayesian optimization that reduces the time required to solve an optimization problem that is one in a sequence of related problems. This is useful when optimizing the output of a stochastic simulator that fails to provide derivative information, for which Bayesian optimization methods are well-suited. Sequences of related optimization problems arise when making several business decisions using one optimization model and input data collected over different time periods or markets. While many gradient-based methods can be warm-started by initiating optimization at the solution to the previous problem, this warm-start approach does not apply to Bayesian optimization methods, which carry a full metamodel of the objective function from iteration to iteration. Our approach builds a joint statistical model of the entire collection of related objective functions, and uses a value of information calculation to recommend points to evaluate.