Abstract: Bayesian optimisation (BO) uses probabilistic surrogate models, usually Gaussian processes (GPs), for the optimisation of expensive black-box functions. At each BO iteration, the GP hyperparameters are fit to previously-evaluated data by maximising the marginal likelihood. However, this fails to account for uncertainty in the hyperparameters themselves, leading to overconfident model predictions. This uncertainty can be accounted for by taking the Bayesian approach of marginalising out the model hyperparameters.…
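To make the contrast in the abstract concrete, here is a minimal sketch, not the authors' implementation, of the two treatments of a single GP hyperparameter: a point estimate chosen by maximising the log marginal likelihood versus marginalisation, approximated by weighting a grid of lengthscales by that likelihood. The RBF kernel, noise level, grid, and toy data are all illustrative assumptions.

```python
# Sketch: point-estimate vs marginalised GP hyperparameters (illustrative only).
import numpy as np

def rbf(X1, X2, ls):
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return np.exp(-0.5 * d2 / ls ** 2)

def log_marginal_likelihood(X, y, ls, noise=1e-2):
    K = rbf(X, X, ls) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha - np.log(np.diag(L)).sum()
            - 0.5 * len(X) * np.log(2 * np.pi))

def predict(X, y, Xs, ls, noise=1e-2):
    K = rbf(X, X, ls) + noise * np.eye(len(X))
    Ks = rbf(Xs, X, ls)
    mu = Ks @ np.linalg.solve(K, y)
    var = 1.0 - np.einsum('ij,ji->i', Ks, np.linalg.solve(K, Ks.T))
    return mu, var

rng = np.random.default_rng(0)
X = rng.uniform(0, 5, 8)
y = np.sin(X) + 0.1 * rng.standard_normal(8)
Xs = np.linspace(0, 5, 100)

lss = np.linspace(0.1, 2.0, 40)               # grid over the lengthscale
logp = np.array([log_marginal_likelihood(X, y, l) for l in lss])
w = np.exp(logp - logp.max()); w /= w.sum()   # posterior weights (flat prior assumed)

# Point estimate: use only the single best lengthscale.
mu_mle, var_mle = predict(X, y, Xs, lss[np.argmax(logp)])

# Marginalised: moment-match the mixture of per-lengthscale predictives.
mus, vars_ = zip(*(predict(X, y, Xs, l) for l in lss))
mu_marg = np.einsum('i,ij->j', w, np.array(mus))
var_marg = np.einsum('i,ij->j', w, np.array(vars_) + np.array(mus) ** 2) - mu_marg ** 2
print(var_marg.mean() >= var_mle.mean())      # marginalising typically widens uncertainty
```

Moment-matching the mixture of per-lengthscale predictives typically yields wider predictive variances than the single best fit; that gap is exactly the overconfidence the abstract refers to.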
“…Bar heights correspond to the proportion of times each model and scalarisation combination was the best on each test problem. As can be seen from the figure, MBORE with XGB and our novel PHC scalariser (15) has the best overall performance for both benchmarks. Surprisingly, MBORE with the MLP classification method performs worse than the best performing method on all the 63 WFG test problems.…”
Section: Synthetic Benchmarks (mentioning)
confidence: 87%
“…In this work we use a Matérn 5/2 kernel, as recommended for modelling realistic functions [69,73]. The kernel's hyperparameters 𝜽 are learnt via maximising the log marginal likelihood [15,58].…”
Section: Bayesian Optimisation (mentioning)
confidence: 99%
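As a concrete reading of the snippet above, here is a minimal sketch of fitting a GP surrogate with a Matérn 5/2 kernel by maximising the log marginal likelihood. scikit-learn is an assumed stand-in, since the snippet does not prescribe a library, and the toy data are illustrative.

```python
# Sketch: GP with a Matérn 5/2 kernel, hyperparameters set by maximising
# the log marginal likelihood (scikit-learn assumed; illustrative data).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, Matern

rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(20, 2))
y = np.sin(X[:, 0]) * np.cos(X[:, 1])

kernel = ConstantKernel(1.0) * Matern(length_scale=1.0, nu=2.5)   # nu=2.5 -> Matérn 5/2
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True,
                              n_restarts_optimizer=5)  # restarts guard against local optima
gp.fit(X, y)   # fit() maximises the log marginal likelihood over the kernel hyperparameters

print(gp.kernel_)                                      # the learnt hyperparameters
print(gp.log_marginal_likelihood(gp.kernel_.theta))    # the objective at the optimum
```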
“…𝑑 ∈ {20, 50, 100}, version of the WFG benchmark is also evaluated to assess optimisation performance in more challenging scenarios. Experiments are repeated for the scalarisation methods discussed in Section 2.2.1: augmented Tchebycheff (AT) [44], hypervolume improvement (HYPI) [57], dominance ranking (DomRank) [57], and our novel scalariser, Pareto hypervolume contribution (PHC) (15).…”
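Of the scalarisers listed, augmented Tchebycheff has a particularly compact standard form. The sketch below implements that textbook form; the weight vector, ideal point, and augmentation coefficient rho are assumed values, and this is not necessarily the exact variant of [44] used in the paper.

```python
# Sketch: augmented Tchebycheff (AT) scalarisation in its standard form.
import numpy as np

def augmented_tchebycheff(F, weights, ideal, rho=0.05):
    """Scalarise a (n_points, n_objectives) array of objective values.

    F       -- objective values, assumed normalised to comparable scales
    weights -- convex weight vector over the objectives
    ideal   -- ideal (utopian) point z*, e.g. the per-objective minima
    rho     -- small positive constant adding a linear term, which breaks
               ties between weakly Pareto-optimal points
    """
    diff = weights * np.abs(F - ideal)
    return diff.max(axis=1) + rho * diff.sum(axis=1)

F = np.array([[0.2, 0.9], [0.5, 0.5], [0.9, 0.1]])   # two objectives, three points
w = np.array([0.5, 0.5])                             # assumed weights
z = F.min(axis=0)                                    # empirical ideal point
print(augmented_tchebycheff(F, w, z))                # one scalar value per point
```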
Optimisation problems often have multiple conflicting objectives that can be computationally and/or financially expensive. Mono-surrogate Bayesian optimisation (BO) is a popular model-based approach for optimising such black-box functions. It combines objective values via scalarisation and builds a Gaussian process (GP) surrogate of the scalarised values. The location which maximises a cheap-to-query acquisition function is chosen as the next location to expensively evaluate. While BO is an effective strategy, the use of GPs is limiting. Their performance decreases as the problem input dimensionality increases, and their computational complexity scales cubically with the amount of data. To address these limitations, we extend previous work on BO by density-ratio estimation (BORE) to the multi-objective setting. BORE links the computation of the probability of improvement acquisition function to that of probabilistic classification. This enables the use of state-of-the-art classifiers in a BO-like framework. In this work we present MBORE: multi-objective Bayesian optimisation by density-ratio estimation, and compare it to BO across a range of synthetic and real-world benchmarks. We find that MBORE performs as well as or better than BO on a wide variety of problems, and that it outperforms BO on high-dimensional and real-world problems.
CCS CONCEPTS: • Computing methodologies → Modeling methodologies; • Theory of computation → Gaussian processes; • Applied computing → Multi-criterion optimization and decision-making.
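The link between the probability of improvement and probabilistic classification that the abstract describes can be sketched in a few lines. The following is a minimal illustration of a BORE-style step: scikit-learn's GradientBoostingClassifier stands in for the XGBoost model referenced elsewhere on this page, and gamma = 1/3, the toy objective, and the random candidate search are all assumptions.

```python
# Sketch: a BORE-style iteration, where a classifier trained to separate the
# best gamma-fraction of (scalarised) observations from the rest plays the
# role of the probability of improvement acquisition function.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, size=(60, 5))   # previously evaluated locations
g = (X ** 2).sum(axis=1)               # scalarised objective values (minimised)

gamma = 1 / 3
tau = np.quantile(g, gamma)            # improvement threshold at the gamma-quantile
z = (g <= tau).astype(int)             # class 1: the "good" points below tau

clf = GradientBoostingClassifier().fit(X, z)

# The classifier's class-1 probability is maximised (here crudely, over
# random candidates) to choose the next point to evaluate expensively.
cand = rng.uniform(-1, 1, size=(2048, 5))
x_next = cand[np.argmax(clf.predict_proba(cand)[:, 1])]
print(x_next)
```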
“… [136] Additionally, slice sampling [137], adaptive importance sampling [138], and entropy-based methods [139] have been used within Bayesian optimisation, where there is interest in moving beyond single-point estimates to fully-Bayesian approaches [140,141]. The main drawback of these methods is their high computational cost; ideally, they should only be employed when doing so will provide a substantial advantage over single-point estimates. However, this condition cannot be known a priori.…”
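Of the samplers mentioned in the snippet, univariate slice sampling is the simplest to sketch. The following is a minimal, illustrative implementation targeting a stand-in unnormalised posterior over a single hyperparameter; it is not the cited papers' code, and the target density is assumed purely for demonstration.

```python
# Sketch: univariate slice sampling (stepping-out variant) over a single
# GP hyperparameter, e.g. a log-lengthscale; predictions would then be
# averaged over the resulting draws rather than using one point estimate.
import numpy as np

def slice_sample(logp, x0, n_samples, width=1.0, seed=3):
    rng = np.random.default_rng(seed)
    samples, x = [], x0
    for _ in range(n_samples):
        log_y = logp(x) + np.log(rng.uniform())   # auxiliary level under the density
        left = x - width * rng.uniform()          # randomly placed initial bracket
        right = left + width
        while logp(left) > log_y:                 # step out until outside the slice
            left -= width
        while logp(right) > log_y:
            right += width
        while True:                               # shrink the bracket until acceptance
            x_new = rng.uniform(left, right)
            if logp(x_new) > log_y:
                x = x_new
                break
            if x_new < x:
                left = x_new
            else:
                right = x_new
        samples.append(x)
    return np.array(samples)

# Stand-in unnormalised posterior over the hyperparameter: a skewed double bump.
logp = lambda t: np.log(0.7 * np.exp(-0.5 * (t + 1) ** 2)
                        + 0.3 * np.exp(-0.5 * ((t - 2) / 0.5) ** 2) + 1e-300)
draws = slice_sample(logp, x0=0.0, n_samples=500)
print(draws.mean(), draws.std())
```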
In this work, we outline how methods from the energy landscapes field of theoretical chemistry can be applied to study machine learning models. Various applications are found, ranging from interpretability to improved model performance.
Gaussian processes (GPs) serve as powerful surrogate models in optimisation by providing a flexible, data-driven framework for representing complex fitness landscapes. We provide an analysis of realisations drawn from GP models of fitness landscapes, which represent alternative coherent fits to the data, and use a network-based approach to investigate their induced landscape consistency. We consider the variation of constructed local optima networks (LONs, which provide a condensed representation of landscapes), analyse the fitness landscapes of GP realisations, and delve into the uncertainty associated with graph metrics of LONs. Our findings contribute to the understanding and practical application of GPs in optimisation and landscape analysis; in particular, landscape consistency between GP realisations can vary considerably depending on the model fit and the underlying landscape complexity of the optimisation problem.
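The first step of the analysis described above, drawing alternative realisations from a fitted GP and summarising each induced landscape, can be sketched as follows. The kernel choice, toy data, and the crude local-minima count used in place of a full LON construction are all assumptions, not the paper's pipeline.

```python
# Sketch: sample posterior realisations from a fitted GP model of a
# 1-D fitness landscape and compare a simple landscape summary across them.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(4)
X = rng.uniform(0, 1, size=(15, 1))
y = np.sin(6 * X[:, 0]) + 0.05 * rng.standard_normal(15)

gp = GaussianProcessRegressor(Matern(nu=2.5), normalize_y=True).fit(X, y)
Xgrid = np.linspace(0, 1, 200)[:, None]
realisations = gp.sample_y(Xgrid, n_samples=5, random_state=0)   # shape (200, 5)

# Crude per-realisation summary: count the local minima of each sampled
# surface; disagreement across realisations signals low landscape consistency.
interior = realisations[1:-1]
is_min = (interior < realisations[:-2]) & (interior < realisations[2:])
print(is_min.sum(axis=0))   # number of local minima in each realisation
```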