---
id: overview
title: Overview
---
This overview describes the basic components of BoTorch and how they work together. For a high-level view of what BoTorch tries to achieve in more abstract terms, please see the Introduction.
At a high level, the problem underlying Bayesian Optimization (BayesOpt) is to
maximize some expensive-to-evaluate black box function $f$. Since $f$ is a
black box, we cannot exploit its structure; we can only evaluate it (possibly
with noise) at a sequence of test points. Bayesian Optimization is a general
approach to adaptively select these test points (or batches of test points to
be evaluated in parallel) that allows for a principled trade-off between
evaluating $f$ in regions of good observed performance and in regions of high
uncertainty (exploitation vs. exploration).
In order to optimize $f$ within the BayesOpt framework, a probabilistic
surrogate model is fit to the observations of $f$ collected so far. The
surrogate model for $f$ is traditionally a Gaussian Process (GP), whose
posterior over function values is a multivariate normal distribution.
BoTorch provides first-class support for GPyTorch, a package for scalable GPs and Bayesian deep learning implemented in PyTorch.
While GPs have been a very successful modeling approach, BoTorch's support for
MC-sampling based acquisition functions makes it straightforward to also use
other model types. In particular, BoTorch makes no particular assumptions
about what kind of model is being used, so long as it is able to produce
samples from a posterior over outputs given an input $x$.
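As a minimal sketch of fitting a GP surrogate in recent BoTorch versions (the toy training data here is purely illustrative, standing in for real evaluations of $f$):

```python
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from gpytorch.mlls import ExactMarginalLogLikelihood

# Toy observations standing in for real evaluations of the black box f.
train_X = torch.rand(10, 2, dtype=torch.double)
train_Y = (train_X.sum(dim=-1, keepdim=True) - 1.0).sin()

# Fit a single-output GP surrogate by maximizing the marginal log likelihood.
model = SingleTaskGP(train_X, train_Y)
mll = ExactMarginalLogLikelihood(model.likelihood, model)
fit_gpytorch_mll(mll)
```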
Posteriors represent the "belief" a model has about the function values at a point (or set of points), based on the data it has been trained with. That is, the posterior is the distribution over the outputs conditional on the data observed so far.
When using a GP model, the posterior is given explicitly as a multivariate Gaussian (fully parameterized by its mean and covariance matrix). In other cases, the posterior may be implicit in the model and not easily described by a small set of parameters.
BoTorch abstracts away from the particular form of the posterior by providing a
simple Posterior
API that only requires implementing an rsample()
method for
sampling from the posterior. For more details, please see
Posteriors.
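Continuing the sketch above, querying the posterior of the fitted model and sampling from it via `rsample()` might look as follows (shapes in the comments assume the single-output GP from before):

```python
# Query the posterior at a set of test points and sample from it.
test_X = torch.rand(5, 2, dtype=torch.double)
posterior = model.posterior(test_X)

# rsample() draws (differentiable) samples; here 64 draws over 5 points.
samples = posterior.rsample(torch.Size([64]))  # shape: 64 x 5 x 1
```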
Acquisition functions are heuristics employed to evaluate the usefulness of one or more design points for achieving the objective of maximizing the underlying black box function.
Some of these acquisition functions have closed-form solutions under Gaussian posteriors, but many of them (especially when assessing the joint value of multiple points in parallel) do not. In the latter case, one can resort to using Monte-Carlo (MC) sampling in order to approximate the acquisition function.
BoTorch supports both analytic as well as (quasi-) Monte-Carlo based acquisition
functions. It provides an AcquisitionFunction
API that abstracts away from the
particular type, so that optimization can be performed on the same objects.
Please see Acquisition Functions for additional information.
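For instance, a sketch of constructing and optimizing analytic Expected Improvement, reusing `model` and `train_Y` from the earlier sketch (the unit-square bounds and the optimizer settings are arbitrary choices for illustration):

```python
from botorch.acquisition.analytic import ExpectedImprovement
from botorch.optim import optimize_acqf

# Analytic EI evaluates a single candidate point (q = 1) at a time.
EI = ExpectedImprovement(model, best_f=train_Y.max())

# Optimize the acquisition function over the unit square to get a candidate.
bounds = torch.stack([torch.zeros(2), torch.ones(2)]).double()
candidate, acq_value = optimize_acqf(
    EI, bounds=bounds, q=1, num_restarts=5, raw_samples=32
)
```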
The idea behind using Monte-Carlo sampling for evaluating acquisition functions is simple: instead of computing an (intractable) expectation over the posterior, we sample from the posterior and use the sample average as an approximation.
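For example, Monte-Carlo Expected Improvement for a batch $X$ of $q$ candidate points is approximated as

$$ \text{qEI}(X) \approx \frac{1}{N} \sum_{i=1}^{N} \max_{j=1,\dots,q} \max\bigl(\xi_{ij} - f^*,\, 0\bigr), \qquad \xi_i \sim \mathbb{P}\bigl(f(X) \mid \mathcal{D}\bigr), $$

where $f^*$ is the best function value observed so far and each $\xi_i$ is a sample from the posterior at $X$.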
To give additional flexibility in the case of MC-based acquisition functions,
BoTorch provides the option of transforming the output(s) of the model through
an Objective
module, which returns a one-dimensional output that is passed to
the acquisition function. The MCAcquisitionFunction
class defaults its
objective to IdentityMCObjective
, which simply returns the last dimension of
the model output. Thus, for the standard use case of a single-output GP that
directly models the black box function $f$, no custom objective is needed. For
more details on how to use a custom Objective, see
Objectives.
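As a sketch of a custom objective, `GenericMCObjective` wraps a callable that maps posterior samples to a one-dimensional output. The two-output model and the weights below are purely illustrative assumptions, and the callable signature shown is the one used in recent BoTorch versions:

```python
from botorch.acquisition.objective import GenericMCObjective

# Illustrative only: score a design by a weighted sum of two model outputs.
# `samples` has shape sample_shape x batch_shape x q x num_outputs.
objective = GenericMCObjective(
    lambda samples, X=None: 0.75 * samples[..., 0] + 0.25 * samples[..., 1]
)
```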
The re-parameterization trick (see e.g. 1, 2) can be used to write the
posterior distribution as a deterministic transformation of an auxiliary
random variable $\epsilon$. For example, a sample $\xi$ from a multivariate
normal posterior $\mathcal{N}(\mu, \Sigma)$ can be written as
$\xi = \mu + L\epsilon$, where $\epsilon \sim \mathcal{N}(0, I)$ is a "base
sample" and $L$ is a root decomposition of $\Sigma$ (e.g. its Cholesky
factor).
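A minimal sketch of this construction in plain PyTorch (the mean and covariance below are arbitrary placeholders):

```python
import torch

# Re-parameterized sampling from N(mu, Sigma): xi = mu + L @ eps.
mu = torch.tensor([0.5, -0.2, 1.0])
Sigma = torch.tensor([[1.0, 0.2, 0.0],
                      [0.2, 1.0, 0.3],
                      [0.0, 0.3, 1.0]])
L = torch.linalg.cholesky(Sigma)  # root decomposition of Sigma

eps = torch.randn(3)  # base sample, eps ~ N(0, I)
xi = mu + L @ eps     # distributed as N(mu, Sigma)
```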
In BoTorch, base samples are constructed using an MCSampler
object, which
provides an interface that allows for different sampling techniques.
IIDNormalSampler
utilizes independent standard normal draws, while
SobolQMCNormalSampler
uses quasi-random, low-discrepancy "Sobol" sequences as
uniform samples which are then transformed to construct quasi-normal samples.
Sobol sequences are more evenly distributed than i.i.d. uniform samples and tend
to improve the convergence rate of MC estimates of integrals/expectations.
We find that Sobol sequences substantially improve the performance of MC-based
acquisition functions, and so SobolQMCNormalSampler
is used by default.
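A sketch of constructing a qMC sampler and passing it to an MC acquisition function, reusing `model` and `train_Y` from the earlier sketch (the sample count of 128 is an arbitrary choice, and the constructor shown is the one used in recent BoTorch versions):

```python
import torch
from botorch.acquisition.monte_carlo import qExpectedImprovement
from botorch.sampling.normal import SobolQMCNormalSampler

# 128 quasi-MC base samples, drawn once and reused across evaluations.
sampler = SobolQMCNormalSampler(sample_shape=torch.Size([128]))
qEI = qExpectedImprovement(model, best_f=train_Y.max(), sampler=sampler)
```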
For more details, see Monte-Carlo Samplers.