MultiTaskGP observation_noise question #2079
Hi there, quick question about the `observation_noise` argument for `MultiTaskGP`.

More concretely, consider the following. First we fit the model:

```python
import numpy as np
import torch
import gpytorch

from botorch.fit import fit_gpytorch_mll
from botorch.models import MultiTaskGP
from gpytorch.mlls import ExactMarginalLogLikelihood

# X contains the task index in its last column; y holds the noisy observations
sigma = 0.5
model = MultiTaskGP(
    torch.tensor(X),
    torch.tensor(y.reshape(-1, 1)),
    train_Yvar=torch.tensor(np.ones_like(y).reshape(-1, 1)) * sigma**2,
    covar_module=gpytorch.kernels.ScaleKernel(gpytorch.kernels.MaternKernel()),
    task_feature=-1,
)
mll = ExactMarginalLogLikelihood(likelihood=model.likelihood, model=model)
model.train()
mll.train()
fit_gpytorch_mll(mll)
```

Next, we can evaluate and use Bayesian optimization to find the next point (consider `beta` large to just be pure exploration for the sake of argument):

```python
from botorch.acquisition import UpperConfidenceBound
from botorch.acquisition.objective import ScalarizedPosteriorTransform
from botorch.optim import optimize_acqf

model.eval()
mll.eval()

# We only consider the high-fidelity model, hence weights [0, 1] in the transform
weights = torch.tensor([0.0, 1.0])
transform = ScalarizedPosteriorTransform(weights=weights)
ucb = UpperConfidenceBound(model, beta=10000, posterior_transform=transform)
next_point, value = optimize_acqf(
    ucb,
    bounds=torch.tensor([[0.0], [1.0]]),
    q=1,
    num_restarts=10,
    raw_samples=10,
)
```

Is this procedure still sensible even though the observation noise is not included in the posterior? Thanks!
Sorry for the delayed response here.

The difference is that if `observation_noise=False` this produces the posterior over the latent function values `f` (what we're trying to estimate), and if `observation_noise=True` this produces the posterior predictive over the observations `Y` (which under the modeling assumption are `f + eps` with `eps` i.i.d. normal draws).

It shouldn't really matter, at least for most intents and purposes. Where this difference becomes meaningful is in settings where we need to reason about the distribution of observed outcomes, typically something that is done when "fantasizing" in the context of look-ahead acquisition functions such as the knowledge gradient. Hope that clears things up!