Multioutput proposal #575
Conversation
Thanks for the detail here @mathDR - give me a few days to review and provide comments.
Great @thomaspinder. I am also working on an MVP for the multi-latent prior. When I get that up I will post it here. @daniel-dodd any thoughts/comments you want to make?
Some questions that are arising as I am building this out:
- if we have a single kernel (i.e. ...)
- when we have a full ...
- for input data: ...

We should allow for all of these in ...
Thanks for putting this together @mathDR. Some unstructured thoughts.

```python
mixing = Real(jnp.eye(num_outputs, rank))
latent = [gpx.kernels.RBF(lengthscale=...) for _ in range(rank)]
kernel = gpx.kernels.MultiOutputKernel(
    latent_kernels=latent,
    mixing=mixing,
    num_outputs=num_outputs,
)
meanf = gpx.mean_functions.Zero(output_dims=num_outputs)
prior = gpx.gps.MultiOutputPrior(mean_function=meanf, kernel=kernel)
```

What do you think? I think this allows us to repurpose a lot of your code whilst keeping things very close to the maths.
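To make the link to the maths concrete, here is a minimal sketch of what such a kernel could compute. The function name is illustrative only, and each latent kernel is treated as a plain callable rather than a GPJax kernel object:

```python
import jax.numpy as jnp


def multioutput_cross_cov(latent_kernels, mixing, x, x_prime):
    """Illustrative only: cross-covariance between the Q outputs at a pair of inputs.

    Because the latent GPs are independent, Cov[g(x), g(x')] is diagonal, and
    Cov[f(x), f(x')] = W diag(k_1(x, x'), ..., k_L(x, x')) W^T.
    """
    latent_cov = jnp.diag(jnp.array([k(x, x_prime) for k in latent_kernels]))
    return mixing @ latent_cov @ mixing.T  # (num_outputs, num_outputs)
```

Evaluating this over all pairs of training inputs then gives the full $(NQ \times NQ)$ covariance with its block structure.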
Okay, for this simple case (one kernel repeated for each output), we can use the following: if our covariance for the above multivariate normal were just ...
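For concreteness, and only under the assumption that the simple case means every output is an independent draw from the same GP with no mixing, the joint covariance over $N$ inputs and $Q$ outputs is just a Kronecker product; the function name below is illustrative only:

```python
import jax.numpy as jnp


def shared_kernel_joint_cov(gram, num_outputs):
    """Illustrative only: joint covariance when every output is an independent
    draw from the same GP (no mixing). The result is I_Q kron K, i.e. block
    diagonal with identical (N, N) blocks."""
    return jnp.kron(jnp.eye(num_outputs), gram)
```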
In the general case with ...
Okay @thomaspinder, I like your idea, but I would say we should add the mixing matrix (W) to the prior: then the "kernel" produces ...
Also, not for this PR, but when we allow for variational inference, we will have a new ... So we would have a ... For the variational model(s) we would need to specify, for the latter two kernels, whether we want the inducing points "shared" among all of the kernels, or each kernel having its own set of inducing points. But as I said, that is for a later PR.
For this PR, we would implement: ...

Thoughts?
OK. I'm good with leaving the mixing component in the prior, this is fine. I also align with your proposal for this PR.
MultiOutput GP Proposal.
In the spirit of multi-output GP frameworks such as GPflow, we will take the same problem statement:
Problem Statement
We will consider a regression problem for functions $f: \mathbb{R}^D \rightarrow \mathbb{R}^Q$. We assume that the dataset is of the form $(X_1, f_1), \dots, (X_Q, f_Q)$; that is, we may observe different inputs for each output dimension.
Here we assume a model of the form:
$$f(x) = W g(x),$$

where $g(x) \in \mathbb{R}^L$, $f(x) \in \mathbb{R}^Q$, and $W \in \mathbb{R}^{Q \times L}$. We assume that the outputs of $g$ are uncorrelated, and that by mixing them with $W$ they become correlated.
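Written out explicitly, independence of the latent processes gives the induced covariance between outputs:

$$\operatorname{Cov}\bigl(f(x), f(x')\bigr) = W \, \operatorname{Cov}\bigl(g(x), g(x')\bigr) \, W^{\top} = W \, \operatorname{diag}\bigl(k_1(x, x'), \dots, k_L(x, x')\bigr) \, W^{\top},$$

where $k_\ell$ denotes the kernel of the $\ell$-th latent process.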
Note, we have two options for $g$: a single kernel shared by every latent process, or a separate kernel for each latent process.
In a following PR we can discuss variational GPs, wherein we need a further sub-option for the inducing inputs of $g$: they may either be shared across all latent processes, or each latent process may carry its own set.
The notation is as follows: $D$ is the input dimension, $Q$ is the number of outputs, $L$ is the number of latent GPs, and $W \in \mathbb{R}^{Q \times L}$ is the mixing matrix.
Phase 1:
We write a multi-output kernel that initially just takes in a single kernel. Mimicking GPflow again, we could have something like:
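As a rough illustration only (the class name `SharedMultiOutputKernel` and its fields are placeholders, not existing GPJax or GPflow API):

```python
import jax.numpy as jnp
from dataclasses import dataclass


@dataclass
class SharedMultiOutputKernel:
    """Placeholder sketch: one base kernel reused for every latent GP."""

    kernel: object        # any callable k(x, x') -> scalar
    num_latent_gps: int

    def __call__(self, x, x_prime):
        # All latent dimensions share the same kernel, so the latent
        # covariance at (x, x') is k(x, x') * I_L.
        return self.kernel(x, x_prime) * jnp.eye(self.num_latent_gps)
```

The `IndependentKernel` described next would swap the single `kernel` field for a `Sequence` of kernels, one per latent dimension.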
We could then add an `IndependentKernel` where we pass in a `Sequence` of kernels and apply each to a `num_latent_gps` dimension.

Phase 2
We add a `MultiLatentPrior` class (probably just starting with the conjugate case) where, instead of returning a `GaussianDistribution`, we develop a `MatrixNormalDistribution` class that inherits from numpyro's `MatrixNormal` and allows us to sample.

I am still fleshing this out, but overall it seems as if we need to add a multi-output kernel and a multi-output prior of some sort.
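A rough sketch of that idea, under the assumption that every latent GP shares a single Gram matrix $K$ over the inputs, so the prior over the stacked outputs $F = GW^{\top}$ is matrix normal with row covariance $K$ and column covariance $WW^{\top}$. The classmethod and its arguments are placeholders, not settled API:

```python
import jax.numpy as jnp
import numpyro.distributions as npd


class MatrixNormalDistribution(npd.MatrixNormal):
    """Sketch of a matrix-normal prior over the (N, Q) matrix of function values."""

    @classmethod
    def from_gram_and_mixing(cls, gram, mixing, jitter=1e-6):
        # Row covariance: shared kernel Gram matrix over the N inputs.
        # Column covariance: W W^T, induced by mixing the independent latents.
        n, num_outputs = gram.shape[0], mixing.shape[0]
        scale_row = jnp.linalg.cholesky(gram + jitter * jnp.eye(n))
        scale_col = jnp.linalg.cholesky(mixing @ mixing.T + jitter * jnp.eye(num_outputs))
        return cls(
            loc=jnp.zeros((n, num_outputs)),
            scale_tril_row=scale_row,
            scale_tril_column=scale_col,
        )
```

Sampling would then be the usual numpyro call, e.g. `dist.sample(jax.random.PRNGKey(0))`, returning an `(N, Q)` array.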
Thoughts?