Fix prediction slowdown from eager kernel evaluation #2749

Open: kayween wants to merge 2 commits into main from fix-prediction-slowdown

@kayween kayween commented May 1, 2026

This fixes the test-time slowdown for batched inputs reported in #2736.

Root Cause

Note that .evaluate_kernel() is called on the test covariances in ExactGP._get_test_prior_mean_and_covariances. This materializes the full K(test, test) matrix explicitly (e.g., for RBF and Matern kernels), even when only the predictive variances are needed.

test_test_covar = joint_covar[..., num_train:, num_train:].evaluate_kernel()
test_train_covar = joint_covar[..., num_train:, :num_train].evaluate_kernel()
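To make the cost of the eager path concrete, here is a minimal pure-Python sketch (not GPyTorch code). The helper names `rbf`, `count_calls`, `full_test_covariance`, and `prior_variances` are hypothetical and exist only for this illustration. It counts kernel evaluations when the whole K(test, test) matrix is built versus when only its diagonal is computed. Predictive variances also involve a train/test term, which this sketch leaves out.

```python
import math


def rbf(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel on scalar inputs."""
    return math.exp(-0.5 * ((x1 - x2) / lengthscale) ** 2)


def count_calls(fn):
    """Wrap fn so the number of invocations is recorded on the wrapper."""
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return fn(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


def full_test_covariance(kernel, xs):
    # Eager path: every pairwise entry of K(test, test) is computed.
    return [[kernel(a, b) for b in xs] for a in xs]


def prior_variances(kernel, xs):
    # Diagonal-only path: just k(x, x) for each test point.
    return [kernel(x, x) for x in xs]


xs = [0.1 * i for i in range(200)]  # 200 illustrative test points

eager_kernel = count_calls(rbf)
full = full_test_covariance(eager_kernel, xs)

lazy_kernel = count_calls(rbf)
diag = prior_variances(lazy_kernel, xs)

print(eager_kernel.calls, lazy_kernel.calls)  # 40000 200
```

With n test points the eager path performs n² kernel evaluations while the diagonal needs only n, and the gap is multiplied by the batch size for batched inputs.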

The Fix

Remove the .evaluate_kernel calls in ExactGP. The prediction strategies should call .evaluate_kernel when needed.

Keep kernel covariances lazy in `_get_test_prior_mean_and_covariances`. Let prediction strategies call `.evaluate_kernel()` only when needed.
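The division of responsibility described here can be sketched with a toy lazy covariance object. This is an illustration under stated assumptions, not GPyTorch's implementation: `LazyCovariance`, `get_test_covariance`, `variance_only_strategy`, and `full_covariance_strategy` are hypothetical names. The model-side function returns the lazy object untouched, and each strategy decides whether it needs the dense matrix.

```python
import math


class LazyCovariance:
    """Hypothetical stand-in for a lazily evaluated kernel matrix."""

    def __init__(self, kernel, xs):
        self.kernel = kernel
        self.xs = list(xs)
        self.materialized = False  # flips to True once the full matrix is built

    def diagonal(self):
        # Cheap: n kernel evaluations, no n-by-n matrix is created.
        return [self.kernel(x, x) for x in self.xs]

    def evaluate(self):
        # Expensive: n^2 kernel evaluations and n^2 stored entries.
        self.materialized = True
        return [[self.kernel(a, b) for b in self.xs] for a in self.xs]


def get_test_covariance(kernel, test_xs):
    # Model side: hand back the lazy object without evaluating it.
    return LazyCovariance(kernel, test_xs)


def variance_only_strategy(test_covar):
    # Needs only the diagonal, so the matrix is never materialized.
    return test_covar.diagonal()


def full_covariance_strategy(test_covar):
    # Needs a concrete matrix, so it evaluates explicitly at this point.
    return test_covar.evaluate()


def kernel(a, b):
    return math.exp(-0.5 * (a - b) ** 2)


covar = get_test_covariance(kernel, [0.0, 0.5, 1.0])
variances = variance_only_strategy(covar)
still_lazy = not covar.materialized  # the variance path left the matrix lazy
dense = full_covariance_strategy(covar)
```

The point of the design is that materialization becomes an explicit choice at the place where a dense matrix is actually required, rather than a side effect that every caller pays for.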

kayween commented May 1, 2026

This PR moves .evaluate_kernel calls from ExactGP to prediction strategies. In particular, it changes _get_test_prior_mean_and_covariances, which LatentKroneckerGP overrides.

@SebastianAment Would this cause any friction on the BoTorch side?

Copilot AI left a comment

Pull request overview

This PR addresses a prediction-time slowdown in ExactGP by avoiding eager kernel materialization for test covariances, allowing downstream prediction code paths to compute only what’s required (e.g., diagonals for variances).

Changes:

  • Removed .evaluate_kernel() calls for test covariances in ExactGP._get_test_prior_mean_and_covariances to keep K(test, test) lazy.
  • Added targeted evaluate_kernel() calls inside specific prediction strategies where concrete operator types are required (e.g., KISS/SGPR).
  • Updated typing in ExactGP to reflect that test covariances are returned as LinearOperators.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| gpytorch/models/exact_gp.py | Keeps test covariances lazy during posterior prediction and updates return type annotations. |
| gpytorch/models/exact_prediction_strategies.py | Evaluates kernels in specific strategies when concrete operator types are required for downstream logic. |

Comment thread gpytorch/models/exact_prediction_strategies.py Outdated
Comment thread gpytorch/models/exact_gp.py

kayween commented May 28, 2026

@Balandat @gpleiss @jacobrgardner Can I get a stamp on this PR?

A few users have reported OOM errors and noticeable slowdowns at test time when the batch size is large (e.g., #2736 and #2754). The cause is that recent changes in the prediction strategy materialize the covariance matrix on the test data, K(X_test, X_test).
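A back-of-the-envelope calculation shows why materializing K(X_test, X_test) can exhaust memory for batched inputs. The batch size, number of test points, and float32 storage below are assumptions chosen for illustration and are not taken from the linked reports. The helpers `dense_bytes` and `diagonal_bytes` are hypothetical.

```python
def dense_bytes(batch_size, num_test, bytes_per_element=4):
    """Bytes for a dense batch of (num_test x num_test) matrices."""
    return batch_size * num_test * num_test * bytes_per_element


def diagonal_bytes(batch_size, num_test, bytes_per_element=4):
    """Bytes for only the diagonals of the same batch of matrices."""
    return batch_size * num_test * bytes_per_element


# Assumed sizes: 64 batches of 10,000 test points, stored as float32.
batch_size, num_test = 64, 10_000

full_gib = dense_bytes(batch_size, num_test) / 2**30
diag_mib = diagonal_bytes(batch_size, num_test) / 2**20
print(f"{full_gib:.1f} GiB vs {diag_mib:.2f} MiB")  # 23.8 GiB vs 2.44 MiB
```

Under these assumptions the dense matrices are a factor of `num_test` larger than the diagonals, which is consistent with a variance-only prediction succeeding where full materialization runs out of memory.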

P.S. This PR may cause some friction on the BoTorch side (specifically, LatentKroneckerGP has overridden part of the prediction strategy).

@SebastianAment
Contributor

> This PR moves .evaluate_kernel calls from ExactGP to prediction strategies. In particular, it changes _get_test_prior_mean_and_covariances, which LatentKroneckerGP overrides.
>
> @SebastianAment Would this cause any friction on the BoTorch side?

I recall the placement of the kernel evaluation requiring some care, but I don't recall the specifics. Have you tested it on a GPU? I can pull this in internally to run the BoTorch GPU suite tomorrow.


kayween commented May 29, 2026

@SebastianAment Thanks for looking into this! I ran the tests on an RTX 3090 GPU and they seem fine.

There are indeed a few test failures on GPUs, but they are not related to this PR (or any recent changes in the prediction strategies). There's a numerical issue in the test case test_sgpr_mean_abs_error_cuda and a couple of miscellaneous errors in the priors' tests. AFAIK, they have been around for quite a long time; in particular, they also fail on the main branch.
