Fix prediction slowdown from eager kernel evaluation #2749
Keep kernel covariances lazy in `_get_test_prior_mean_and_covariances`. Let prediction strategies call `.evaluate_kernel()` only when needed.
This PR moves @SebastianAment Would this cause any friction on the BoTorch side? |
Pull request overview
This PR addresses a prediction-time slowdown in ExactGP by avoiding eager kernel materialization for test covariances, allowing downstream prediction code paths to compute only what’s required (e.g., diagonals for variances).
Changes:
- Removed `.evaluate_kernel()` calls for test covariances in `ExactGP._get_test_prior_mean_and_covariances` to keep `K(test, test)` lazy.
- Added targeted `evaluate_kernel()` calls inside specific prediction strategies where concrete operator types are required (e.g., KISS/SGPR).
- Updated typing in `ExactGP` to reflect that test covariances are returned as `LinearOperator`s.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| gpytorch/models/exact_gp.py | Keeps test covariances lazy during posterior prediction and updates return type annotations. |
| gpytorch/models/exact_prediction_strategies.py | Evaluates kernels in specific strategies when concrete operator types are required for downstream logic. |
@Balandat @gpleiss @jacobrgardner Can I get a stamp on this PR? A few users have reported OOM errors and noticeable slowdowns at test time when the batch size is large (e.g., #2736 and #2754). The cause is that the recent changes in the prediction strategy materialize the covariance matrix on test data. P.S. This PR may cause some friction on the BoTorch side (specifically |
I recall the placement of the kernel evaluation requiring some care, but don't recall the specifics. Have you tested it on a GPU? I can pull this in internally to run the BoTorch GPU suite tomorrow.
@SebastianAment Thanks for looking into this! I ran the tests on an RTX 3090 GPU and they seem fine. There are indeed a few test failures on GPUs, but they are not related to this PR (or any recent changes in the prediction strategies). There's a numerical issue in the test case |
This fixes the test-time slowdown for batched inputs as reported in #2736.
Root Cause
Note that `.evaluate_kernel` is called for test covariances in `ExactGP._get_test_prior_mean_and_covariances`. This would materialize the `K(test, test)` matrix explicitly (e.g., RBF and Matern kernels), even in cases when only predictive variances are needed.
gpytorch/gpytorch/models/exact_gp.py, lines 426 to 427 in ee0564b
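To make the cost difference concrete, here is a minimal pure-Python sketch, not GPyTorch code. The `LazyKernel` class, its `num_evals` counter, and the 1-D RBF kernel are all hypothetical stand-ins chosen for illustration; only the method name `evaluate_kernel` mirrors the call discussed above. It counts scalar kernel evaluations when reading just the diagonal versus materializing the full `K(test, test)` matrix.

```python
import math


class LazyKernel:
    """Toy stand-in (hypothetical) for a lazily evaluated kernel matrix K(x, x)."""

    def __init__(self, xs, lengthscale=1.0):
        self.xs = list(xs)
        self.lengthscale = lengthscale
        self.num_evals = 0  # number of scalar kernel evaluations so far

    def _k(self, a, b):
        # 1-D RBF kernel; every call is counted.
        self.num_evals += 1
        return math.exp(-0.5 * ((a - b) / self.lengthscale) ** 2)

    def diagonal(self):
        # Only the n entries k(x_i, x_i) are computed.
        return [self._k(x, x) for x in self.xs]

    def evaluate_kernel(self):
        # Materializes all n * n entries of the matrix.
        return [[self._k(a, b) for b in self.xs] for a in self.xs]


xs = [0.1 * i for i in range(200)]

lazy = LazyKernel(xs)
variances = lazy.diagonal()      # prior variances only
diag_cost = lazy.num_evals

eager = LazyKernel(xs)
dense = eager.evaluate_kernel()  # full matrix, as in the eager code path
dense_cost = eager.num_evals

print(diag_cost, dense_cost)  # 200 40000
```

With `n` test points the diagonal needs `n` kernel evaluations while the dense matrix needs `n * n`, which is consistent with the memory and time blow-up reported for large test batches.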
The Fix
Remove the `.evaluate_kernel` calls in `ExactGP`. The prediction strategies should call `.evaluate_kernel()` themselves when needed.
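The division of responsibility described here can be sketched in plain Python. Everything in this block is hypothetical and only illustrates the pattern: `LazyCovar`, `get_test_covar`, `DefaultStrategy`, and `DenseStrategy` are made-up names, not the actual GPyTorch classes, and a linear kernel is assumed purely to keep the arithmetic simple.

```python
class LazyCovar:
    """Hypothetical lazy covariance: holds the inputs, not a matrix."""

    def __init__(self, xs):
        self.xs = list(xs)
        self.materialized = False

    def diagonal(self):
        # Linear kernel k(a, b) = a * b, so the diagonal is x_i ** 2.
        return [x * x for x in self.xs]

    def evaluate_kernel(self):
        # Builds the full matrix and records that it happened.
        self.materialized = True
        return [[a * b for b in self.xs] for a in self.xs]


def get_test_covar(xs):
    # Model side: hand back the lazy object without evaluating it.
    return LazyCovar(xs)


class DefaultStrategy:
    def variances(self, covar):
        # Needs only the diagonal, so the covariance stays lazy.
        return covar.diagonal()


class DenseStrategy:
    def variances(self, covar):
        # This strategy needs a concrete matrix, so it evaluates on its own.
        dense = covar.evaluate_kernel()
        return [dense[i][i] for i in range(len(dense))]


covar_a = get_test_covar([1.0, 2.0, 3.0])
vars_a = DefaultStrategy().variances(covar_a)

covar_b = get_test_covar([1.0, 2.0, 3.0])
vars_b = DenseStrategy().variances(covar_b)
```

Both strategies return the same variances, but only the one that explicitly asks for a concrete matrix pays for building it.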