Commit 03a5a82
[API] Clear request-scoped cache in gpu-metrics endpoint (#9265)
* [API] Add on_gpu_metrics_collect plugin hook
Add a lifecycle hook to BasePlugin that the metrics server calls
before collecting GPU metrics. This allows plugins to sync
process-level state (e.g. KUBECONFIG) into the metrics server
process, which runs separately from request worker processes.
Without this, credentials uploaded via the credential manager
plugin only update the environment in the worker process that
handled the upload. The metrics server in the main process never
sees the change, requiring a pod restart for /gpu-metrics to
discover newly uploaded kubeconfigs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* [API] Load plugins in metrics server on startup
The metrics server runs as a separate uvicorn instance in a
background thread, so plugins loaded by the main API server are
not available in its context. Add a startup event that loads
plugins if they haven't been loaded yet, enabling plugin hooks
like on_gpu_metrics_collect to work.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* [API] Ensure KUBECONFIG includes credential manager path in metrics server
Instead of relying on plugin loading (which fails in the metrics
server since install() requires a FastAPI app context), directly
ensure KUBECONFIG includes the credential manager kubeconfig path
before each gpu-metrics collection.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* [API] Add /gpu-metrics-debug endpoint for diagnostics
Temporary debug endpoint to inspect plugin state, KUBECONFIG, and
discovered contexts from within the metrics server process.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* [API] Clear request-scoped cache before gpu-metrics collection
The metrics server runs as a daemon thread where request-scoped
caches (kubernetes API clients, context names) are never cleared
automatically. This causes stale results from boot time to persist
indefinitely — if a kubeconfig file didn't exist at boot, context
discovery caches the failure and never retries.
Call clear_request_level_cache() before each gpu-metrics scrape,
matching the pattern used by the billing daemon.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* [API] Enhance debug endpoint to show cache clear effect
Shows contexts before and after clearing request-scoped cache
to verify the stale cache hypothesis.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* [API] Clear request-scoped cache in gpu-metrics endpoint
The metrics server runs as a daemon thread where request-scoped
caches are never cleared automatically. This causes stale results
from boot time to persist — if a kubeconfig didn't exist at boot,
context discovery caches the failure and never retries.
Add clear_request_level_cache() before each gpu-metrics scrape,
matching the pattern used by other daemon threads (billing, gpu
healer).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>1 parent dddfe09 commit 03a5a82
1 file changed
Lines changed: 6 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
20 | 20 | | |
21 | 21 | | |
22 | 22 | | |
| 23 | + | |
23 | 24 | | |
24 | 25 | | |
25 | 26 | | |
| |||
199 | 200 | | |
200 | 201 | | |
201 | 202 | | |
| 203 | + | |
| 204 | + | |
| 205 | + | |
| 206 | + | |
| 207 | + | |
202 | 208 | | |
203 | 209 | | |
204 | 210 | | |
| |||
0 commit comments