You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Tracking epic for the cloud-api inference-stack refactor. Goal: (1) make SNI-rotation first-class via stable address-derived backend handles (fixes the GLM-5.1 "does not match any attested fingerprint" signature-fetch bug + the pin-eviction hazard), (2) a pluggable teep-style attested-provider abstraction behind a single registry, (3) a per-model attestation policy (NEAR + attested-3p fallback / attested-3p-only / non-attested) where fallback can never cross an attestation boundary.
Phases (each independently shippable; flag-gated; GLM-5.1 as canary)
P2 cloud-api: ProviderRegistry + capability handles; move the 5 NEAR-only no-ops off InferenceProvider; delete ExternalProvider attestation stubs (deps: P1)
P3 cloud-api: FleetRouteru64→BackendHandle; per-handle FingerprintState; handle-based discovery (preserve pubkey harvest). Fixes the bug.(deps: P0 deployed to gpu11+gpu30, P1, P2)
P4 cloud-api: VerificationEngine factor pipeline + per-step traits + RawAttestation + Blocked(); port NEAR verbatim (signing_algo fix, shared codec, golden vectors) (deps: P2)
P5 cloud-api: DB migration — drop chk_external_provider_no_attestation; add attestation_policy + providers/model_providers(fallback_order)(deps: none; schema-only)
P6 cloud-api: loader-merge NEAR+3p under one model_id; ProviderTier vs RequiredPolicy filter in get_providers_with_fallback (fail-closed, within-tier RR) (deps: P2, P5)
Tracking epic for the cloud-api inference-stack refactor. Goal: (1) make SNI-rotation first-class via stable address-derived backend handles (fixes the GLM-5.1
"does not match any attested fingerprint"signature-fetch bug + the pin-eviction hazard), (2) a pluggable teep-style attested-provider abstraction behind a single registry, (3) a per-model attestation policy (NEAR + attested-3p fallback / attested-3p-only / non-attested) where fallback can never cross an attestation boundary.Phases (each independently shippable; flag-gated; GLM-5.1 as canary)
-b<hash>+select_backend_by_handle+/backends/list— nearai/model-proxy#38 (additive; deps: none)FleetRouterfromVLlmProvider(stillu64, flag-gated, characterization tests first) (deps: none)ProviderRegistry+ capability handles; move the 5 NEAR-only no-ops offInferenceProvider; deleteExternalProviderattestation stubs (deps: P1)FleetRouteru64→BackendHandle; per-handleFingerprintState; handle-based discovery (preserve pubkey harvest). Fixes the bug. (deps: P0 deployed to gpu11+gpu30, P1, P2)VerificationEnginefactor pipeline + per-step traits +RawAttestation+Blocked(); port NEAR verbatim (signing_algo fix, shared codec, golden vectors) (deps: P2)chk_external_provider_no_attestation; addattestation_policy+providers/model_providers(fallback_order)(deps: none; schema-only)ProviderTiervsRequiredPolicyfilter inget_providers_with_fallback(fail-closed, within-tier RR) (deps: P2, P5)ExternalProvider::new_pinned+ report-data verifier + concrete supply-chain verifier (deps: P3, P4, P6)Design: FleetRouter + ProviderRegistry + VerificationEngine (capability-struct provider, teep-style factor pipeline, content-addressed handles, split live-tier vs required-policy). Reference: github.com/13rac1/teep.
🤖 Generated with Claude Code