Epic: first-class SNI-rotation + pluggable attested 3p providers (FleetRouter + ProviderRegistry + VerificationEngine) #735

@Evrard-Nil

Description

Tracking epic for the cloud-api inference-stack refactor. Goals: (1) make SNI rotation first-class via stable, address-derived backend handles (fixes the GLM-5.1 "does not match any attested fingerprint" signature-fetch bug and the pin-eviction hazard); (2) put a pluggable, teep-style attested-provider abstraction behind a single registry; (3) add a per-model attestation policy (NEAR + attested-3p fallback / attested-3p-only / non-attested) under which fallback can never cross an attestation boundary.
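To make goal (1) concrete, here is a minimal sketch of what an address-derived handle could look like. Everything in it is an assumption for illustration: the epic does not name the hash, the handle width, or the full SNI layout beyond the `-b<hash>` suffix. FNV-1a (64-bit) is used only because it is deterministic across processes and needs no dependencies. The `from_address`, `sni_label`, and `handle_from_sni_label` names are hypothetical.

```rust
/// Hypothetical stable handle: a fixed-width hex digest of the backend address.
/// Unlike a positional u64 index, it stays the same when the backend list is
/// reordered or when other backends are added or evicted.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct BackendHandle(String);

impl BackendHandle {
    /// Derive the handle from the backend address with FNV-1a (64-bit).
    /// The hash choice is an assumption; the epic does not specify one.
    pub fn from_address(addr: &str) -> Self {
        const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
        const PRIME: u64 = 0x0000_0100_0000_01b3;
        let mut h = OFFSET_BASIS;
        for b in addr.as_bytes() {
            h ^= u64::from(*b);
            h = h.wrapping_mul(PRIME);
        }
        BackendHandle(format!("{:016x}", h))
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }

    /// Build an SNI label of the assumed form `<model-label>-b<hash>`.
    pub fn sni_label(&self, model_label: &str) -> String {
        format!("{}-b{}", model_label, self.0)
    }
}

/// Recover the handle from an SNI label, or `None` if the suffix is not a
/// 16-digit hex handle. Hex digits never contain '-', so the last "-b" in
/// the label is always the separator.
pub fn handle_from_sni_label(label: &str) -> Option<BackendHandle> {
    let (_, hash) = label.rsplit_once("-b")?;
    if hash.len() == 16 && hash.bytes().all(|b| b.is_ascii_hexdigit()) {
        Some(BackendHandle(hash.to_string()))
    } else {
        None
    }
}
```

Because the handle depends only on the address, a client that pinned a fingerprint for a given handle would keep addressing the same backend after a discovery refresh, which is the property a positional index cannot give.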

Phases (each independently shippable; flag-gated; GLM-5.1 as canary)

  • P0 model-proxy: stable-handle SNI -b<hash> + select_backend_by_handle + /backends/list — nearai/model-proxy#38 (additive; deps: none)
  • P1 cloud-api: extract FleetRouter from VLlmProvider (still u64, flag-gated, characterization tests first) (deps: none)
  • P2 cloud-api: ProviderRegistry + capability handles; move the 5 NEAR-only no-ops off InferenceProvider; delete ExternalProvider attestation stubs (deps: P1)
  • P3 cloud-api: FleetRouter u64 → BackendHandle; per-handle FingerprintState; handle-based discovery (preserve pubkey harvest). Fixes the bug. (deps: P0 deployed to gpu11+gpu30, P1, P2)
  • P4 cloud-api: VerificationEngine factor pipeline + per-step traits + RawAttestation + Blocked(); port NEAR verbatim (signing_algo fix, shared codec, golden vectors) (deps: P2)
  • P5 cloud-api: DB migration — drop chk_external_provider_no_attestation; add attestation_policy + providers/model_providers(fallback_order) (deps: none; schema-only)
  • P6 cloud-api: loader-merge NEAR+3p under one model_id; ProviderTier vs RequiredPolicy filter in get_providers_with_fallback (fail-closed, within-tier RR) (deps: P2, P5)
  • P7 cloud-api: onboard first attested-3p (Venice) — ExternalProvider::new_pinned + report-data verifier + concrete supply-chain verifier (deps: P3, P4, P6)
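The P6 filter (ProviderTier vs RequiredPolicy, fail-closed, within-tier round-robin) might look roughly like the sketch below. The enum variants, the `allows` table, the `rr_cursor` parameter, and the `providers_with_fallback` signature are assumptions made for illustration, not the actual `get_providers_with_fallback` API. In particular, treating the non-attested policy as admitting only non-attested providers is one possible reading of "never cross an attestation boundary".

```rust
/// Live tier of a provider (what it can currently prove). Names are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProviderTier {
    Near,
    Attested3p,
    NonAttested,
}

/// Per-model policy (what the model requires). Names are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RequiredPolicy {
    NearWithAttested3pFallback,
    Attested3pOnly,
    NonAttested,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Provider {
    pub name: &'static str,
    pub tier: ProviderTier,
}

/// Returned when no provider satisfies the policy (fail-closed).
#[derive(Debug, PartialEq, Eq)]
pub struct NoEligibleProvider;

/// Which tiers a policy admits. Attested policies never admit NonAttested.
fn allows(policy: RequiredPolicy, tier: ProviderTier) -> bool {
    matches!(
        (policy, tier),
        (RequiredPolicy::NearWithAttested3pFallback, ProviderTier::Near)
            | (RequiredPolicy::NearWithAttested3pFallback, ProviderTier::Attested3p)
            | (RequiredPolicy::Attested3pOnly, ProviderTier::Attested3p)
            | (RequiredPolicy::NonAttested, ProviderTier::NonAttested)
    )
}

/// Order eligible providers tier by tier (NEAR first), rotating within each
/// tier by `rr_cursor` for round-robin. Errors instead of widening the policy.
pub fn providers_with_fallback<'a>(
    policy: RequiredPolicy,
    candidates: &'a [Provider],
    rr_cursor: usize,
) -> Result<Vec<&'a Provider>, NoEligibleProvider> {
    let mut ordered = Vec::new();
    for tier in [ProviderTier::Near, ProviderTier::Attested3p, ProviderTier::NonAttested] {
        if !allows(policy, tier) {
            continue;
        }
        let mut group: Vec<&Provider> = candidates.iter().filter(|p| p.tier == tier).collect();
        if !group.is_empty() {
            let shift = rr_cursor % group.len();
            group.rotate_left(shift);
        }
        ordered.extend(group);
    }
    if ordered.is_empty() {
        Err(NoEligibleProvider)
    } else {
        Ok(ordered)
    }
}
```

Keeping the tier (a live property) separate from the policy (a per-model requirement) means a provider that loses attestation simply drops out of the eligible set; an empty set surfaces as an error rather than silently falling back to a weaker tier.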

Design: FleetRouter + ProviderRegistry + VerificationEngine (capability-struct provider, teep-style factor pipeline, content-addressed handles, split live-tier vs required-policy). Reference: github.com/13rac1/teep.
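As a rough illustration of the factor-pipeline idea, the sketch below runs a list of independent checks over a raw attestation and blocks on the first failure. The `Factor` trait, the `RawAttestation` fields, the `Verdict` enum, and both example factors are hypothetical shapes chosen for the example; the real per-step traits and the teep reference design may differ.

```rust
/// Hypothetical raw evidence as fetched from a provider, before any checks.
pub struct RawAttestation {
    pub signing_algo: String,
    pub report_data: Vec<u8>,
}

/// Outcome of the whole pipeline. `Blocked` carries the failing factor and reason.
#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Verified,
    Blocked(String),
}

/// One verification step. Each factor inspects the raw attestation on its own.
pub trait Factor {
    fn name(&self) -> &'static str;
    fn check(&self, raw: &RawAttestation) -> Result<(), String>;
}

/// Example factor: the signing algorithm must be on an allow-list.
pub struct SigningAlgoAllowed {
    pub allowed: Vec<&'static str>,
}

impl Factor for SigningAlgoAllowed {
    fn name(&self) -> &'static str {
        "signing_algo"
    }
    fn check(&self, raw: &RawAttestation) -> Result<(), String> {
        if self.allowed.iter().any(|a| *a == raw.signing_algo.as_str()) {
            Ok(())
        } else {
            Err(format!("algorithm '{}' not allowed", raw.signing_algo))
        }
    }
}

/// Example factor: report data must equal the value the caller expects.
pub struct ReportDataMatches {
    pub expected: Vec<u8>,
}

impl Factor for ReportDataMatches {
    fn name(&self) -> &'static str {
        "report_data"
    }
    fn check(&self, raw: &RawAttestation) -> Result<(), String> {
        if raw.report_data == self.expected {
            Ok(())
        } else {
            Err("report data mismatch".to_string())
        }
    }
}

pub struct VerificationEngine {
    factors: Vec<Box<dyn Factor>>,
}

impl VerificationEngine {
    pub fn new(factors: Vec<Box<dyn Factor>>) -> Self {
        Self { factors }
    }

    /// Run every factor in order; the first failure blocks. An engine with
    /// no factors also blocks, so a misconfiguration cannot verify anything.
    pub fn verify(&self, raw: &RawAttestation) -> Verdict {
        if self.factors.is_empty() {
            return Verdict::Blocked("no factors configured".to_string());
        }
        for factor in &self.factors {
            if let Err(reason) = factor.check(raw) {
                return Verdict::Blocked(format!("{}: {}", factor.name(), reason));
            }
        }
        Verdict::Verified
    }
}
```

With this shape, onboarding a new attested provider would mostly mean supplying its own set of factors, while the engine and the blocking semantics stay shared.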

🤖 Generated with Claude Code

Labels: enhancement (New feature or request), rfc (Design/architecture RFC)