Feat/peer metadata display by robmsmt · Pull Request #21 · swiss-ai/serving-api

robmsmt · 2026-05-18T11:49:32Z

No description provided.

Backend (model_service): pass through hostname, version, status, labels, and convenience pulls (launched_by, slurm_job_id, worker_group_id, framework, started_at) for each DNT peer. Also surface metrics-only follower peers (no service, but with worker_group_id) so multi-node replicas can be reconstructed in aggregation.

Frontend (ModelList): group raw peers by worker_group_id to count replicas distinctly from peers/nodes. Headline now reads "Available Models X, Replicas Y" when the two diverge.

Frontend (ModelCard): clicking the card now expands inline instead of opening OpenWebUI. The expansion shows:
- Open in OpenWebUI button (the prior click behaviour)
- Per-replica monospace block with model, launched_by, slurm_job_id, started_at, framework, version, head + follower hostnames
- Topology header (e.g. "2 nodes × 4x GH200")
- Per-replica extra-labels block for anything else OCF carries

Fixtures: snapshot of live prod /dnt/table + a script that synthesises the post-v0.0.6 shape (hostname/version/status/labels) by adding a multi-node replica demo (shared worker_group_id, one head + one metrics-only follower) so the new code paths have a realistic test.
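The replica grouping described above can be sketched roughly as follows. This is an illustration in Python, not the ModelList code from this PR: the peer dict keys `peer_id` and `model`, and the fallback that treats a peer without `worker_group_id` as its own single-node replica, are assumptions made for the example.

```python
from collections import defaultdict


def group_replicas(peers):
    """Group raw peers into replicas keyed by worker_group_id.

    Assumption: a peer with no worker_group_id is a single-node replica,
    so it is keyed by its own (hypothetical) peer_id field.
    """
    groups = defaultdict(list)
    for peer in peers:
        key = peer.get("worker_group_id") or peer["peer_id"]
        groups[key].append(peer)
    return dict(groups)


# One two-node replica (head + metrics-only follower) and one single-node replica.
peers = [
    {"peer_id": "a", "worker_group_id": "wg-1", "hostname": "n1", "model": "m"},
    {"peer_id": "b", "worker_group_id": "wg-1", "hostname": "n2"},  # follower, no service
    {"peer_id": "c", "hostname": "n3", "model": "m"},
]
replicas = group_replicas(peers)
print(f"peers={len(peers)} replicas={len(replicas)}")
```

With this input the peer count is 3 while the replica count is 2, which is the kind of divergence the headline is meant to surface.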

Settings: ocf_head_addr → otela_head_addr, ocf_fixture_path → otela_fixture_path.

DEPLOY NOTE: this changes the env var names from OCF_HEAD_ADDR / OCF_FIXTURE_PATH to OTELA_HEAD_ADDR / OTELA_FIXTURE_PATH. Ops side must update before/with this deploy or /v1/models* will hit an empty endpoint.

Comments, README, k8s manifests, and the in-repo guides now refer to "OpenTela" rather than "OCF". External image tag (ghcr.io/researchcomputer/ocf:*) and the in-binary mount path (/root/.ocfcore/keys) are untouched; both are dictated by upstream and would need a coordinated rename there first.

Fixture mode: when OTELA_FIXTURE_PATH is set, /v1/models* reads that JSON file instead of HTTP-getting OTELA_HEAD_ADDR/v1/dnt/table. Used for iterating on the new model-card expansion UI against the synthesised post-v0.0.6 payload before the binary actually ships.
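A minimal sketch of the fixture-mode switch, assuming the table is plain JSON in both cases. The function name `load_dnt_table`, the `env` parameter, and the timeout are hypothetical; only the two env var names and the `/v1/dnt/table` path come from the description above.

```python
import json
import os
import urllib.request


def load_dnt_table(env=None, timeout=5.0):
    """Return the DNT table from the fixture file if configured, else over HTTP."""
    env = os.environ if env is None else env

    fixture_path = env.get("OTELA_FIXTURE_PATH")
    if fixture_path:
        # Fixture mode: never touch the network.
        with open(fixture_path, encoding="utf-8") as fh:
            return json.load(fh)

    # Live mode: fetch the table from the head node.
    head_addr = env["OTELA_HEAD_ADDR"].rstrip("/")
    with urllib.request.urlopen(f"{head_addr}/v1/dnt/table", timeout=timeout) as resp:
        return json.load(resp)
```

Passing `env` explicitly keeps the function easy to exercise in tests without mutating the process environment.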

Use pydantic AliasChoices so OCF_HEAD_ADDR / OCF_FIXTURE_PATH still populate the renamed settings. Deployments can migrate on their own schedule without a synchronized cut-over. When both legacy and canonical names are set, the canonical OTELA_* wins — a partial migration shouldn't silently keep the legacy value in force.
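The precedence rule can be illustrated with a stdlib-only sketch. This is not the pydantic code from the PR; it only mimics the "first matching name wins" lookup order described above, and the `ALIASES` table and `resolve_settings` helper are hypothetical names.

```python
import os

# Canonical name first, legacy name second: the first one present wins.
ALIASES = {
    "otela_head_addr": ("OTELA_HEAD_ADDR", "OCF_HEAD_ADDR"),
    "otela_fixture_path": ("OTELA_FIXTURE_PATH", "OCF_FIXTURE_PATH"),
}


def resolve_settings(env=None):
    """Resolve each setting from the first env var name that is set."""
    env = os.environ if env is None else env
    return {
        field: next((env[name] for name in names if name in env), None)
        for field, names in ALIASES.items()
    }


# Partially migrated deployment: head address set under both names,
# fixture path only under the legacy name.
settings = resolve_settings({
    "OTELA_HEAD_ADDR": "http://new-head:8092",
    "OCF_HEAD_ADDR": "http://old-head:8092",
    "OCF_FIXTURE_PATH": "/tmp/table.json",
})
print(settings)
```

Here the canonical head address overrides the legacy one, while the fixture path still resolves from the legacy variable.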

…-ai/serving-api into feat/peer-metadata-display

Upgraded DNT fixture now includes framework_args (the second monospace block in the card expansion), expires_at (SLURM time limit applied to started_at), slurm_reservation (mixed in for some launches), and varied started_at values spread across several hours so the UI shows a realistic mix of ages.

Note: framework_args isn't in the opentela --label set we shipped yet; this fixture preempts a planned follow-up patch there. Until that ships, real prod data won't carry framework_args; serving-api just hides the row.

ModelCard expansion: filter out empty rows so the legacy / pre-v0.0.6 case shows just what's known (peer ids + model) instead of a wall of "?" placeholders. When no labels exist at all, render a small amber hint pointing at the v0.0.6 requirement instead of silently rendering an empty card.
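The two behaviours above (deriving expires_at and hiding empty rows) might look roughly like this. It is a Python sketch for illustration only: the helper names, the minutes-based time limit, and the treatment of "?" as an empty value are assumptions, and the real card logic lives in the frontend.

```python
from datetime import datetime, timedelta


def expires_at(started_at, time_limit_minutes):
    """Apply a SLURM-style time limit (assumed to be in minutes) to started_at."""
    start = datetime.fromisoformat(started_at)
    return (start + timedelta(minutes=time_limit_minutes)).isoformat()


def visible_rows(labels, fields):
    """Keep only the rows that carry a real value, in display order."""
    rows = ((name, labels.get(name)) for name in fields)
    return [(name, value) for name, value in rows if value not in (None, "", "?")]


def needs_upgrade_hint(labels):
    """True when a peer carries no labels at all (the pre-v0.0.6 case)."""
    return not labels


fields = ["launched_by", "slurm_job_id", "framework", "framework_args"]
labels = {"launched_by": "robmsmt", "framework": "vllm", "framework_args": ""}
print(visible_rows(labels, fields))
print(expires_at("2026-05-17T16:12:00+00:00", 240))
```

In this example the missing slurm_job_id and the empty framework_args are dropped rather than rendered as placeholders.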

`make dummy-run`: same as `make run`, but forces OTELA_FIXTURE_PATH to point at the synthesised upgraded fixture, so the model card UI shows the v0.0.6-shape payload (hostname, version, labels, multi-node demo) without depending on live prod state or whatever's in the developer's .env. Use `make run` to hit live prod, `make dummy-run` to iterate on the UI.

robmsmt added 9 commits May 17, 2026 16:12

makefile launch docker

174b389

Merge branch 'feat/peer-metadata-display' of https://github.com/swiss…

b71bd0f

…-ai/serving-api into feat/peer-metadata-display

use updated url, add time info in replica-panel for info on node

2d3940d

format

d1c2598

robmsmt merged commit 0582bc0 into main May 18, 2026
2 checks passed

robmsmt deleted the feat/peer-metadata-display branch May 18, 2026 11:55

robmsmt merged 9 commits into main from feat/peer-metadata-display
