Frontend and backend API proxy for SwissAI LLM serving. For examples on how to launch models, see model-launch repo.
Live at:
- Prod: serving.swissai.svc.cscs.ch
- Dev: servingdev.swissai.svc.cscs.ch
- Local: with
make run
o
┌─────────────────┐ /|\ curl / python SDK
│ OpenWebUI │ / \
└────────┬────────┘ |
│ │
│ POST /v1/chat/completions
│ │
▼ ▼
┌─────────────────────────┐
│ serving-api │ auth + proxy (this repo)
└─────────────────────────┘
│
│
▼
┌─────────────────┐
│ OpenTela │ P2P routing → model=apertus-...
└────────┬────────┘
│
▼
┌─────────────────┐
│ vllm/sglang │ model inference (GPU)
└─────────────────┘
backend/ # Python API proxy (FastAPI) — auth, caching, routing
frontend/ # web UI (Astro + Svelte)
meta/ # example Dockerfiles, example k8s manifests, build scripts
OpenTela (formerly OCF / "Open Compute Framework") is maintained upstream at eth-easl/OpenTela. We maintain a fork at swiss-ai/OpenTela to control deployments to dev+prod.
make install # install backend dependencies
make run # start backend on :8080