
feat: Add AIBrix Chat - a frontend demo portal for LLM inference #1951

@Jeffwan

Description


🚀 Feature Description and Motivation

AIBrix provides powerful infrastructure for LLM inference, but we lack a user-facing demo to showcase its capabilities. Currently, users need to interact with AIBrix through API calls or third-party clients, which creates friction in scenarios such as:

  • Live demos: Presenting AIBrix at conferences, meetups, or internal reviews requires a polished UI
  • End-to-end validation: Testing the full stack (gateway routing, model serving, load balancing) benefits from a real frontend client
  • Onboarding: A working chat UI makes it easier for new contributors to understand the user-facing value of the infrastructure

Use Case

Add a lightweight chat portal under apps/chat/web/ that serves as both a demo and a functional frontend for AIBrix-managed LLM endpoints.

What it includes:

  • Chat interface with conversation management (create, search, delete, move)
  • Project organization for grouping related conversations
  • Model selector supporting multiple backends
  • Sidebar navigation with chat history
  • Dark-themed, responsive UI built with React + Vite + Tailwind CSS
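To make the conversation-management feature set concrete, here is a minimal, hypothetical in-memory sketch of the create / search / delete / move operations listed above; the class and field names are illustrative assumptions, not the actual frontend implementation.

```python
from dataclasses import dataclass, field
from typing import Optional
import itertools
import time

_ids = itertools.count(1)  # simple monotonically increasing id source

@dataclass
class Conversation:
    id: int
    title: str
    project: Optional[str] = None  # grouping bucket; None means unfiled
    created_at: float = field(default_factory=time.time)

class ConversationStore:
    """Hypothetical in-memory store covering create / search / delete / move."""

    def __init__(self) -> None:
        self._items: dict[int, Conversation] = {}

    def create(self, title: str, project: Optional[str] = None) -> Conversation:
        conv = Conversation(id=next(_ids), title=title, project=project)
        self._items[conv.id] = conv
        return conv

    def search(self, text: str) -> list[Conversation]:
        # Case-insensitive substring match over titles.
        text = text.lower()
        return [c for c in self._items.values() if text in c.title.lower()]

    def delete(self, conv_id: int) -> None:
        self._items.pop(conv_id, None)

    def move(self, conv_id: int, project: Optional[str]) -> None:
        # Move a conversation into (or out of) a project.
        self._items[conv_id].project = project
```

A real implementation would live in the React state layer (or later in SQLite, per the Features list), but the operation surface would look roughly like this.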

Architecture:

  apps/
  └── chat/
      ├── web/          # React frontend
      └── api/          # Backend proxy for LLM endpoints (planned)

Proposed Solution

Why in-repo (not a separate project)?

  1. Dogfooding: The chat portal consumes AIBrix gateway APIs directly, validating routing algorithms, model selection, and streaming behavior from a real client
  2. Faster iteration: Co-locating with the infrastructure code means API and frontend changes can land together
  3. Visibility: Contributors are more likely to maintain something that lives alongside the core project
  4. Demo-ready: A single repo clone gives users everything needed for a complete demo

Task breakdown

Models and vLLM adaptation

  • Test some ASR models (e.g. Qwen3-ASR) and make sure vLLM or vLLM-omni supports them.
  • Test some TTS models and make sure vLLM or vLLM-omni supports them.
  • Test an image-editing or image-generation model and make sure vLLM or vLLM-omni supports it.
  • Test a video-generation model (e.g. WAN-2.1) and make sure vLLM or vLLM-omni supports it.
  • Enable tool search/web search for LLM models.

Platform

  • Build a lightweight frontend framework (Node.js)
  • Build a Python backend skeleton
  • Make sure endpoints are compatible between cloud APIs and self-hosted models (via AIBrix); we do not necessarily need to host every model, since each model endpoint should be fully replaceable by a cloud API
  • Prepare an intent router or intent classification component
  • Prepare a Prompt Enhancer module
  • Prepare AIBrix deployment YAML for all the models
  • Build applications -> CI, Docker, Deployment
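For the intent router item, a keyword-based classifier is one minimal baseline; a real component would likely use a model, and the intent names and keywords below are illustrative assumptions only.

```python
# Hypothetical baseline intent classifier: route a prompt to a backend
# family by keyword match. Intents and keywords are illustrative, not
# the actual AIBrix routing taxonomy.
INTENT_KEYWORDS = {
    "image_generation": ("draw", "image", "picture"),
    "speech": ("transcribe", "audio", "speech"),
}

def classify_intent(prompt: str) -> str:
    """Return the first matching intent, falling back to plain chat."""
    p = prompt.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(k in p for k in keywords):
            return intent
    return "chat"
```

Even this trivial version is enough to wire the UI's model selector to different endpoint groups while a learned classifier is developed.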

Features

  • Finish the job queue for near-real-time submission
  • (Optional) Support object storage uploads if the user configures backend storage
  • (Optional) Introduce SQLite to persist conversation data
  • Make sure streaming and reasoning work well with the web UI
  • Enable user login via GitHub or a simple login
  • Enable user-level token-based rate limiting
