Feat: organization models #207

hiento09 · 2025-10-02T04:48:47Z

This pull request introduces a comprehensive set of changes to support AI model management and Kubernetes integration in the API gateway. The main additions are new domain models for AI models, a Kubernetes service for cluster and GPU management, and the necessary dependency injection wiring to expose these features via HTTP APIs. The changes are grouped below by theme.

AI Model Management:

Added a new models domain package (apps/jan-api-gateway/application/app/domain/organization/models/model.go) defining types for model creation, resource requirements, model status, filtering, and cluster/GPU resource summaries. This enables structured management of AI models within organizations.
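As a rough illustration of what such domain types could look like, here is a minimal sketch in Go. The JSON field names are taken from the example response posted later in this thread; the Go type names, the `ModelStatusCreating` constant, and the `decodeModelResponse` helper are hypothetical and not necessarily what `model.go` defines.

```go
package main

import (
	"encoding/json"
	"time"
)

// ModelStatus is a hypothetical string enum. Only "creating" appears in the
// example response; other values are not shown in this PR description.
type ModelStatus string

const ModelStatusCreating ModelStatus = "creating"

// GPURequirements mirrors the "gpu" object of the example response.
type GPURequirements struct {
	MinVRAM       string `json:"min_vram"`
	PreferredVRAM string `json:"preferred_vram"`
	GPUType       string `json:"gpu_type"`
	MinGPUs       int    `json:"min_gpus"`
	MaxGPUs       int    `json:"max_gpus"`
}

// ResourceRequirements mirrors the "requirements" object.
type ResourceRequirements struct {
	CPU    string          `json:"cpu"`
	Memory string          `json:"memory"`
	GPU    GPURequirements `json:"gpu"`
}

// Model mirrors the "model" object of the example response.
type Model struct {
	ID              string               `json:"id"`
	OrganizationID  int64                `json:"organization_id"`
	DisplayName     string               `json:"display_name"`
	Description     string               `json:"description"`
	Status          ModelStatus          `json:"status"`
	Version         string               `json:"version"`
	Requirements    ResourceRequirements `json:"requirements"`
	Namespace       string               `json:"namespace"`
	DeploymentName  string               `json:"deployment_name"`
	ServiceName     string               `json:"service_name"`
	Tags            []string             `json:"tags"`
	Managed         bool                 `json:"managed"`
	CreatedAt       time.Time            `json:"created_at"`
	UpdatedAt       time.Time            `json:"updated_at"`
	CreatedByUserID string               `json:"created_by_user_id"`
}

// modelResponse is the envelope used by the example response: {"model": {...}}.
type modelResponse struct {
	Model Model `json:"model"`
}

// decodeModelResponse unwraps the envelope and returns the model it carries.
func decodeModelResponse(data []byte) (Model, error) {
	var resp modelResponse
	if err := json.Unmarshal(data, &resp); err != nil {
		return Model{}, err
	}
	return resp.Model, nil
}
```

The timestamps in the example response are RFC 3339 strings with nanosecond precision, which `time.Time` decodes without a custom unmarshaller.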

Kubernetes Integration:

Introduced a new KubernetesService (apps/jan-api-gateway/application/app/infrastructure/kubernetes/kubernetes_service.go) that provides cluster connectivity, CRD checks, GPU node/resource discovery, and storage class validation. This service is essential for deploying and managing models on Kubernetes clusters.
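The GPU discovery step can be pictured as a reduction over the cluster's nodes. The sketch below is an assumption about the general approach, not the code in `kubernetes_service.go`: it uses plain structs instead of a Kubernetes client so that it stays self-contained, and the `NodeInfo`, `GPUSummary`, and `summarizeGPUs` names are hypothetical. The resource key `nvidia.com/gpu` is the extended resource name advertised by the NVIDIA device plugin.

```go
package main

import "strconv"

// gpuResourceName is the extended resource advertised by the NVIDIA device plugin.
const gpuResourceName = "nvidia.com/gpu"

// NodeInfo is a hypothetical, simplified view of a cluster node: its name and
// its allocatable resources as quantity strings (for example "2").
type NodeInfo struct {
	Name        string
	Allocatable map[string]string
}

// GPUSummary is a hypothetical aggregate of GPU capacity across the cluster.
type GPUSummary struct {
	GPUNodes  int // nodes that expose at least one allocatable GPU
	TotalGPUs int // sum of allocatable GPUs over those nodes
}

// summarizeGPUs counts the nodes that advertise GPUs and totals their GPUs.
// Nodes without the resource, or with a zero or unparsable count, are skipped.
func summarizeGPUs(nodes []NodeInfo) GPUSummary {
	var summary GPUSummary
	for _, node := range nodes {
		raw, ok := node.Allocatable[gpuResourceName]
		if !ok {
			continue
		}
		count, err := strconv.Atoi(raw)
		if err != nil || count <= 0 {
			continue
		}
		summary.GPUNodes++
		summary.TotalGPUs += count
	}
	return summary
}
```

A real implementation would read the node list from the cluster API and would likely also inspect node labels to determine the GPU model and memory; none of that is shown here.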

Dependency Injection Wiring:

Registered the new ModelService and Kubernetes-related services (NewKubernetesService, NewModelDeploymentManager) in the service and infrastructure provider sets, making them available for use throughout the application.

HTTP API Exposure:

Added new HTTP route providers for models and Kubernetes APIs, enabling external access to model management and cluster status endpoints.
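To make the shape of such a route concrete, here is a minimal sketch of a create-model endpoint using only the standard library's `net/http`. The path `/v1/organization/models` comes from the example request in this thread; everything else is an assumption for illustration: the gateway's actual router framework, middleware, validation rules, and status codes are not shown in this PR description, and `createModelRequest`, `parseCreateModelRequest`, `createModelHandler`, and `newRouter` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"errors"
	"io"
	"net/http"
)

// createModelRequest holds a subset of the fields from the example request.
type createModelRequest struct {
	Name     string `json:"name"`
	Image    string `json:"image"`
	Replicas int    `json:"replicas"`
	GPUCount int    `json:"gpu_count"`
}

// parseCreateModelRequest decodes and validates a request body.
// Requiring name and image is an assumption made for this sketch.
func parseCreateModelRequest(body []byte) (createModelRequest, error) {
	var req createModelRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return req, errors.New("invalid JSON body")
	}
	if req.Name == "" || req.Image == "" {
		return req, errors.New("name and image are required")
	}
	return req, nil
}

// createModelHandler accepts POST requests and echoes back a model stub in
// the {"model": {...}} envelope used by the example response.
func createModelHandler(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		w.Header().Set("Allow", http.MethodPost)
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	// Cap the body at 1 MiB so an oversized payload cannot exhaust memory.
	body, err := io.ReadAll(http.MaxBytesReader(w, r.Body, 1<<20))
	if err != nil {
		http.Error(w, "could not read body", http.StatusBadRequest)
		return
	}
	req, err := parseCreateModelRequest(body)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	type modelStub struct {
		ID     string `json:"id"`
		Status string `json:"status"`
	}
	resp := struct {
		Model modelStub `json:"model"`
	}{Model: modelStub{ID: req.Name, Status: "creating"}}
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(resp)
}

// newRouter registers the handler on the path used in the example request.
func newRouter() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/v1/organization/models", createModelHandler)
	return mux
}
```

Keeping parsing and validation in a separate function from the handler makes the rules testable without starting a server.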

General Integration:

Updated imports in the relevant provider and route files to include the new model and Kubernetes modules, ensuring all new functionality is properly wired into the application.

Issue: #128


hiento09 · 2025-10-02T04:58:56Z

Example request:

curl -X 'POST' \
  'http://localhost:64185/v1/organization/models' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer *****' \
  -H 'Content-Type: application/json' \
  -d '{
  "command": [
    "sh", "-c", "python3 -m vllm.entrypoints.openai.api_server --host 0.0.0.0 --port 8000 --uvicorn-log-level warning --model janhq/Jan-v1-2509 --served-model-name jan-v1-4b --max-num-batched-tokens 16384 --enable-auto-tool-choice --max-model-len 131072 --tool-call-parser hermes --reasoning-parser qwen3 --compilation-config '\''{\"cudagraph_mode\":\"FULL_AND_PIECEWISE\",\"compile_sizes\":[1,2,4]}'\'' --async-scheduling --api-server-count 4"
  ],
  "description": "jan-v1-4b",
  "display_name": "jan-v1-4b",
  "gpu_count": 1,
  "image": "registry.menlo.ai/dockerhub/vllm/vllm-openai:v0.10.2",
  "initial_delay_seconds": 600,
  "name": "jan-v1-4b",
  "replicas": 1,
  "storage_size": 30,
  "tags": []
}'

Response:

{
  "model": {
    "id": "jan-v1-4b",
    "organization_id": 1,
    "display_name": "jan-v1-4b",
    "description": "jan-v1-4b",
    "status": "creating",
    "version": "",
    "requirements": {
      "cpu": "1",
      "memory": "2Gi",
      "gpu": {
        "min_vram": "8Gi",
        "preferred_vram": "16Gi",
        "gpu_type": "nvidia",
        "min_gpus": 1,
        "max_gpus": 1
      }
    },
    "namespace": "jan-models",
    "deployment_name": "jan-v1-4b",
    "service_name": "jan-v1-4b",
    "tags": [],
    "managed": true,
    "created_at": "2025-10-02T04:24:27.173970526Z",
    "updated_at": "2025-10-02T04:24:27.173970831Z",
    "created_by_user_id": "user_igjyupzhnmh56bh9x3hj2v80"
  }
}

hiento09 · 2025-10-09T03:07:05Z

Closing due to a domain route conflict.

hiento09 added 8 commits September 30, 2025 16:46

feat: organization models

28ed5ee

feat: organization kubernetes

d195a64

chore: fix error not able to pull binami postgres image

5be0971

chore: add docs

a1487a4

Merge branch 'feat/organization-models' of github.com:menloresearch/j…

d581b88

…an-server into feat/organization-models

chore: always create SA

f04376f

chore: add middleware for models api

5408721

chore: refactor code

3cbe33c

hiento09 requested review from jjchen01 and locnguyen1986 October 2, 2025 04:48

hiento09 self-assigned this Oct 2, 2025

chore: refactor docs and code

b2defa0

hiento09 force-pushed the feat/organization-models branch from 6a3e45a to b2defa0 Compare October 2, 2025 04:51

hiento09 closed this Oct 9, 2025
