
[Feature]: Add /v1/models endpoint support for Disaggregated #12097

@zhouliang5266

Description

🚀 The feature, motivation and pitch

Problem

The OpenAIDisaggServer (Disaggregated Proxy) does not implement the /v1/models endpoint, making it incompatible with standard OpenAI API clients such as the OpenAI Python SDK,
LangChain, and other tools that rely on this endpoint for model discovery.

Current behavior:

curl http://localhost:8100/v1/models
Returns: {"detail":"Not Found"}

Expected behavior:

curl http://localhost:8100/v1/models
Returns: {"object":"list","data":[{"id":"model-name","object":"model","created":...,"owned_by":"tensorrt_llm"}]}

Solution

Add /v1/models endpoint support to OpenAIDisaggServer. The endpoint extracts the model name from the server configuration's model path and returns a ModelList response compatible
with the OpenAI API specification.
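For reference, the /v1/models response is a "list" object wrapping one model card per served model. A minimal stdlib-only sketch of the shape the endpoint should return (field values here are illustrative; the real implementation builds this via the ModelList/ModelCard protocol classes):

```python
import json
import time

def build_model_list(model_id: str) -> dict:
    # Mirrors the OpenAI /v1/models response shape: a "list" object
    # containing one model card per served model.
    return {
        "object": "list",
        "data": [
            {
                "id": model_id,
                "object": "model",
                "created": int(time.time()),
                "owned_by": "tensorrt_llm",
            }
        ],
    }

print(json.dumps(build_model_list("model-name"), indent=2))
```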

Changes

File modified: tensorrt_llm/serve/openai_disagg_server.py

  1. Add route registration in register_routes():

     self.app.add_api_route("/v1/models", self.get_models, methods=["GET"])

  2. Add get_models() method:

     async def get_models(self) -> JSONResponse:
         """Return model list compatible with OpenAI API /v1/models endpoint.

         This endpoint is added for compatibility with OpenAI API clients that
         require /v1/models to be available (e.g., OpenAI Python SDK, LangChain).
         """
         from tensorrt_llm.serve.openai_protocol import ModelList, ModelCard

         model_id = "unknown"
         if self._config.server_configs:
             model_path = self._config.server_configs[0].other_args.get("model", "")
             if model_path:
                 model_id = model_path.rstrip("/").split("/")[-1]

         model_list = ModelList(data=[ModelCard(id=model_id)])
         return JSONResponse(content=model_list.model_dump())
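The model-id derivation (last path component, tolerating a trailing slash, falling back to "unknown") can be exercised in isolation. `extract_model_id` below is a hypothetical helper that mirrors the logic inside `get_models()`; the example paths are illustrative:

```python
def extract_model_id(model_path: str) -> str:
    # Mirrors get_models(): take the last path component after stripping
    # any trailing slash; fall back to "unknown" when no path is configured.
    if not model_path:
        return "unknown"
    return model_path.rstrip("/").split("/")[-1]

# Local checkpoint directory, with and without a trailing slash:
assert extract_model_id("/models/Llama-3-8B/") == "Llama-3-8B"
assert extract_model_id("/models/Llama-3-8B") == "Llama-3-8B"
# Hugging-Face-style repo id resolves to the repo name:
assert extract_model_id("meta-llama/Llama-3-8B") == "Llama-3-8B"
# Unconfigured model path:
assert extract_model_id("") == "unknown"
```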

Testing

Test the new endpoint
curl http://localhost:8100/v1/models
Expected response: {"object":"list","data":[{"id":"model-name","object":"model","created":...,"owned_by":"tensorrt_llm"}]}

Compatibility

  • ✅ No impact on existing functionality
  • ✅ Backward compatible
  • ✅ Compatible with OpenAI Python SDK and other standard clients

Alternatives

No response

Additional context

No response

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and checked the documentation and examples for answers to frequently asked questions.

Metadata

Assignees

Labels

  • Disaggregated serving<NV>: Deploying with separated, distributed components (params, kv-cache, compute). Arch & perf.
  • OpenAI API: trtllm-serve's OpenAI-compatible API: endpoint behavior, req/resp formats, feature parity.
  • feature request: New feature or request. This includes new model, dtype, functionality support.
