Conversation

@waleedlatif1
Collaborator

Summary

  • Added vLLM provider

Type of Change

  • New feature

Testing

Tested manually

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel

vercel bot commented Nov 22, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
| Project | Deployment | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| docs | Skipped | Skipped | | Nov 22, 2025 10:43pm |

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Nov 22, 2025

Greptile Overview

Greptile Summary

This PR adds vLLM as a new self-hosted provider with OpenAI-compatible API support. The implementation follows the established patterns from the Ollama and OpenAI providers, providing comprehensive support for tool calling, streaming, response formats, and dynamic model discovery.

Key Changes:

  • New vLLM provider implementation (providers/vllm/index.ts) with full tool calling and streaming support
  • API route for dynamic model discovery from vLLM endpoints (a sketch follows this list)
  • Integration across all provider touchpoints (stores, hooks, UI components, blocks)
  • Environment configuration for VLLM_BASE_URL and optional VLLM_API_KEY
  • Proper handling of API key visibility (hidden for self-hosted vLLM, as it is for Ollama)
  • Official vLLM branding icon added to components and integrations page
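
To make the model-discovery change concrete, here is a minimal sketch of what a route along those lines could look like. It is an illustration only, assuming a Next.js app-router handler and the `VLLM_BASE_URL` / `VLLM_API_KEY` variables named above; the actual `apps/sim/app/api/providers/vllm/models/route.ts` may differ in its details.

```typescript
// Hypothetical sketch of a Next.js route handler that lists models from a vLLM server.
// Assumes VLLM_BASE_URL (and optional VLLM_API_KEY) as described in the PR summary;
// the real route in this PR may be structured differently.
import { NextResponse } from 'next/server'

export async function GET() {
  const baseUrl = process.env.VLLM_BASE_URL
  if (!baseUrl) {
    // No self-hosted endpoint configured: return an empty list rather than erroring.
    return NextResponse.json({ models: [] })
  }

  try {
    const headers: Record<string, string> = {}
    if (process.env.VLLM_API_KEY) {
      headers.Authorization = `Bearer ${process.env.VLLM_API_KEY}`
    }

    // vLLM exposes an OpenAI-compatible /v1/models endpoint.
    const res = await fetch(`${baseUrl}/v1/models`, { headers })
    if (!res.ok) {
      return NextResponse.json({ models: [] })
    }

    const body = (await res.json()) as { data?: Array<{ id: string }> }
    // Namespace ids under the vllm provider, e.g. "vllm/llama-3-8b".
    const models = (body.data ?? []).map((m) => `vllm/${m.id}`)
    return NextResponse.json({ models })
  } catch {
    // Fall back to an empty array on network errors.
    return NextResponse.json({ models: [] })
  }
}
```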

Implementation Quality:

  • Follows existing provider architecture patterns consistently
  • Proper error handling and fallback to empty arrays
  • Client-side initialization guard to avoid CORS issues
  • Comprehensive logging throughout the flow
  • Maintains parity with OpenAI provider features (tool usage control, response formats, timing metrics); a request sketch follows below
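
For readers unfamiliar with vLLM's OpenAI-compatible surface, the sketch below shows roughly how such a request might be built. It is a simplified, hypothetical example: the environment variable names come from the PR summary, but the helper name, types, and the omission of streaming, tool, and response-format handling are editorial, not the provider's actual code in `providers/vllm/index.ts`.

```typescript
// Minimal, hypothetical sketch of a non-streaming chat completion against a vLLM server.
// Field names follow the OpenAI-compatible API that vLLM exposes; everything beyond the
// env var names is illustrative.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant' | 'tool'
  content: string
}

async function vllmChatCompletion(model: string, messages: ChatMessage[]) {
  const baseUrl = process.env.VLLM_BASE_URL
  if (!baseUrl) throw new Error('VLLM_BASE_URL is not configured')

  const headers: Record<string, string> = { 'Content-Type': 'application/json' }
  if (process.env.VLLM_API_KEY) {
    headers.Authorization = `Bearer ${process.env.VLLM_API_KEY}`
  }

  const res = await fetch(`${baseUrl}/v1/chat/completions`, {
    method: 'POST',
    headers,
    body: JSON.stringify({
      // Strip the "vllm/" namespace prefix (assumed from the model list format) before
      // sending the raw model id to the server.
      model: model.replace(/^vllm\//, ''),
      messages,
      stream: false,
    }),
  })

  if (!res.ok) {
    throw new Error(`vLLM request failed: ${res.status}`)
  }
  return res.json()
}
```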

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The implementation is well-structured and follows established patterns from existing providers (Ollama, OpenAI). All touchpoints are properly integrated, error handling is comprehensive, and the code maintains consistency with the existing architecture. No logical errors, security issues, or breaking changes detected.
  • No files require special attention

Important Files Changed

File Analysis

| Filename | Score | Overview |
| --- | --- | --- |
| apps/sim/providers/vllm/index.ts | 5/5 | New vLLM provider implementation with OpenAI-compatible API, tool calling, and streaming support; follows established patterns from the OpenAI and Ollama providers |
| apps/sim/app/api/providers/vllm/models/route.ts | 5/5 | API route to fetch available models from the vLLM endpoint with proper error handling and caching |
| apps/sim/providers/utils.ts | 5/5 | Integrates the vLLM provider into the providers registry, adds the updateVLLMProviderModels helper, and excludes vLLM from the base model providers |
| apps/sim/blocks/blocks/agent.ts | 5/5 | Integrates vLLM models into the agent block model selector and API key visibility logic |
| apps/sim/app/workspace/[workspaceId]/providers/provider-models-loader.tsx | 5/5 | Adds vLLM to the provider synchronization flow with an updateVLLMProviderModels call (see the sketch after this table) |
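
A hedged sketch of the loader-side synchronization described in the last row above; the `updateVLLMProviderModels` signature and the guard shape are assumptions inferred from the file overviews, not the PR's actual code.

```typescript
// Hypothetical client-side sync: fetch the namespaced model list from the API route
// and push it into the providers store. Runs only in the browser, matching the
// "client-side initialization guard" noted in the review summary.
async function syncVLLMModels(
  updateVLLMProviderModels: (models: string[]) => void
) {
  if (typeof window === 'undefined') return // guard: skip during server-side rendering

  try {
    const res = await fetch('/api/providers/vllm/models')
    const { models } = (await res.json()) as { models: string[] }
    updateVLLMProviderModels(models) // e.g. ["vllm/llama-3-8b", ...]
  } catch {
    updateVLLMProviderModels([]) // fall back to an empty list on failure
  }
}
```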

Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant UI as Agent Block UI
    participant Loader as ProviderModelsLoader
    participant API as /api/providers/vllm/models
    participant vLLM as vLLM Server
    participant Provider as vllmProvider
    participant Store as ProvidersStore

    Note over User,Store: Initialization Flow
    Loader->>API: GET /api/providers/vllm/models
    API->>vLLM: GET /v1/models
    vLLM-->>API: Return {data: [{id: "model-name"}]}
    API-->>Loader: Return {models: ["vllm/model-name"]}
    Loader->>Store: updateVLLMProviderModels(models)
    Store->>Store: setProviderModels('vllm', models)

    Note over User,Store: Model Execution Flow
    User->>UI: Select vLLM model & configure agent
    UI->>Provider: executeRequest(request)
    Provider->>vLLM: POST /v1/chat/completions

    alt Streaming without tools
        vLLM-->>Provider: Stream chunks
        Provider-->>UI: Return StreamingExecution
        UI-->>User: Display streaming response
    else With tool calls
        vLLM-->>Provider: Response with tool_calls
        loop For each tool call
            Provider->>Provider: executeTool(toolName, params)
            Provider->>vLLM: POST /v1/chat/completions (with tool results)
        end
        vLLM-->>Provider: Final response
        Provider-->>UI: Return ProviderResponse
        UI-->>User: Display final result
    end
```
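
The "With tool calls" branch in the diagram corresponds roughly to a loop like the one below. This is a sketch only: `executeTool`, the `complete` helper, and the message bookkeeping are illustrative placeholders, not the provider's actual implementation.

```typescript
// Hypothetical follow-up loop for tool calls, mirroring the "With tool calls" branch above.
interface ToolCall {
  id: string
  function: { name: string; arguments: string }
}

async function runToolLoop(
  messages: Array<Record<string, unknown>>,
  complete: (messages: Array<Record<string, unknown>>) => Promise<any>,
  executeTool: (name: string, args: unknown) => Promise<unknown>
) {
  let response = await complete(messages)

  // Keep resolving tool calls until the model returns a plain answer.
  while (response.choices?.[0]?.message?.tool_calls?.length) {
    const assistantMessage = response.choices[0].message
    messages.push(assistantMessage)

    for (const call of assistantMessage.tool_calls as ToolCall[]) {
      const result = await executeTool(call.function.name, JSON.parse(call.function.arguments))
      // Feed each tool result back as a "tool" message keyed by the call id.
      messages.push({ role: 'tool', tool_call_id: call.id, content: JSON.stringify(result) })
    }

    response = await complete(messages)
  }

  return response
}
```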


@greptile-apps greptile-apps bot left a comment

14 files reviewed, no comments


@waleedlatif1 waleedlatif1 merged commit 6114c21 into staging Nov 22, 2025
9 checks passed
MagellaX added a commit to MagellaX/sim that referenced this pull request Nov 23, 2025
* Add vLLM self-hosted provider

* updated vllm to have full parity with openai, dynamically fetch models

---------

Co-authored-by: MagellaX <[email protected]>
