
vllm-agent-sdk-go

Go SDK for building agentic applications backed by a local or self-hosted vLLM OpenAI-compatible server.

  • Package: vllmsdk
  • Default backend: http://127.0.0.1:8000/v1

Install

go get github.com/ethpandaops/vllm-agent-sdk-go

Configuration

The SDK resolves configuration from explicit options first, then environment variables, then defaults.

Environment Variables

  • VLLM_BASE_URL — vLLM server base URL. Default: http://127.0.0.1:8000/v1
  • VLLM_API_KEY — Bearer auth token (optional; only needed if your server enforces auth). Default: none.
  • VLLM_MODEL — Model name. No default; must be set via env or WithModel().
  • VLLM_AGENT_SESSION_STORE_PATH — Local session store directory. Default: none.

Example-only variables (not resolved by the core SDK):

  • VLLM_IMAGE_MODEL — Image-capable model for multimodal examples. Default: QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ
  • VLLM_VISION_MODEL — Vision model for multimodal input examples. Falls back to VLLM_IMAGE_MODEL, then VLLM_MODEL.
  • VLLM_IMAGE_OUTPUT_DIR — Directory for saving generated images. Default: none.

Option Precedence

All settings follow the same resolution order:

  1. Explicit option (e.g. WithBaseURL(...), WithAPIKey(...), WithModel(...))
  2. Environment variable (VLLM_BASE_URL, VLLM_API_KEY, VLLM_MODEL)
  3. Built-in default (where applicable)
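As a sketch of that precedence, an explicit option should win over the corresponding environment variable. The option names below come from the list above; the URL and model name are placeholders, and ctx is an existing context as in the Quick Start example:

```go
// Even if VLLM_BASE_URL and VLLM_MODEL are set in the environment,
// the explicit options take precedence for this call.
for msg, err := range vllmsdk.Query(
	ctx,
	vllmsdk.Text("ping"),
	vllmsdk.WithBaseURL("http://10.0.0.5:8000/v1"), // overrides VLLM_BASE_URL
	vllmsdk.WithModel("my-served-model"),           // overrides VLLM_MODEL
) {
	if err != nil {
		panic(err)
	}
	_ = msg
}
```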

Developer Workflow

The repo ships a Makefile in the style of its sibling SDKs:

  • make test runs race-enabled package tests with coverage output.
  • make test-integration runs ./integration/... with -tags=integration.
  • make audit runs the aggregate quality gate.

Integration setup:

  • Set VLLM_BASE_URL, or rely on the default http://127.0.0.1:8000/v1.
  • Set VLLM_MODEL to the model served by your vLLM instance.
  • Set VLLM_API_KEY if your vLLM server enforces bearer auth.
  • Integration tests skip when the local vLLM server is unavailable.
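A typical integration environment can be prepared like this (the model name and API key are placeholders for your own deployment):

```shell
export VLLM_BASE_URL="http://127.0.0.1:8000/v1"
export VLLM_MODEL="my-served-model"     # whatever your vLLM instance serves
export VLLM_API_KEY="sk-local-dev-key"  # only if the server enforces auth

make test              # race-enabled unit tests with coverage
make test-integration  # skips if the vLLM server is unreachable
```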

Quick Start

package main

import (
	"context"
	"fmt"
	"time"

	vllmsdk "github.com/ethpandaops/vllm-agent-sdk-go"
)

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
	defer cancel()

	// Model resolved from VLLM_MODEL env var, or set explicitly:
	for msg, err := range vllmsdk.Query(
		ctx,
		vllmsdk.Text("Write a short haiku about Go concurrency."),
		// vllmsdk.WithModel("QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ"),
	) {
		if err != nil {
			panic(err)
		}

		if result, ok := msg.(*vllmsdk.ResultMessage); ok && result.Result != nil {
			fmt.Println(*result.Result)
		}
	}
}

Surface

  • Query(ctx, content, ...opts) and QueryStream(...) return iter.Seq2[Message, error].
  • NewClient() exposes Start, StartWithContent, StartWithStream, Query, ReceiveMessages, ReceiveResponse, Interrupt, SetPermissionMode, SetModel, ListModels, ListModelsResponse, GetMCPStatus, RewindFiles, and Close.
  • Unsupported peer-parity controls such as ReconnectMCPServer, ToggleMCPServer, StopTask, and SendToolResult are present on Client and return a typed UnsupportedControlError.
  • UserMessageContent is the canonical input shape. Use Text(...) for text-only calls and Blocks(...) with ImageInput(...), FileInput(...), AudioInput(...), or VideoInput(...) for multimodal chat-completions requests.
  • WithSDKTools(...) registers high-level in-process tools under mcp__sdk__<name>.
  • WithOnUserInput(...) handles SDK-owned user-input prompts built on top of tool calling.
  • ListModels(...) and ListModelsResponse(...) use vLLM model discovery via /v1/models.
  • StatSession(...), ListSessions(...), and GetSessionMessages(...) operate on the SDK's local persisted session store.

Model Discovery

  • Discovery uses /v1/models.
  • Returned ModelInfo values are projected from the OpenAI-compatible model cards that vLLM serves, so provider-rich vLLM metadata is no longer guaranteed.
  • ModelInfo still exposes helper methods such as CostTier(), SupportsToolCalling(), SupportsStructuredOutput(), SupportsReasoning(), SupportsImageInput(), SupportsImageOutput(), SupportsWebSearch(), SupportsPromptCaching(), MaxContextLength(), and parsed pricing helpers.
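A discovery call might be used like the sketch below. The helper method names come from the list above, but the exact return shape of ListModels and the ModelInfo field used for the model name (ID here) are assumptions:

```go
// Sketch: enumerate served models and report tool-calling-capable ones.
client := vllmsdk.NewClient()
defer client.Close()

models, err := client.ListModels(ctx)
if err != nil {
	panic(err)
}
for _, m := range models {
	if m.SupportsToolCalling() {
		// ID and the exact return types of these helpers are illustrative.
		fmt.Printf("%s (context: %d, tier: %v)\n", m.ID, m.MaxContextLength(), m.CostTier())
	}
}
```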

Image Output

  • Generated images are surfaced as *ImageBlock values inside AssistantMessage.Content.
  • ImageBlock.Decode() returns raw bytes plus media type for data-URL-backed images.
  • ImageBlock.Save(path) writes generated images to disk.
  • Live image-generation coverage is available behind the integration build tag when VLLM_IMAGE_MODEL is set.
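Putting those pieces together, saving generated images might look like the sketch below. It assumes AssistantMessage.Content holds block values that can be type-asserted to *ImageBlock, as described above; the prompt and output paths are placeholders:

```go
// Sketch: persist any generated images carried in assistant messages.
for msg, err := range vllmsdk.Query(ctx, vllmsdk.Text("Generate a small logo.")) {
	if err != nil {
		panic(err)
	}
	am, ok := msg.(*vllmsdk.AssistantMessage)
	if !ok {
		continue
	}
	for i, block := range am.Content {
		if img, ok := block.(*vllmsdk.ImageBlock); ok {
			if err := img.Save(fmt.Sprintf("out-%d.png", i)); err != nil {
				panic(err)
			}
		}
	}
}
```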

Multimodal Input

Multimodal input in this SDK is block-based and targets the vLLM OpenAI-compatible chat surface.

content := vllmsdk.Blocks(
	vllmsdk.TextInput("Compare these two screenshots and the attached spec file."),
	vllmsdk.ImageInput("https://example.com/before.png"),
	vllmsdk.ImageInput("data:image/png;base64,..."),
	vllmsdk.FileInput("spec.pdf", "data:application/pdf;base64,..."),
)

for msg, err := range vllmsdk.Query(ctx, content,
	// vllmsdk.WithModel("QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ"),
) {
	if err != nil {
		panic(err)
	}
	_ = msg // handle AssistantMessage / ResultMessage values here
}

  • ImageInput(...) accepts a normal URL or a base64 data URL.
  • FileInput(...) accepts a filename plus file_data URL/data URL.
  • AudioInput(...) accepts base64 audio data plus a format.
  • VideoInput(...) accepts a normal URL or a data URL.
  • Responses mode is routed to the vLLM /v1/responses surface when selected.
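An audio-plus-text request might be built as in the sketch below. AudioInput's exact parameter order (base64 data, then format string) is an assumption based on the bullet above; the file name is a placeholder:

```go
// Sketch: attach a local audio clip as base64 data with an explicit format.
audio, err := os.ReadFile("clip.wav")
if err != nil {
	panic(err)
}
content := vllmsdk.Blocks(
	vllmsdk.TextInput("Transcribe this clip."),
	vllmsdk.AudioInput(base64.StdEncoding.EncodeToString(audio), "wav"),
)
```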

Session Semantics

Session APIs are local SDK APIs, not remote vLLM server sessions.

  • They read from the SDK session store configured with WithSessionStorePath(...) or VLLM_AGENT_SESSION_STORE_PATH.
  • They do not derive from chat session_id.
  • They do not derive from Responses previous_response_id.
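Wiring up the local store might look like the sketch below. It assumes ListSessions is available on Client and takes a context; the store path is illustrative:

```go
// Sketch: point the SDK at a local session store and list persisted sessions.
client := vllmsdk.NewClient(
	vllmsdk.WithSessionStorePath("/var/lib/myapp/vllm-sessions"),
)
defer client.Close()

sessions, err := client.ListSessions(ctx)
if err != nil {
	panic(err)
}
fmt.Printf("found %d persisted sessions\n", len(sessions))
```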

Unsupported Controls

vLLM does not have meaningful backend equivalents for some sibling control-plane methods. The SDK exposes those methods where peer parity matters, but they fail explicitly with UnsupportedControlError instead of faking semantics.
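Callers can detect these failures with errors.As, as in the sketch below. The argument shape of ReconnectMCPServer is an assumption; only the method and error type names come from the surface list above:

```go
// Sketch: an unsupported peer-parity control fails with a typed error.
client := vllmsdk.NewClient()
defer client.Close()

err := client.ReconnectMCPServer(ctx, "my-server")
var unsupported *vllmsdk.UnsupportedControlError
if errors.As(err, &unsupported) {
	log.Printf("control not supported by the vLLM backend: %v", unsupported)
}
```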

Examples

Runnable examples live under examples.
