Complete guide to testing the Windmill MCP Server.
- Overview
- Test Strategy
- Quick Start
- Unit Testing
- Integration Testing
- E2E Testing
- CI/CD Integration
- Troubleshooting
This project uses Vitest as the test framework with three levels of testing:
- Unit Tests: Fast, isolated tests with mocks (no external dependencies)
- Integration Tests: Component integration tests
- E2E Tests: Full end-to-end tests against a real Windmill instance
| Test Type | Use When | Speed | External Deps | Run Frequency |
|---|---|---|---|---|
| Unit | Testing logic, functions, utilities | ⚡ Fast | ❌ None | Every commit |
| Integration | Testing component interactions | 🏃 Medium | | Before merge |
| E2E | Testing complete workflows | 🐌 Slow | ✅ Real Windmill | Before release |
            /\
           /E2E\       Few E2E tests (critical paths)
          /------\
         /  Int   \    Some integration tests
        /----------\
       /    Unit    \  Many unit tests (fast feedback)
      ----------------
npm test
# Unit tests only (fast)
npm run test:unit
# E2E tests (requires Windmill)
npm run test:e2e
# Full E2E suite with setup
npm run test:e2e:full
# Run tests on file changes
npm run test:watch
# Interactive UI
npm run test:ui
npm run test:coverage
Unit tests are fast and don't require external services.
Use the MockWindmillClient from tests/utils/mocks.js:
import { describe, it, expect } from "vitest";
import { MockWindmillClient, mockJob } from "../utils/mocks.js";
describe("Job Handler", () => {
  it("should process job data", async () => {
    const client = new MockWindmillClient();
    const jobs = await client.listJobs("workspace");
    expect(jobs).toBeInstanceOf(Array);
    expect(jobs[0]).toMatchObject({
      id: expect.any(String),
      status: "completed",
    });
  });
});
- `MockWindmillClient` - Complete mock Windmill client
- `mockJob`, `mockScript`, `mockWorkflow` - Pre-defined test data
- `createMockToolRequest`, `createMockToolResponse` - MCP helpers
- `mockFetch` - HTTP fetch mocker
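To make the idea of a mock client concrete, here is a minimal sketch of what such a class could look like. It is not the contents of tests/utils/mocks.js: the class name `SketchMockWindmillClient`, the `calls` log, and the sample job fields are all assumptions made for illustration.

```javascript
// Hypothetical stand-in for a mock Windmill client. The real
// MockWindmillClient in tests/utils/mocks.js may look quite different.
const sampleJob = { id: "job-123", status: "completed" };

class SketchMockWindmillClient {
  constructor(jobs = [sampleJob]) {
    this.jobs = jobs;
    this.calls = []; // log of every method call, for later assertions
  }

  // Async to mirror a real HTTP-backed client, but returns canned data.
  async listJobs(workspace) {
    this.calls.push({ method: "listJobs", args: [workspace] });
    return this.jobs.map((job) => ({ ...job, workspace }));
  }
}
```

Recording calls on the instance keeps the mock free of any test-framework dependency, so a test can check both the returned data and how the client was used.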
# All unit tests
npm run test:unit
# Specific file
npx vitest tests/unit/mocks.test.js
# Watch mode
npx vitest tests/unit --watch
Integration tests verify component interactions.
import { describe, it, expect, beforeEach } from "vitest";
// createTestServer is assumed to be a project test helper that wires the MCP
// server to a mocked Windmill client; import it from wherever it is defined.
describe("MCP Tool Integration", () => {
  let server;
  beforeEach(() => {
    // Set up a fresh MCP server with a mocked client for every test
    server = createTestServer();
  });
  it("should handle list_jobs tool call", async () => {
    const request = {
      method: "tools/call",
      params: {
        name: "list_jobs",
        arguments: { workspace: "test" },
      },
    };
    const response = await server.handleRequest(request);
    expect(response).toHaveProperty("content");
    expect(response.isError).toBe(false);
  });
});
E2E tests run against a real Windmill instance using Docker.
- Direct API Tests (`windmill-api.test.js`) - Test Windmill API directly
- MCP Integration Tests (`mcp-integration.test.js`) - Test via MCP protocol
The MCP integration tests verify the complete workflow:
MCP Client → MCP Server → Windmill API → Response
These tests:
- Use the auto-generated MCP server (490 tools from OpenAPI spec)
- Call tools via MCP protocol and validate actual data from Windmill
- Verify response structures (jobs, scripts, workflows, users, workspaces, resources)
- Test tool discovery (all generated tools are available)
- Validate error handling for invalid tools and arguments
- Query real data from the Windmill instance to ensure the integration works
Test Coverage:
- Version information retrieval
- Job listing and querying specific jobs
- Script listing with structure validation
- Workflow/flow listing with structure validation
- User information queries
- Workspace listing
- Resource listing
- Error handling for invalid requests
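As a rough illustration of calling a tool over MCP and validating the result, the sketch below builds a JSON-RPC `tools/call` request and unpacks a result's `content` array. The helper names `buildToolCall` and `readToolResult` are hypothetical, and the canned result is illustrative data, not real Windmill output.

```javascript
// Build a JSON-RPC 2.0 request for the MCP "tools/call" method.
function buildToolCall(id, name, args = {}) {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Validate a tool result and return its text items, parsed as JSON when possible.
function readToolResult(result) {
  if (!result || !Array.isArray(result.content)) {
    throw new Error("Tool result has no content array");
  }
  if (result.isError) {
    const message = result.content.map((item) => item.text ?? "").join("\n");
    throw new Error(`Tool call failed: ${message}`);
  }
  return result.content
    .filter((item) => item.type === "text")
    .map((item) => {
      try {
        return JSON.parse(item.text);
      } catch {
        return item.text; // not JSON, keep the raw text
      }
    });
}

// Illustrative usage with a canned result (assumed shape, not real data).
const request = buildToolCall(1, "list_jobs", { workspace: "admins" });
const cannedResult = { content: [{ type: "text", text: '[{"id":"job-1"}]' }], isError: false };
const [jobs] = readToolResult(cannedResult);
```

Keeping request construction and result parsing in small pure functions means the structure checks can be covered by unit tests, leaving the E2E suite to confirm that real data flows through.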
For local development and testing, use the superadmin secret from Docker:
# Start Windmill and wait for it to be ready
npm run docker:dev
# This will output:
# ✅ Windmill ready at http://localhost:8000
# Superadmin secret: test-super-secret
# Default workspace: admins
Then set up your environment:
# Option 1: Use superadmin secret (simplest for development)
export E2E_WINDMILL_URL=http://localhost:8000
export E2E_WINDMILL_TOKEN=test-super-secret
export E2E_WORKSPACE=admins
npm run test:e2e
# Option 2: Create a user token (recommended for production-like testing)
# 1. Access http://localhost:8000
# 2. Login/create account
# 3. User Settings → Tokens → Create token
# 4. Add to `.env`:
E2E_WINDMILL_URL=http://localhost:8000
E2E_WINDMILL_TOKEN=your-user-token
E2E_WORKSPACE=admins
Set up everything for development:
# 1. Start Windmill
npm run docker:dev
# 2. Generate the MCP server
npm run generate
# 3. In another terminal, run the MCP server
npm run dev
# The MCP server will be connected to your local Windmill instance
- Start Windmill:
npm run docker:up
- Wait for startup (30-60 seconds):
npm run docker:wait
- Get API Token:
  For Development (easiest):
  - Use the superadmin secret: `test-super-secret`
  - No need to create an account or login
  For Production-like Testing:
  - Access http://localhost:8000
  - Login/create account
  - User Settings → Tokens → Create token
  - Add to `.env`:
  E2E_WINDMILL_URL=http://localhost:8000
  E2E_WINDMILL_TOKEN=your-token-here
  E2E_WORKSPACE=admins
- Run E2E tests:
npm run test:e2e
- Cleanup:
# Stop containers but keep data
npm run docker:down
# Stop containers and remove all data (clean slate)
npm run docker:clean
Run everything automatically:
npm run test:e2e:full
This starts Windmill, waits for it to be ready, runs tests, and cleans up.
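The "wait for it to be ready" step can be pictured as a polling loop. The sketch below is not the implementation behind `npm run docker:wait`; it is a hypothetical helper, `waitForWindmill`, that polls the `/api/version` endpoint used elsewhere in this guide. The option names and defaults are assumptions.

```javascript
// Poll GET {baseUrl}/api/version until it answers OK or the deadline passes.
// fetchImpl and sleep are injectable so the helper can run without a server.
async function waitForWindmill(baseUrl, options = {}) {
  const {
    timeoutMs = 60_000,
    intervalMs = 2_000,
    fetchImpl = fetch,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;
  const deadline = Date.now() + timeoutMs;
  let attempts = 0;
  while (Date.now() < deadline) {
    attempts += 1;
    try {
      const response = await fetchImpl(`${baseUrl}/api/version`);
      if (response.ok) return attempts; // ready: report how many tries it took
    } catch {
      // Connection errors are expected while the container is still booting.
    }
    await sleep(intervalMs);
  }
  throw new Error(`Windmill not ready at ${baseUrl} after ${timeoutMs} ms`);
}
```

A helper like this could be awaited in a `beforeAll` hook so that E2E tests fail with a clear message instead of a series of connection errors.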
import { describe, it, expect, beforeAll } from "vitest";
// Run the suite only when both the URL and the token are configured
const isE2EEnabled =
  process.env.E2E_WINDMILL_URL && process.env.E2E_WINDMILL_TOKEN;
describe.skipIf(!isE2EEnabled)("Job Execution E2E", () => {
  const baseUrl = process.env.E2E_WINDMILL_URL;
  const token = process.env.E2E_WINDMILL_TOKEN;
  // Fall back to the default workspace when E2E_WORKSPACE is not set
  const workspace = process.env.E2E_WORKSPACE || "admins";
  beforeAll(async () => {
    // Verify Windmill is accessible
    const response = await fetch(`${baseUrl}/api/version`);
    expect(response.ok).toBe(true);
  });
  it("should execute a simple script", async () => {
    // Create and run a script
    const response = await fetch(
      `${baseUrl}/api/w/${workspace}/jobs/run/inline`,
      {
        method: "POST",
        headers: {
          Authorization: `Bearer ${token}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({
          language: "python3",
          content: 'print("Hello from E2E test")',
        }),
      },
    );
    expect(response.ok).toBe(true);
    const job = await response.json();
    expect(job).toHaveProperty("id");
  });
});
- ✅ Use `skipIf` to skip when Windmill is not available
- ✅ Clean up resources after tests
- ✅ Use unique identifiers to avoid conflicts
- ✅ Test critical paths only (E2E tests are slow)
- ❌ Don't test every edge case (use unit tests)
- ❌ Don't hardcode IDs (resources may not exist)
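Two of the practices above, unique identifiers and cleanup, can be sketched with small helpers. Both `uniqueName` and `createCleanupStack` are hypothetical names introduced for illustration; they are not part of the project or of Vitest.

```javascript
// Generate a name that is very unlikely to collide across test runs.
function uniqueName(prefix) {
  const stamp = Date.now().toString(36);
  const random = Math.random().toString(36).slice(2, 8);
  return `${prefix}_${stamp}_${random}`;
}

// Collect cleanup callbacks and run them in reverse order of registration.
// Failures are collected rather than thrown so later cleanups still run.
function createCleanupStack() {
  const tasks = [];
  return {
    defer(task) {
      tasks.push(task);
    },
    async runAll() {
      const errors = [];
      while (tasks.length > 0) {
        const task = tasks.pop();
        try {
          await task();
        } catch (error) {
          errors.push(error);
        }
      }
      return errors;
    },
  };
}
```

In a test file, each created resource would register its own deletion with `defer`, and an `afterAll` hook would call `runAll` once; reverse order means dependent resources are removed before the things they depend on.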
name: Tests
on: [push, pull_request]
jobs:
  unit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: "18"
      - run: npm ci
      - run: npm run test:unit
  e2e:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: "18"
      - run: npm ci
      - name: Start Windmill
        run: npm run docker:up
      - name: Wait for Windmill
        run: npm run docker:wait
      - name: Run E2E Tests
        run: npm run test:e2e
        env:
          E2E_WINDMILL_URL: http://localhost:8000
          E2E_WINDMILL_TOKEN: ${{ secrets.E2E_TOKEN }}
          E2E_WORKSPACE: admins
      - name: Stop Windmill
        if: always()
        run: npm run docker:down
Problem: Cannot find module 'vitest'
Solution: Install dependencies:
npm install
Problem: All E2E tests show as skipped
Solution: Set environment variables:
export E2E_WINDMILL_URL=http://localhost:8000
export E2E_WINDMILL_TOKEN=your-token
npm run test:e2e
Problem: Error: port 8000 already in use
Solution: Stop the conflicting service:
lsof -i :8000
# Kill the process or use a different port
Problem: Tests fail with timeout errors
Solution: Increase timeout in vitest.config.js:
testTimeout: 60000, // 60 seconds
See tests/docker/README.md for Docker-specific troubleshooting.
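For the skipped-tests case in particular, it can help to print which variables are missing before the run starts. The following is a minimal sketch; `missingE2EVars` is a hypothetical helper, and the list of required names is taken from the variables this guide checks before enabling E2E tests.

```javascript
// Return the names of required E2E variables that are unset or blank.
function missingE2EVars(env) {
  const required = ["E2E_WINDMILL_URL", "E2E_WINDMILL_TOKEN"];
  return required.filter((name) => !env[name] || env[name].trim() === "");
}

// Warn up front instead of silently skipping the whole suite.
const missing = missingE2EVars(process.env);
if (missing.length > 0) {
  console.warn(`E2E tests will be skipped; missing: ${missing.join(", ")}`);
}
```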
npm run test:coverage
Coverage reports are generated in coverage/:
- `coverage/index.html` - HTML report (open in browser)
- `coverage/coverage-final.json` - JSON report
- Console output during test run
- Unit tests: > 80% coverage
- Integration tests: Critical paths covered
- E2E tests: Happy paths and key workflows
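As a rough way to check the unit-test goal by hand, statement coverage can be computed from the JSON report. This sketch assumes the report follows the Istanbul-style layout, where each file entry carries an `s` object mapping statement ids to hit counts; the function name `statementCoverage` and the sample data are illustrative only.

```javascript
// Compute statement coverage (0-100) from an Istanbul-style coverage map.
function statementCoverage(coverageMap) {
  let total = 0;
  let covered = 0;
  for (const file of Object.values(coverageMap)) {
    for (const hits of Object.values(file.s ?? {})) {
      total += 1;
      if (hits > 0) covered += 1;
    }
  }
  // Assumption: a report with no statements counts as fully covered.
  return total === 0 ? 100 : (covered * 100) / total;
}

// Made-up sample; real input would be the parsed coverage/coverage-final.json.
const sample = {
  "src/a.js": { s: { 0: 3, 1: 0, 2: 1 } },
  "src/b.js": { s: { 0: 1, 1: 1, 2: 2 } },
};
const percent = statementCoverage(sample);
const meetsGoal = percent > 80;
```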
- Keep tests independent - No shared state between tests
- Use descriptive names - Test names should explain what they verify
- Test behavior, not implementation - Focus on what, not how
- One assertion per test - Or at least one logical assertion
- Fast feedback - Keep unit tests fast
- Mock external dependencies - Don't hit real APIs in unit tests
- Mock at boundaries - Mock HTTP, not internal functions
- Verify mock interactions - Check that mocks were called correctly
- Reset mocks - Clean up between tests
- Test critical paths - Don't test everything
- Clean up resources - Delete test data after tests
- Use test-specific data - Don't interfere with real data
- Be resilient - Handle flaky network, timing issues
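One way to make E2E steps resilient to transient network or timing failures is a bounded retry with exponential backoff. The helper below is a sketch under assumed names and defaults (`withRetry`, `retries`, `baseDelayMs`); it is not part of the project or of Vitest.

```javascript
// Retry an async operation with exponential backoff.
// sleep is injectable so the helper can be tested without real delays.
async function withRetry(operation, options = {}) {
  const {
    retries = 3,
    baseDelayMs = 200,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt += 1) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        await sleep(baseDelayMs * 2 ** attempt); // 200, 400, 800, ... by default
      }
    }
  }
  throw lastError; // every attempt failed
}
```

Retries suit operations that are safe to repeat, such as polling a job's status; wrapping an assertion that fails deterministically would only hide a real bug and slow the suite down.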