Testing
TitanX uses Vitest 4 as the test runner. The coverage target is ≥ 80%. CI enforces the target; local runs don't block on it.
bun run test # watch mode
bun run test:coverage # one-shot with coverage report
Watch mode is fast: Vitest picks up TypeScript changes and re-runs affected tests. Average feedback loop: 1-3 seconds.
For simple unit tests, co-locate .test.ts next to the module under test:
src/process/services/reasoningBank/
├── index.ts
└── index.test.ts
When a test spans multiple modules or exercises an integration path:
src/process/services/__tests__/
└── fleet-integration.test.ts
- <module>.test.ts: unit tests for <module>.ts
- <module>.integration.test.ts: integration tests
- <feature>.e2e.test.ts: end-to-end tests (rare; mostly run via bun run test:e2e)
import { describe, it, expect, beforeEach } from 'vitest';
import { SqliteTestHarness } from '@process/services/database/__tests__/testHarness';
import { storeTrajectory, findSimilarTrajectories } from '../index';

describe('reasoningBank', () => {
  let db: SqliteTestHarness;

  beforeEach(async () => {
    db = new SqliteTestHarness();
    await db.init();
  });

  describe('storeTrajectory', () => {
    it('stamps failure_pattern=1 when input.failurePattern is true', () => {
      const id = storeTrajectory(db.driver, {
        taskDescription: 'test task',
        steps: [{ toolName: 'git.diff', args: {}, result: '...', durationMs: 0 }],
        successScore: 0.3,
        failurePattern: true,
      });
      const row = db.driver
        .prepare('SELECT failure_pattern FROM reasoning_bank WHERE id = ?')
        .get(id) as { failure_pattern: number };
      expect(row.failure_pattern).toBe(1);
    });

    it('honors workspace_id isolation in retrieval', () => {
      storeTrajectory(db.driver, {
        taskDescription: 'project-a task',
        steps: [{ toolName: 'x', args: {}, result: 'r', durationMs: 0 }],
        successScore: 0.9,
        workspaceId: 'ws-a',
      });
      storeTrajectory(db.driver, {
        taskDescription: 'project-b task',
        steps: [{ toolName: 'x', args: {}, result: 'r', durationMs: 0 }],
        successScore: 0.9,
        workspaceId: 'ws-b',
      });
      const results = findSimilarTrajectories(db.driver, 'project', 10, 'ws-a');
      expect(results.map((r) => r.taskDescription)).toEqual([expect.stringContaining('project-a')]);
    });
  });
});
Principles on display:
- Clear describe hierarchy
- Each it tests one behavior
- Setup in beforeEach
- Arrange → Act → Assert structure
- Assertions are specific (.toBe(1), not .toBeTruthy())
Pure logic, no side effects.
it('formats cost correctly', () => {
  expect(formatCostCents(12345)).toBe('$123.45');
});
Fast, deterministic, no mocks needed (usually).
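The test above shows only the call site. As a minimal sketch of a pure function that would satisfy it, assuming the input is an integer number of cents and the output is a US-dollar string (the real formatCostCents is not shown on this page and may behave differently):

```typescript
// Hypothetical implementation for illustration only.
// Assumes integer cents in, "$D.CC" string out.
function formatCostCents(cents: number): string {
  if (!Number.isInteger(cents)) {
    throw new RangeError(`expected integer cents, got ${cents}`);
  }
  const sign = cents < 0 ? '-' : '';
  const abs = Math.abs(cents);
  const dollars = Math.floor(abs / 100);
  const remainder = abs % 100;
  // Left-pad single-digit remainders so 5 cents renders as "05".
  const paddedRemainder = (remainder < 10 ? '0' : '') + remainder;
  return `${sign}$${dollars}.${paddedRemainder}`;
}

formatCostCents(12345); // '$123.45'
formatCostCents(5);     // '$0.05'
```

Because the function reads no clock, file, or network state, the same input always produces the same output, which is what makes this kind of test mock-free.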
Multi-module flows with a real SQLite test harness + mocked external services.
it('routes trajectory through push → ingest → dream → broadcast → apply', async () => {
  const slaveDb = new SqliteTestHarness();
  const masterDb = new SqliteTestHarness();
  // ... set up slave + master, simulate enrollment ...
  storeTrajectory(slaveDb.driver, { /* ... */ });
  const envelope = buildLearningEnvelope(slaveDb.driver, 0, Date.now());
  ingestLearningEnvelope(masterDb.driver, 'slave-a', envelope);
  await runDreamPass(masterDb.driver);
  // ... verify consolidated_learnings has the expected row ...
});
Slower but high-value. Exercise realistic paths.
Playwright-driven. Launch the Electron app, click through UI, verify rendered state.
bun run test:e2e
Reserved for critical flows (first launch, team creation, agent hire). Most features don't need E2E.
Mock the edges of your unit (filesystem, network, IPC) and test the real internal logic.
// ✅ Mock the IPC bridge
vi.mock('@/common', () => ({
  ipcBridge: { fleet: { getMode: { invoke: vi.fn(() => Promise.resolve('slave')) } } },
}));
// ❌ Don't mock the module you're testing
vi.mock('../index', () => ({ storeTrajectory: vi.fn() })); // defeats the point
If you're mocking 5+ things, you're probably testing the mocks. Consider:
- Restructuring the code for testability (dependency injection)
- Using a test harness (real SQLite, fake filesystem) instead of mocks
- Testing at a higher level (integration, not unit)
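The first option, restructuring for dependency injection, can be sketched as follows. All names here (ModeSource, describeFleetMode) are hypothetical and are not part of TitanX's API; the point is only that a function which receives its collaborator as an argument can be tested with a plain object instead of a module mock.

```typescript
// Hypothetical names for illustration; not TitanX's actual API.
interface ModeSource {
  getMode(): 'master' | 'slave';
}

// The unit under test receives its dependency instead of importing it,
// so a test never has to replace a module.
function describeFleetMode(source: ModeSource): string {
  const mode = source.getMode();
  return mode === 'master' ? 'Coordinating fleet' : 'Enrolled as worker';
}

// In a test, a plain object stands in for the real bridge.
const fakeSource: ModeSource = { getMode: () => 'slave' };
const label = describeFleetMode(fakeSource);
```

Production code would pass an adapter around the real IPC bridge; tests pass a literal. Under this sketch, the number of module-level mocks for the function drops to zero.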
Real in-memory SQLite database for tests that need persistence:
import { SqliteTestHarness } from '@process/services/database/__tests__/testHarness';
beforeEach(async () => {
  db = new SqliteTestHarness();
  await db.init(); // runs all migrations
});
Used in filesystem-touching tests:
const fs = new MockFileSystem({ '/tmp/test.txt': 'hello' });
For renderer tests that need IPC responses:
const ipc = new FakeIpcBridge();
ipc.register('fleet.getMode', () => 'master');
≥ 80% across src/process/services/, src/common/, src/renderer/hooks/.
Pages and components have looser targets (UI surface is mostly tested via E2E, plus visual inspection).
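One way such a target could be enforced is through Vitest's coverage thresholds. The following is a sketch of a possible vitest.config.ts, not the project's actual configuration: the v8 provider, the reporter list, and the choice to apply 80% to all four metrics are assumptions.

```typescript
// vitest.config.ts (sketch; the project's real config may differ)
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    coverage: {
      provider: 'v8', // assumption: which provider the project uses is not stated here
      reporter: ['text', 'html'], // the html reporter writes coverage/index.html
      // Limit the report to the directories that carry the 80% target.
      include: ['src/process/services/**', 'src/common/**', 'src/renderer/hooks/**'],
      // The run fails when any metric falls below its threshold.
      thresholds: { lines: 80, functions: 80, branches: 80, statements: 80 },
    },
  },
});
```

With thresholds set this way, a coverage run exits non-zero when a metric drops below 80%, which is how a CI step can fail on a coverage regression.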
bun run test:coverage
# Opens coverage/index.html
Drill into any file to see which lines are hit / missed. Unmet lines are usually:
- Edge-case branches — add a test
- Error paths — add a test that triggers the error
- Dead code — remove it
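As an illustration of covering an error path, here is a small sketch. parseTimeoutMs and captureError are hypothetical helpers written for this example; they do not come from the TitanX codebase.

```typescript
// Hypothetical helper used only to illustrate an error path.
function parseTimeoutMs(raw: string): number {
  const value = Number(raw);
  if (!Number.isFinite(value) || value <= 0) {
    // This branch stays uncovered unless a test feeds it bad input.
    throw new RangeError(`invalid timeout: ${raw}`);
  }
  return Math.floor(value);
}

// Runs fn and returns whatever it throws, or undefined if it completes.
function captureError(fn: () => unknown): unknown {
  try {
    fn();
  } catch (err) {
    return err;
  }
  return undefined;
}

const happyPath = parseTimeoutMs('1500'); // 1500
const errorPath = captureError(() => parseTimeoutMs('abc')); // a RangeError
```

In a Vitest test, the same check is usually written as expect(() => parseTimeoutMs('abc')).toThrow(RangeError), which marks the throwing branch as covered.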
Excluding code from coverage is rarely needed. When it is, use an inline ignore comment:
/* c8 ignore next 5 */
if (process.env.NODE_ENV === 'production') {
  // production-only path, not testable in unit tests
  preloadMetrics();
}
# By file
bun run test reasoningBank
# By describe/it name (pattern)
bun run test -t 'honors workspace_id'
# Watch a specific file only
bun run test --watch reasoningBank/index.test.ts
GitHub Actions runs on every PR:
- bun install
- bunx tsc --noEmit: fails on type errors
- bun run lint: fails on errors (warnings OK)
- bun run test:coverage: fails if coverage drops below 80%
- bun run i18n:types && node scripts/check-i18n.js: fails on locale drift
- prek run: CI equivalent of local prek check
All 6 must pass to merge. Skipping tests in a PR requires a human reviewer's explicit approval in the PR thread.
// ❌ Tests the inner mechanism
expect(spy).toHaveBeenCalledWith({ step1: ..., step2: ... });
// ✅ Tests the outcome
expect(result.finalState).toEqual(expected);
Refactors shouldn't break tests. If they do, you tested the implementation, not the behavior.
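To make the distinction concrete, the sketch below shows two interchangeable implementations of one hypothetical function (neither is taken from the TitanX codebase). An assertion on the returned value holds for both, while an assertion on how the result was computed would have to change with the refactor.

```typescript
// Hypothetical example: two implementations with identical behavior.
interface Step {
  durationMs: number;
}

// Original version: explicit loop.
function totalDurationLoop(steps: Step[]): number {
  let total = 0;
  for (const step of steps) {
    total += step.durationMs;
  }
  return total;
}

// Refactored version: same contract, different mechanism.
function totalDurationReduce(steps: Step[]): number {
  return steps.reduce((sum, step) => sum + step.durationMs, 0);
}

const sampleSteps: Step[] = [{ durationMs: 120 }, { durationMs: 30 }];

// An outcome assertion is satisfied by either version.
const before = totalDurationLoop(sampleSteps);   // 150
const after = totalDurationReduce(sampleSteps);  // 150
```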
Tests own their cleanup. beforeEach for setup; afterEach for teardown if needed. No test should leave state that affects the next.
Zero tolerance. If a test is flaky, it's broken. Fix it or delete it; don't mask it with CI retries.
Common causes: time-based logic (use fake timers), network calls (mock them), and process race conditions (await the async work).
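For time-based logic, the sketch below uses an injected clock, which is a different technique from Vitest's fake timers but removes the same source of flakiness. isLeaseExpired and the Clock type are hypothetical names for this example. For code that reads Date.now() directly, vi.useFakeTimers() together with vi.setSystemTime() gives comparable control without changing the function's signature.

```typescript
// Hypothetical helper: the clock is a parameter, so tests control time.
type Clock = () => number;

function isLeaseExpired(issuedAtMs: number, ttlMs: number, now: Clock = Date.now): boolean {
  return now() - issuedAtMs >= ttlMs;
}

// A controllable clock for tests: no sleeping, no wall-clock dependence.
let fakeNow = 1000;
const clock: Clock = () => fakeNow;

const freshLease = isLeaseExpired(1000, 500, clock); // false: 0 ms elapsed
fakeNow = 1500;
const staleLease = isLeaseExpired(1000, 500, clock); // true: exactly 500 ms elapsed
```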
it('migration v73 adds source_tag to agent_memory', async () => {
  const db = new SqliteTestHarness({ migrateToVersion: 72 });
  await db.init();
  // confirm column doesn't exist yet
  expect(() => db.driver.prepare('SELECT source_tag FROM agent_memory').get()).toThrow();
  await db.migrateTo(73);
  // column exists, rows readable
  db.driver.prepare(
    'INSERT INTO agent_memory (id, agent_slot_id, ..., source_tag) VALUES (?, ?, ..., ?)'
  ).run('1', 'slot-x', ..., 'fleet_consolidated');
  const row = db.driver.prepare('SELECT source_tag FROM agent_memory WHERE id = ?').get('1');
  expect(row.source_tag).toBe('fleet_consolidated');
});
it('rejects agent.execute from workforce-role slave with not_farm_role', async () => {
  const slaveDb = await setupSlaveInRole('workforce');
  const envelope = buildSignedEnvelope({
    commandType: 'agent.execute',
    targetDeviceId: 'slave-a',
    params: { jobId: 'j1', slotId: 's1', messages: [], model: 'claude', timeoutMs: 10000 },
  });
  const result = await executeFleetCommand(slaveDb.driver, envelope);
  expect(result).toEqual({
    ok: false,
    reason: 'not_farm_role',
  });
});
it('drops trajectory containing sk-ant- API key', async () => {
  const slaveDb = new SqliteTestHarness();
  await slaveDb.init();
  storeTrajectory(slaveDb.driver, {
    taskDescription: 'Call API with key sk-ant-api03-some-secret-value',
    steps: [{ toolName: 'web.fetch', args: {}, result: 'ok', durationMs: 0 }],
    successScore: 0.9,
  });
  const envelope = buildLearningEnvelope(slaveDb.driver, 0, Date.now());
  // trajectory should have been dropped by entropy audit
  expect(envelope?.trajectories).toEqual([]);
});
- Development Setup: test commands
- Code Conventions: style rules
- Pull Request Workflow: how tests gate merges
- .claude/skills/testing/SKILL.md: canonical skill
TitanX · Enterprise AI Agent Orchestration · Apache-2.0
Last updated for v2.5.1