Commit b7b3649

feat(qa): gate pipeline and wire auto QA creation
1 parent 735693d commit b7b3649

27 files changed

Lines changed: 405 additions & 67 deletions

.docker/docker-entrypoint.sh

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 # pass over a host projects directory can make startup appear hung.
 if [ "$(id -u)" = "0" ]; then
   install -d -o node -g node /data /home/node/.claude /home/node/.codex 2>/dev/null || true
-  chown -R node:node /data /home/node/.claude /home/node/.codex 2>/dev/null || true
+  chown -R node:node /data /home/node/.claude /home/node/.codex /home/node/.npm 2>/dev/null || true
   if [ -e /home/node/.claude.json ]; then
     chown node:node /home/node/.claude.json 2>/dev/null || true
   fi

.env.example

Lines changed: 8 additions & 0 deletions
@@ -242,6 +242,14 @@ LOG_LEVEL=debug
 # Enable only for runtimes/transports that advertise session fork support.
 # AIF_WARMUP_ENABLED=false
 
+# ----------------------------------------------------------
+# QA pipeline feature toggle (master switch)
+# ----------------------------------------------------------
+# Default false. When false the manual POST /tasks/:id/run-qa endpoint returns
+# 403 and the autoQa auto-trigger on approve_done is skipped. Enable to run the
+# /aif-qa --all pipeline (change-summary -> test-plan -> test-cases) for tasks.
+# AIF_QA_PIPELINE_ENABLED=false
+
 # ----------------------------------------------------------
 # Codex OAuth login in Docker (dev-only broker)
 # ----------------------------------------------------------
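
The toggle documented in this hunk is a boolean environment variable that stays off unless it is set explicitly. A minimal sketch of how such a flag could be read, assuming a hypothetical `readBooleanFlag` helper; the project's real parsing sits behind `getEnv()`, whose implementation this commit does not show:

```typescript
// Hypothetical helper, not part of the commit: the real parsing lives behind
// getEnv() in @aif/shared, whose implementation is not shown here.
type EnvSource = Record<string, string | undefined>;

function readBooleanFlag(env: EnvSource, name: string, fallback = false): boolean {
  const raw = env[name];
  if (raw === undefined) return fallback; // unset, or commented out in .env
  const value = raw.trim().toLowerCase();
  if (value === "true") return true;
  if (value === "false") return false;
  return fallback; // assumption: unrecognised values fall back instead of throwing
}

// The variable is commented out in .env.example, so the flag resolves to false.
const qaEnabledByDefault = readBooleanFlag({}, "AIF_QA_PIPELINE_ENABLED");

// Uncommenting the line and setting it to "true" turns the pipeline on.
const qaEnabledExplicitly = readBooleanFlag(
  { AIF_QA_PIPELINE_ENABLED: "true" },
  "AIF_QA_PIPELINE_ENABLED",
);
```

Passing the environment in as a plain object keeps the helper testable without mutating `process.env`.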

docs/api.md

Lines changed: 4 additions & 2 deletions
@@ -761,7 +761,9 @@ runner generates three markdown artifacts under `<paths.qa>/<branch-slug>/`
 (`change-summary.md`, `test-plan.md`, `test-cases.md`), persists them onto the task
 (`qaChangeSummary`, `qaTestPlan`, `qaTestCases`), and updates `qaStatus`. Execution
 uses the task's worktree root when present (`worktreePath`), otherwise the project
-root. The same pipeline runs automatically after `approve_done` when `autoQa = true`.
+root. Artifact slugs use the task's persisted `branchName` when present; branchless
+tasks fall back to the current git branch in the execution root. The same pipeline
+runs automatically after `approve_done` when `autoQa = true`.
 
 **Response:** `202 Accepted`
 
@@ -771,10 +773,10 @@ root. The same pipeline runs automatically after `approve_done` when `autoQa = t
 
 **Errors:**
 
+- `403` — QA pipeline feature flag is disabled (`AIF_QA_PIPELINE_ENABLED=false`); body carries `code: "feature_disabled"`
 - `404` — task not found
 - `404` — project not found
 - `409` — QA is already running (`qaStatus === "running"`)
-- `409` — task has no `branchName` (required to compute the artifact slug)
 
 **WebSocket events:** `task:qa_started` immediately, then `task:qa_done` or
 `task:qa_failed` when the run finishes. The runner also broadcasts `task:updated`
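
The documented status codes for `POST /tasks/:id/run-qa` can be read as a sequence of guards. The sketch below is one possible ordering, not the route's actual code: `decideRunQa`, its input shape, and the check order are assumptions made for illustration. Only the status codes, the `feature_disabled` code, and the `qaStatus === "running"` condition come from the documentation above.

```typescript
// Hypothetical decision function; the real handler is not shown in this commit.
type RunQaDecision =
  | { status: 202 }
  | { status: 403; code: "feature_disabled" }
  | { status: 404; error: string }
  | { status: 409; error: string };

interface RunQaInput {
  qaPipelineEnabled: boolean;
  task: { qaStatus: string; branchName: string | null } | null;
  projectExists: boolean;
}

function decideRunQa(input: RunQaInput): RunQaDecision {
  // Assumed order: the feature gate is checked before any lookups.
  if (!input.qaPipelineEnabled) return { status: 403, code: "feature_disabled" };
  if (input.task === null) return { status: 404, error: "task not found" };
  if (!input.projectExists) return { status: 404, error: "project not found" };
  if (input.task.qaStatus === "running") {
    return { status: 409, error: "QA is already running" };
  }
  // A missing branchName is no longer rejected; the runner resolves the branch later.
  return { status: 202 };
}
```

Note that a task with `branchName: null` now falls through to `202`, matching the removal of the second `409` entry.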

docs/configuration.md

Lines changed: 1 addition & 0 deletions
@@ -65,6 +65,7 @@ Node packages (`@aif/api`, `@aif/agent`, `@aif/data`, `@aif/shared`) auto-load e
 | `AGENT_CHAT_MAX_TURNS` | number | `50` | Maximum turns (tool calls) per chat session before the runtime terminates. Increase for complex multi-file tasks |
 | `AIF_USAGE_LIMITS_ENABLED` | boolean | `false` | Master switch for the usage-limits feature. When `false` (default): `/runtime-profiles` skips Codex indexed-head overlay refresh and the Claude provider-identity lookup; the runtime service skips `observeRuntimeLimitEvent` / `extractLatestRuntimeLimitSnapshot` / `extractRuntimeLimitSnapshotFromError` during every runtime run; the WebSocket `project:runtime_limit_updated` broadcast is suppressed; the agent's stage error handler does not persist `limitSnapshot` onto tasks; the chat API omits `runtimeLimitSnapshot` from responses; and the web UI hides all 5 usage-limit surfaces (USAGE button in Header, TaskCard badge, TaskDetailHeader badge, ProjectRuntimeSettings "Recent Limit Signals", Chat active-limit banner + `CHAT_USAGE_LIMIT` fallback). Persisted snapshots already in the DB are still returned on read but never updated. When `true`, Codex overlays and broadcasts are fed from the background SQLite index (`codex_limit_heads`/`codex_limit_history`) instead of request-path filesystem scans. Set to `true` only when you actively monitor rate-limit windows |
 | `AIF_WARMUP_ENABLED` | boolean | `false` | Master switch for project runtime warmup. When `false`, `/projects/:id/warmup` reports `enabled=false`, create requests return `403`, and the web UI hides Warmup entry points. When `true`, supported planner, implementer, and review runtimes can create time-limited seed sessions that later matching stage runs may fork. Warmup lifecycle logs use the `projects-route`, `api-runtime`, and runtime adapter log components and include project/runtime ids, TTL, expiry, and status without prompt text or secrets |
+| `AIF_QA_PIPELINE_ENABLED` | boolean | `false` | Master switch for the task QA pipeline. When `false` (default): the manual `POST /tasks/:id/run-qa` endpoint returns `403` with `code: "feature_disabled"`, and the `autoQa` auto-trigger on `approve_done` is skipped. When `true`, manual runs and `autoQa` tasks execute the `/aif-qa --all` pipeline (change-summary → test-plan → test-cases) via `runApiRuntimeOneShot` and persist the three artifacts on the task |
 | `AIF_ENABLE_CODEX_LOGIN_PROXY` | boolean | `false` | Enable the in-container Codex OAuth login broker and the api-side `/auth/codex/*` proxy. Dev-only. In production prefer `OPENAI_API_KEY`. See [Providers](providers.md#codex-oauth-login-in-docker-broker) |
 | `AIF_CODEX_LOGIN_BROKER_PORT` | number | `3010` | Port the Codex login broker binds inside the agent container (not mapped to the host by the dev compose) |
 | `AGENT_INTERNAL_URL` | string | `http://agent:3010` | Base URL the api uses to reach the agent-side Codex login broker over the docker network |
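
The new table row says the `autoQa` auto-trigger on `approve_done` is skipped while the master switch is off. A small sketch of that interaction, assuming a hypothetical `shouldAutoRunQa` predicate; only the flag, the `autoQa` field, and the `approve_done` event name are taken from the documentation:

```typescript
// Hypothetical predicate for illustration; the real trigger code is not shown
// in this commit.
interface QaTaskLike {
  autoQa: boolean;
}

function shouldAutoRunQa(
  event: string,
  task: QaTaskLike,
  qaPipelineEnabled: boolean,
): boolean {
  if (!qaPipelineEnabled) return false; // master switch off: never auto-trigger
  return event === "approve_done" && task.autoQa;
}
```

With the default configuration the predicate is always `false`, so tasks created with `autoQa: true` keep the setting but do not start a run until the flag is enabled.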

packages/agent/src/__tests__/hooks.test.ts

Lines changed: 1 addition & 0 deletions
@@ -70,6 +70,7 @@ function makeEnv(overrides: Record<string, unknown> = {}) {
     AIF_USAGE_LIMITS_ENABLED: false,
     AIF_STAGE_RUNTIME_PIN_ENABLED: false,
     AIF_WARMUP_ENABLED: false,
+    AIF_QA_PIPELINE_ENABLED: false,
     AIF_RUNTIME_CODEX_NATIVE_SUBAGENTS_ENABLED: false,
     AIF_TASK_WORKTREES_ENABLED: false,
     AIF_RUNTIME_SESSION_FORK_ENABLED: false,

packages/api/src/__tests__/qaRunner.test.ts

Lines changed: 7 additions & 4 deletions
@@ -91,12 +91,15 @@ describe("runQaQuery", () => {
     expect(mockRunApiRuntimeOneShot).not.toHaveBeenCalled();
   });
 
-  it("fails loud when task has no branchName (does not guess via git)", async () => {
+  it("falls back to the current git branch when task has no branchName", async () => {
     mockFindTaskById.mockReturnValue({ id: "t1", branchName: null, qaStatus: "idle" });
+    // executionRoot is a plain tmpdir (no git work tree), so the skill-mirrored
+    // `git branch --show-current` fallback resolves to "" → the "branch" slug.
+    const slug = computeQaBranchSlug("", root);
+    writeArtifacts(join(root, ".ai-factory/qa", slug));
     const res = await runQaQuery({ projectId: "p1", taskId: "t1", executionRoot: root });
-    expect(res.ok).toBe(false);
-    expect(res.error).toMatch(/branchName/);
-    expect(mockRunApiRuntimeOneShot).not.toHaveBeenCalled();
+    expect(res.ok).toBe(true);
+    expect(mockRunApiRuntimeOneShot).toHaveBeenCalled();
   });
 
   it("calls runtime with qa workflow contract", async () => {
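
The rewritten test relies on a fallback chain: persisted `branchName` first, then the current git branch, then the literal `"branch"` slug when git reports an empty string. The sketch below illustrates that chain under stated assumptions. `resolveBranchForSlug` and `toSlug` are hypothetical names, the slug normalisation is invented for the example, and the git lookup is injected as a callback rather than shelling out; the real `computeQaBranchSlug` is not shown in this commit.

```typescript
// Hypothetical helpers; only the fallback order and the "" -> "branch" outcome
// are taken from the test comment in the diff.
function resolveBranchForSlug(
  branchName: string | null,
  currentGitBranch: () => string,
): string {
  const persisted = branchName?.trim();
  if (persisted) return persisted; // task already carries a branch name
  return currentGitBranch().trim(); // e.g. output of `git branch --show-current`
}

function toSlug(branch: string): string {
  // Assumed normalisation: lowercase, non-alphanumeric runs become "-".
  const slug = branch
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return slug === "" ? "branch" : slug;
}

// Outside a git work tree the lookup yields "", which maps to "branch".
const noGitSlug = toSlug(resolveBranchForSlug(null, () => ""));

// A persisted branch name wins over whatever git would report.
const persistedSlug = toSlug(resolveBranchForSlug("feat/qa-gate", () => "main"));
```

Injecting the git lookup keeps the fallback testable in a plain temporary directory, which is the situation the test above sets up.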

packages/api/src/__tests__/runQaRoute.test.ts

Lines changed: 30 additions & 4 deletions
@@ -1,8 +1,13 @@
 import { describe, it, expect, beforeEach, vi } from "vitest";
 import { Hono } from "hono";
-import { projects, tasks } from "@aif/shared";
+import { projects, tasks, resetEnvCache } from "@aif/shared";
 import { createTestDb } from "@aif/shared/server";
 
+// QA routes are gated behind AIF_QA_PIPELINE_ENABLED (off by default). Enable it
+// before importing the route module — schemas.ts calls getEnv() at schema-definition
+// time, which caches the parsed env. The disabled-flag case below toggles it back.
+process.env.AIF_QA_PIPELINE_ENABLED = "true";
+
 const testDb = { current: createTestDb() };
 
 vi.mock("@aif/shared/server", async (importOriginal) => {
@@ -79,11 +84,16 @@ describe("POST /tasks/:id/run-qa", () => {
     expect(mockRunQaQuery).not.toHaveBeenCalled();
   });
 
-  it("returns 409 when task has no branchName", async () => {
+  it("returns 202 for a branchless task (runner resolves the branch via git)", async () => {
     seedTask({ branchName: null });
     const res = await app.request("/tasks/t1/run-qa", { method: "POST" });
-    expect(res.status).toBe(409);
-    expect(mockRunQaQuery).not.toHaveBeenCalled();
+    expect(res.status).toBe(202);
+    await new Promise((r) => setTimeout(r, 0));
+    expect(mockRunQaQuery).toHaveBeenCalledWith({
+      projectId: "p1",
+      taskId: "t1",
+      executionRoot: "/tmp/p1",
+    });
   });
 
   it("returns 202 and triggers the runner on a valid request", async () => {
@@ -110,4 +120,20 @@ describe("POST /tasks/:id/run-qa", () => {
       executionRoot: "/tmp/wt/t1",
     });
   });
+
+  it("returns 403 with feature_disabled when AIF_QA_PIPELINE_ENABLED is off", async () => {
+    seedTask();
+    process.env.AIF_QA_PIPELINE_ENABLED = "false";
+    resetEnvCache();
+    try {
+      const res = await app.request("/tasks/t1/run-qa", { method: "POST" });
+      expect(res.status).toBe(403);
+      const body = (await res.json()) as { code?: string };
+      expect(body.code).toBe("feature_disabled");
+      expect(mockRunQaQuery).not.toHaveBeenCalled();
+    } finally {
+      process.env.AIF_QA_PIPELINE_ENABLED = "true";
+      resetEnvCache();
+    }
+  });
 });
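
The test file sets the flag before importing the route and calls `resetEnvCache()` whenever it flips the value, because `getEnv()` caches the parsed environment. The sketch below shows why the reset is needed, using a hypothetical `createEnvCache` stand-in; the real `getEnv()` and `resetEnvCache()` live in `@aif/shared` and are not shown in this commit.

```typescript
// Hypothetical stand-in for the cached env reader; names mirror the diff but
// the implementation here is an assumption.
interface ParsedEnv {
  AIF_QA_PIPELINE_ENABLED: boolean;
}

function createEnvCache(source: Record<string, string | undefined>) {
  let cached: ParsedEnv | null = null;
  return {
    getEnv(): ParsedEnv {
      if (cached === null) {
        // Parse once, then serve the cached object on later calls.
        cached = { AIF_QA_PIPELINE_ENABLED: source.AIF_QA_PIPELINE_ENABLED === "true" };
      }
      return cached;
    },
    resetEnvCache(): void {
      cached = null; // the next getEnv() call re-reads the source
    },
  };
}

const fakeProcessEnv: Record<string, string | undefined> = {
  AIF_QA_PIPELINE_ENABLED: "true",
};
const envCache = createEnvCache(fakeProcessEnv);

const beforeToggle = envCache.getEnv().AIF_QA_PIPELINE_ENABLED; // parsed and cached
fakeProcessEnv.AIF_QA_PIPELINE_ENABLED = "false";
const staleRead = envCache.getEnv().AIF_QA_PIPELINE_ENABLED; // still the cached value
envCache.resetEnvCache();
const freshRead = envCache.getEnv().AIF_QA_PIPELINE_ENABLED; // re-parsed after reset
```

Without the reset, the disabled-flag test would keep seeing the cached `true`, which is why the `finally` block in the diff restores both the variable and the cache.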

packages/api/src/__tests__/settings.test.ts

Lines changed: 31 additions & 2 deletions
@@ -1,8 +1,9 @@
-import { beforeEach, describe, expect, it, vi } from "vitest";
+import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
 import { Hono } from "hono";
 import { mkdirSync, mkdtempSync, writeFileSync, rmSync, readFileSync } from "node:fs";
 import { join } from "node:path";
 import { tmpdir } from "node:os";
+import { resetEnvCache } from "@aif/shared";
 
 // Create a temp dir to act as monorepo root and home
 const tempRoot = mkdtempSync(join(tmpdir(), "settings-test-"));
@@ -107,6 +108,7 @@ vi.mock("@aif/data", () => ({
   isRuntimeProfileEligibleForAppDefaults: (runtimeProfileId: string | null) =>
     runtimeProfileId == null || eligibleAppDefaultProfileIds.has(runtimeProfileId),
   createDbUsageSink: () => ({ record: vi.fn() }),
+  listRuntimeProfiles: () => [],
 }));
 
 vi.mock("node:os", async (importOriginal) => {
@@ -117,7 +119,7 @@ vi.mock("node:os", async (importOriginal) => {
   };
 });
 
-const { settingsRoutes } = await import("../routes/settings.js");
+const { settingsRoutes, buildSettingsOverview } = await import("../routes/settings.js");
 
 function createApp() {
   const app = new Hono();
@@ -514,3 +516,30 @@ describe("settings API — config routes", () => {
     });
   });
 });
+
+describe("settings overview — QA pipeline flag", () => {
+  const original = process.env.AIF_QA_PIPELINE_ENABLED;
+
+  afterEach(() => {
+    if (original === undefined) {
+      delete process.env.AIF_QA_PIPELINE_ENABLED;
+    } else {
+      process.env.AIF_QA_PIPELINE_ENABLED = original;
+    }
+    resetEnvCache();
+  });
+
+  it("reports qaPipelineEnabled: true when AIF_QA_PIPELINE_ENABLED is on", async () => {
+    process.env.AIF_QA_PIPELINE_ENABLED = "true";
+    resetEnvCache();
+    const overview = await buildSettingsOverview();
+    expect(overview.qaPipelineEnabled).toBe(true);
+  });
+
+  it("reports qaPipelineEnabled: false when AIF_QA_PIPELINE_ENABLED is off", async () => {
+    process.env.AIF_QA_PIPELINE_ENABLED = "false";
+    resetEnvCache();
+    const overview = await buildSettingsOverview();
+    expect(overview.qaPipelineEnabled).toBe(false);
+  });
+});

packages/api/src/__tests__/tasks.test.ts

Lines changed: 31 additions & 0 deletions
@@ -460,6 +460,37 @@ describe("tasks API", () => {
     expect(body.useSubagents).toBe(false);
   });
 
+  it("should persist autoQa from create payload", async () => {
+    const res = await app.request("/tasks", {
+      method: "POST",
+      headers: { "Content-Type": "application/json" },
+      body: JSON.stringify({
+        title: "Task with auto QA",
+        projectId: "test-project",
+        autoQa: true,
+      }),
+    });
+
+    expect(res.status).toBe(201);
+    const body = await res.json();
+    expect(body.autoQa).toBe(true);
+  });
+
+  it("should default autoQa to false when omitted on create", async () => {
+    const res = await app.request("/tasks", {
+      method: "POST",
+      headers: { "Content-Type": "application/json" },
+      body: JSON.stringify({
+        title: "Task without auto QA",
+        projectId: "test-project",
+      }),
+    });
+
+    expect(res.status).toBe(201);
+    const body = await res.json();
+    expect(body.autoQa).toBe(false);
+  });
+
   it("should default useSubagents to AGENT_USE_SUBAGENTS env value", async () => {
     const { getEnv } = await import("@aif/shared");
     const envDefault = getEnv().AGENT_USE_SUBAGENTS;

packages/api/src/routes/settings.ts

Lines changed: 2 additions & 0 deletions
@@ -92,6 +92,7 @@ export async function buildSettingsOverview() {
     autoReviewStrategy: env.AGENT_AUTO_REVIEW_STRATEGY,
     usageLimitsEnabled: env.AIF_USAGE_LIMITS_ENABLED,
     warmupEnabled: env.AIF_WARMUP_ENABLED,
+    qaPipelineEnabled: env.AIF_QA_PIPELINE_ENABLED,
     runtimeReadiness: {
       availableRuntimeCount: registry.listRuntimes().length,
       runtimeProfileCount: runtimeProfiles.length,
@@ -114,6 +115,7 @@ export async function buildSettingsOverview() {
     autoReviewStrategy: env.AGENT_AUTO_REVIEW_STRATEGY,
     usageLimitsEnabled: env.AIF_USAGE_LIMITS_ENABLED,
     warmupEnabled: env.AIF_WARMUP_ENABLED,
+    qaPipelineEnabled: env.AIF_QA_PIPELINE_ENABLED,
     runtimeReadiness: {
       availableRuntimeCount: 0,
       runtimeProfileCount: allProfiles.length,