"id","category","title","status","priority","description","problem_observed","fix_applied","fix_commit_or_pr","evidence_files","source_files","spec_doc_section","next_action"
1,"DeepSeek closed-loop","DeepSeek API path: openai-responses → openai-completions","DONE","blocker","DeepSeek catalog preset routed chat through /v1/responses; api.deepseek.com only ships /v1/chat/completions","Friday's capability doctor returned 404 on text probes; chat phases SKIPped because text capability was never marked verified. Discovery via FULL gauntlet log showing dist stale.","Changed src/providers/model/friday-provider-catalog.ts deepseek preset from openai-responses to openai-completions; rebuilt dist after pulling.","PR #162 (merged 7e0e49df)","tests-overnight phase markers; .friday/health/gauntlet-fast.log","src/providers/model/friday-provider-catalog.ts; test/unit/providers/model/friday-provider-templates.test.ts; test/integration/hub/friday-hub-bootstrap-integration.test.ts",".friday/health/new-phases-spec.md §6 'DeepSeek api path'","none — verified via FULL gauntlet 27 phases PASS"
2,"DeepSeek closed-loop","Capability probe max_tokens 16 → 256","DONE","blocker","Reasoning-first models (deepseek-v4-pro/flash) burned full 16-token budget on reasoning_content, returning empty content; capability doctor marked text probe failed.","Probe sent {messages:[...]} with max_tokens:16; response had reasoning_content + empty content + finish_reason=length; OK marker never produced.","Bumped openai-completions probe to max_tokens:256 and openai-responses probe to max_output_tokens:256 in src/providers/services/friday-provider-service.ts","PR #162 (merged 7e0e49df)","provider_profiles table runtimeCapabilities entries; .friday/health/gauntlet-fast.log","src/providers/services/friday-provider-service.ts L1546-1623",".friday/health/new-phases-spec.md §6 'capability probe budget'","none"
3,"DeepSeek closed-loop","Gauntlet provider verification step (validate + capability doctor)","DONE","blocker","Auto-detect persisted providers with validation.status=never AND empty runtimeCapabilities; chat phases SKIPped with 'no provider/model route satisfies required capabilities: text'.","verifyAutoDetectedProviders only validated auth (sets status=ok) but capability filter required runtimeCapabilities[capability=text].status=verified — second call needed.","Added verifyAutoDetectedProviders helper in tests-overnight/lib/util.mjs that calls /v1/providers/:id/validate then POST /v1/capabilities/doctor; gauntlet.mjs invokes after login.","PR #162 (merged 7e0e49df)","wave run logs show 'verify OK kind=deepseek' + 'capability doctor ran probes=N verified=...'","tests-overnight/lib/util.mjs; tests-overnight/gauntlet.mjs",".friday/health/new-phases-spec.md §6 'gauntlet provider verification'","none"
4,"DeepSeek closed-loop","Gauntlet STATE_DIR reset on orchestrator start","DONE","blocker","Stale OpenAI/Anthropic providers from earlier runs survived in friday.db across runs and shadowed DeepSeek-only auto-detect.","Phase R created 30+ rl-probe openai providers; FULL run 1 saw verify=[] empty + chat phases SKIP.","Added rmSync(STATE_DIR) + mkdirSync(STATE_DIR) right before bootFriday() in tests-overnight/gauntlet.mjs.","PR #162 (merged 7e0e49df)","sqlite3 /tmp/friday-overnight-test/state/friday.db on each run shows clean baseline","tests-overnight/gauntlet.mjs L50-56",".friday/health/new-phases-spec.md §6 'gauntlet state reset'","none"
5,"DeepSeek closed-loop","Gauntlet login bootstrap unconditional","DONE","blocker","login() helper used /v1/auth/bootstrap/status to gate the bootstrap call; in dev mode bootstrapRequired returns false even when local user has no password_hash, so login then 401'd with NO_PASSWORD_CONFIGURED on a freshly reset state-dir.","FAST gauntlet attempt #3 failed at login.","Removed status gate; always POST /v1/auth/bootstrap/local-passphrase and treat 409 (AUTH_BOOTSTRAP_ALREADY_DONE) as benign. Same fix in tests-overnight/phases/phase-X-ui-wizard.mjs.","PR #162 (merged 7e0e49df)","wave run logs show successful login on fresh state","tests-overnight/lib/util.mjs login(); tests-overnight/phases/phase-X-ui-wizard.mjs",".friday/health/new-phases-spec.md §6 'gauntlet bootstrap fix'","none"
6,"DeepSeek closed-loop","DeepSeek live verification (FAST gauntlet env-only key)","DONE","blocker","Run a fresh FAST gauntlet with only DEEPSEEK_API_KEY env-injected, expect every chat-dependent phase to PASS using real DeepSeek calls.","none","FAST gauntlet run #7 STABILITY GAUNTLET COMPLETE; B/E/F/H/I/P all PASS via DeepSeek.","PR #162 (merged 7e0e49df)",".friday/health/gauntlet-fast.log; /tmp/friday-overnight-test/markers/*.complete.json","tests-overnight/gauntlet.mjs",".friday/health/new-phases-spec.md §6","none"
7,"DeepSeek closed-loop","Phase M skill import recheck after fixture change","DONE","P0","Phase M fixture had been migrated from manifest.json/node to skill.manifest.json/shell after the last completed gauntlet run.","none","Phase M PASS in both FAST and FULL gauntlet runs.","PR #162 verification","/tmp/friday-overnight-test/markers/M.complete.json","tests-overnight/phases/phase-M-skill-import-rollback.mjs",".friday/health/new-phases-spec.md §1 Phase M","none"
8,"DeepSeek closed-loop","FULL overnight gauntlet (non-FAST) end-to-end","DONE","blocker","Fast mode skips/compresses some checks; FULL is the stability proof.","none","FULL gauntlet ran 4h35m; STABILITY GAUNTLET COMPLETE; ALL 27 phases A..X PASS; evidenceSha256 2b19749e38edaa819eb9fb5379873a580baf9a5582621bff1f5b372cf101d15f; monitorsOk processOk+dbOk true; missingPhases none.","PR #162 verification",".friday/health/gauntlet-full.log; /tmp/friday-overnight-test/STABILITY-FINDINGS-OVERNIGHT.md; /tmp/friday-overnight-test/markers/","tests-overnight/gauntlet.mjs","Top of full gauntlet log","none"
9,"DeepSeek closed-loop","Phase W token expiry real wait (PHASE_W_WAIT_S=3650s)","DONE","P0","FAST mode marks Phase W SKIP because TTL hardcoded 3600s requires >=3601s wait.","none","Phase W PASS in FULL with 3650s real wait; original token returned 401, refresh issued new token, /v1/auth/me 200 with new token.","PR #162 FULL gauntlet","/tmp/friday-overnight-test/markers/W.complete.json","tests-overnight/phases/phase-W-token-expiry.mjs",".friday/health/new-phases-spec.md §1 Phase W","none"
10,"DeepSeek closed-loop","Rate-limit full accuracy (Phase R full timing)","DONE","P0","FAST_MODE skips 70s reset waits between policies.","none","Phase R PASS in FULL gauntlet across 22 rate-limit policies with full reset cycles.","PR #162 FULL gauntlet","/tmp/friday-overnight-test/markers/R.complete.json","tests-overnight/phases/phase-R-rate-limit.mjs",".friday/health/new-phases-spec.md §1 Phase R","none"
11,"DeepSeek closed-loop","DeepSeek unit + integration test coverage","DONE","P0","DeepSeek support added during stop condition with build/gauntlet syntax checks only — no dedicated unit tests.","none","Added: provider-templates.test.ts assertion for V4 defaults + api=openai-completions; bootstrap-integration.test.ts cases for DEEPSEEK_API_KEY and FRIDAY_DEEPSEEK_API_KEY auto-detect; absent-env negative case; auto-detect helper updated to scrub deepseek env vars.","PR #162 (merged 7e0e49df)","pnpm test passing 10638","test/unit/providers/model/friday-provider-templates.test.ts; test/integration/hub/friday-hub-bootstrap-integration.test.ts; test/_helpers/auto-detect-provider-env.ts",".friday/health/new-phases-spec.md §1","none"
12,"DeepSeek closed-loop","pnpm-lock.yaml sync","DONE","P0","43bf0eb modified package.json with 7 new deps and a postcss bump but did not update pnpm-lock.yaml; pnpm install --frozen-lockfile failed in fresh worktrees.","ERR_PNPM_OUTDATED_LOCKFILE on first checkout.","Ran pnpm install (without --frozen-lockfile) to regenerate lockfile; committed.","PR #162 (merged 7e0e49df)","pnpm-lock.yaml diff","pnpm-lock.yaml","NA","none"
13,"DeepSeek closed-loop","43bf0eb CI: WebSocket T1/T2 outer it() timeout 10s → 30s","DONE","P0","T2 in test/e2e/friday-full-e2e.test.ts often timed out at 10s in CI runners.","Inner setTimeout 10s + vitest default testTimeout 10s race; vitest aborted before inner timer could resolve.","Added explicit 30s vitest budget to T1 and T2 it() calls.","PR #162 commit 1c516330 (squashed into 7e0e49df)","CI logs from earlier failing main runs","test/e2e/friday-full-e2e.test.ts L2010 + L2073","NA","superseded by tasks 17 and 18"
14,"DeepSeek closed-loop","43bf0eb CI: secrets baseline pragma allowlist","DONE","P0","detect-secrets-hook flagged tests-overnight/phases/phase-P-provider-fallback.mjs and phase-R-rate-limit.mjs fake apiKeys.","CI secrets check failed with 'New secret detected'.","Added inline `// pragma: allowlist secret` comments to both lines; both keys are intentionally invalid test fixtures.","PR #162 commit 1c516330 (squashed into 7e0e49df)","CI secrets job log","tests-overnight/phases/phase-P-provider-fallback.mjs L21; tests-overnight/phases/phase-R-rate-limit.mjs L27","NA","none"
15,"DeepSeek closed-loop","OVERNIGHT-TASK-SUMMARY.csv (root) updated to mark 6 prior TODOs DONE","DONE","P0","Original TODOs 13-18 in <repo>/OVERNIGHT-TASK-SUMMARY.csv needed to be marked DONE, and 6 new DONE rows needed to be added for the closure work.","none","CSV now has 23 rows (1 header + 22 data), all DONE.","PR #162 commit 3030a9c9 (squashed into 7e0e49df)","OVERNIGHT-TASK-SUMMARY.csv at HEAD","OVERNIGHT-TASK-SUMMARY.csv","NA","none"
16,"Wave 1 Foundation","tests-overnight/lib/helpers.mjs — 8 cross-cutting helpers","DONE","P0","Need helpers shared across new phases: provisionTenant (HH), injectError (CC), snapshotStateDir (LL), bootSecondary/killSecondary (LL/KK), runMetaForRun (Y/FF), waitForGenerator (Y), createFakeFailProvider (P/MM), waitForMemoryExport (BB), listVerifiedProviders (dual-key).","none","260-line file with 8 helpers + 1 internal dirSizeBytes; all syntax-checked.","PR #163 (merged eb756e16)","tests-overnight/lib/helpers.mjs at HEAD","tests-overnight/lib/helpers.mjs",".friday/health/new-phases-spec.md §4","BUG O-3 (Y/Z/AA): waitForGenerator status-match list missing 'ready_for_review' (DISCOVERED Wave 2 FAST run, fix in progress)"
17,"Wave 1 Foundation","16 stub phase files Y/Z/AA/BB/CC/DD/EE/FF/GG/HH/II/JJ/KK/LL/MM/NN","DONE","P0","Each new phase needs a marker file so orchestrator's expectedPhases() gate enforces a marker per id, even before real impl arrives in later waves.","none","Each stub file uses a common template that returns SKIP 'phase X not yet implemented (Wave 1 stub)' with a low-severity anomaly note.","PR #163 (merged eb756e16)","16 marker JSON files in /tmp/friday-overnight-test/markers/{Y,Z,AA,BB,CC,DD,EE,FF,GG,HH,II,JJ,KK,LL,MM,NN}.complete.json","tests-overnight/phases/phase-{Y..NN}-*.mjs",".friday/health/new-phases-spec.md §2","Replace each stub with real impl in Waves 2-5"
18,"Wave 1 Foundation","tests-overnight/gauntlet.mjs scheduling expansion","DONE","P0","Gauntlet must import + schedule 16 new phases per spec §3 timetable.","none","Added 16 imports; expectedPhases() now lists 27+16=43 ids (FAST_MODE still skips J); Y/Z/AA piggy-back on the parallel cluster; BB/CC/DD/EE/FF/HH run as a parallel block after V; GG/II/JJ/KK/LL/MM/NN serial after D3.","PR #163 (merged eb756e16)","Wave 1 FAST gauntlet log shows 42/42 markers","tests-overnight/gauntlet.mjs",".friday/health/new-phases-spec.md §3","none"
19,"Wave 1 Foundation","WebSocket T2 inner setTimeout 10s → 25s","DONE","P0","After PR #162 outer 30s, inner 10s still raced and won most of the time on slow CI.","Inner timer fired at exactly 10s; failed with 'WebSocket test timed out'.","Bumped inner to 25s with comment explaining 5s margin to outer 30s.","PR #163 commit 7b80f879 (squashed into eb756e16)","Wave 1 PR #163 CI run 24976470056 vs 24976836590","test/e2e/friday-full-e2e.test.ts T1+T2 inner setTimeout","NA","superseded by task 20"
20,"Wave 1 Foundation","WebSocket T2 protocol fix (auth → hello, auth_ok → hello_ack, accept subscribed)","DONE","P0","Even with 25s inner timeout, T2 timed out at exactly 25012ms in CI; the test sent {type:'auth'} but Friday's realtime gateway only handles {type:'hello'} → {type:'hello_ack'}; subscribe was never reached because auth never matched.","T2 always timed out because protocol mismatch; never observed a hello_ack-typed reply.","Rewrote T2 to send {type:'hello',token}, accept hello_ack as auth confirmation, send discrete subscribe frame, accept subscribed as ack; resolve gracefully on error/close.","PR #163 commit 3d68f33a (squashed into eb756e16)","Local FRIDAY_E2E_CORE=1 vitest 'T2: WebSocket subscribe' 1 pass 225ms; CI 10/10 green","test/e2e/friday-full-e2e.test.ts T2 (~L2012-2080)","NA","none"
21,"Wave 1 Foundation","Wave 1 PR #163 merged into main after 10/10 CI green","DONE","blocker","User explicitly required a real green CI status before merge; the first PR, #162, merged before CI completed (a race), which was a bad pattern.","User course-corrected: 'a real green CI status is required before auto merge'.","Set up Monitor poll on CI; only merged after all 10 checks (build, test, migrations, security, contracts, ssd-lint, alignment-guard, agent-parity, secrets, quality-gate) showed pass.","Merge commit eb756e16 at 2026-04-27T05:24:55Z","https://github.com/thesongzhu/Friday/pull/163","NA","NA","none"
22,"Wave 2 P0 (in progress)","Phase Y skill auto-acquisition full closed loop","FAILED","P0","End-to-end probe: Friday encounters a goal it cannot solve → generator produces draft → import → re-select → survive restart → self-evolve. Per spec §2 Phase Y.","Wave 2 FAST gauntlet step 2: 'generator did not become ready within 5 min'. Generator session reached status 'ready_for_review' (with goal accepted, draftSkillId='sha256-generator', specSummary populated) but my waitForGenerator polled for status='ready' or mode='draft'/'ready' — actual ready status is 'ready_for_review'.","IN PROGRESS: helpers.mjs waitForGenerator updated to also accept 'ready_for_review' and 'approved' statuses (uncommitted edit on main worktree branch claude/wave-1-foundation; needs to land in wave-2 worktree too). Phase Y deriveImportBodyFromGenerator also untested against real draft package shape.","branch claude/wave-2-p0 (uncommitted; helpers.mjs edit on main worktree branch claude/wave-1-foundation NOT YET PUSHED)","/tmp/friday-overnight-test/evidence/Y/step2-generator-ready.json (lastBody.data.session.status='ready_for_review'); .friday/health/wave2-fast.log","Friday-wave2/tests-overnight/phases/phase-Y-skill-auto-acquisition.mjs (250 lines); Friday-wave2/tests-overnight/lib/helpers.mjs",".friday/health/new-phases-spec.md §2 Phase Y","Update Friday-wave2 helpers.mjs to accept ready_for_review; verify deriveImportBodyFromGenerator matches actual draft package response shape; re-run FAST"
23,"Wave 2 P0 (in progress)","Phase Z adjustment fidelity / no-drift","FAILED","P0","After PATCH a managed skill or workflow node config, the next run reflects the change AND survives a wait window AND audit_logs records it.","Wave 2 FAST gauntlet step 0: 'workflow create did not return id' — POST /v1/workflows returned 400 VALIDATION_ERROR 'slug is required and must be a non-empty string'. My buildWorkflowDefinition omitted slug.","TODO: add slug field to buildWorkflowDefinition (Friday-wave2/tests-overnight/phases/phase-Z-adjust-fidelity.mjs) — 1 line change.","branch claude/wave-2-p0 (uncommitted)","/tmp/friday-overnight-test/evidence/Z/setup-workflow-create.json (status:400 code:VALIDATION_ERROR)","Friday-wave2/tests-overnight/phases/phase-Z-adjust-fidelity.mjs buildWorkflowDefinition()",".friday/health/new-phases-spec.md §2 Phase Z","Add slug + verify FAST"
24,"Wave 2 P0 (in progress)","Phase AA self-upgrade end-to-end","FAILED","P0","Skill v1.0.1 imported and serving (output.upgradedAt present, proving v1.0.1 IS running) but my fetchSkill returned version='unknown' because path data.skill.manifest.version is wrong shape.","Wave 2 FAST gauntlet step 3: 'version=unknown (expected 1.0.1), newField=true'. Skill /v1/skills/:id GET handler returns {skill: FridaySkillLifecycleDetail} where version is exposed via different field paths than I assumed.","TODO: investigate FridaySkillLifecycleDetail shape (src/skills/services/friday-skill-lifecycle-service.ts L208) and fix fetchSkill to use the correct path (likely catalogEntry.manifest.version or versions[0].version or installedVersion).","branch claude/wave-2-p0 (uncommitted)","/tmp/friday-overnight-test/evidence/AA/step3-skill-after-import.json (upgradedVersion:'unknown'); /tmp/friday-overnight-test/evidence/AA/step3-run-v101.json (output.upgradedAt='2026-04-27T05:38:50Z' confirms v1.0.1 IS the active route)","Friday-wave2/tests-overnight/phases/phase-AA-self-upgrade.mjs fetchSkill()",".friday/health/new-phases-spec.md §2 Phase AA","Find correct version path in skill response; fix + verify FAST"
25,"Wave 3 P1 (pending)","Phase BB long-term memory (1000 facts × 5 sessions × restart)","PENDING","P1","Insert 200 facts across 5 sessions, recall by id (50 sample) and by query (20 semantic), restart Friday, re-recall, sleep 30 min + insert 100 more, snapshot row counts.","NA","NA","NA","NA","spec only",".friday/health/new-phases-spec.md §2 Phase BB","Implement after Wave 2 P0 lands"
26,"Wave 3 P1 (pending)","Phase CC long-term self-heal multi-cycle","PENDING","P1","Inject 6 distinct error patterns × 3 each, snapshot learned_lessons + auto_fix_actions counts, assert >=6 each with at least one executed; restart and re-trigger one pattern; sleep 30 min and inject brand-new pattern.","NA","NA","NA","NA","Friday-wave2/tests-overnight/lib/helpers.mjs injectError() + 6 prompts",".friday/health/new-phases-spec.md §2 Phase CC","Implement"
27,"Wave 3 P1 (pending)","Phase DD workflow long-term + audit + security","PENDING","P1","Multi-node workflow with approval+branching runs 60 min at 1-min cadence; sample 10 runs; assert audit_logs covers transitions; bypass approval and assert 403 + audit; publish v2 and verify next run picks up v2.","NA","NA","NA","NA","NA",".friday/health/new-phases-spec.md §2 Phase DD","Implement"
28,"Wave 4 P2 (pending)","Phase EE context compression","PENDING","P2","Embed deterministic fact at turn 0, pad with ~100 dummy turns, ask 'what ID did I tell you?', assert reply contains the fact AND at least one compressed=true event was emitted.","NA","NA","NA","NA","NA",".friday/health/new-phases-spec.md §2 Phase EE","Implement"
29,"Wave 4 P2 (pending)","Phase FF cost / token accuracy","PENDING","P2","Run 30 chat turns; for each fetch runMeta.costUsd + actualModel + usage; compute expected cost from catalog pricing; assert per-turn diff < $0.0001 and cumulative diff < $0.001.","NA","NA","NA","NA","Friday-wave2/tests-overnight/lib/helpers.mjs runMetaForRun()",".friday/health/new-phases-spec.md §2 Phase FF","Implement"
30,"Wave 4 P2 (pending)","Phase GG cron precision under load","PENDING","P2","Per-minute cron job heartbeat to file; fire 50 concurrent chat sessions for 30 min in parallel; count tick lines; max drift < 30s; tick count between 28-32.","NA","NA","NA","NA","NA",".friday/health/new-phases-spec.md §2 Phase GG","Implement"
31,"Wave 4 P2 (pending)","Phase HH multi-tenant isolation","PENDING","P2","Friday is single-admin design; spec adjusted to multi-tenant via /v1/security/tenants. Provision two tenants; cross-write secrets; assert no leakage; assert 403 on cross-tenant reads + audit_logs entries.","NA","NA","NA","NA","Friday-wave2/tests-overnight/lib/helpers.mjs provisionTenant()",".friday/health/new-phases-spec.md §2 Phase HH + §6 'HH multi-user → multi-tenant pivot'","Implement"
32,"Wave 5 P3 (pending)","Phase II cross-channel handoff","PENDING","P3","Webchat conversation continues into LINE/email/discord with the same userId; preferences learned in webchat reflected in other channel reply. SKIP cleanly per channel without credentials. User authorized SKIP-only is acceptable.","NA","NA","NA","NA","NA",".friday/health/new-phases-spec.md §2 Phase II","Implement"
33,"Wave 5 P3 (pending)","Phase JJ voice loop (TTS + STT round-trip + voice action)","PENDING","P3","Friday has no /v1/voice/* routes; spec adjusted to call OpenAI tts-1 + whisper-1 directly via the configured OpenAI provider. SKIP if no OpenAI provider verified.","NA","NA","NA","NA","NA",".friday/health/new-phases-spec.md §2 Phase JJ + §6 'JJ voice'","Implement"
34,"Wave 5 P3 (pending)","Phase KK MCP restart resilience","PENDING","P3","Mock MCP at boot; run a tool; restart Friday; re-run same tool; mid-restart fire a tool call and assert MCP_DISCONNECTED 5xx within 5s (no hang).","NA","NA","NA","NA","NA",".friday/health/new-phases-spec.md §2 Phase KK","Implement"
35,"Wave 5 P3 (pending)","Phase LL backup / restore","PENDING","P3","rsync STATE_DIR to backup-dir with WAL checkpoint; boot a secondary Friday on different port from backup-dir; compare row counts and 10 sample queries.","NA","NA","NA","NA","Friday-wave2/tests-overnight/lib/helpers.mjs snapshotStateDir() + bootSecondary()",".friday/health/new-phases-spec.md §2 Phase LL","Implement"
36,"Wave 5 P3 (pending)","Phase MM provider failover under sustained load","PENDING","P3","60 concurrent sessions for 20 min; primary verified provider + fallback always-failing; mid-run swap; error rate must stay <1%.","NA","NA","NA","NA","Friday-wave2/tests-overnight/lib/helpers.mjs createFakeFailProvider() (already supports 401 and 500 variants)",".friday/health/new-phases-spec.md §2 Phase MM + §6 'MM always-failing fallback'","Implement"
37,"Wave 5 P3 (pending)","Phase NN permission propagation","PENDING","P3","Grant user A memory.read; verify reads ok writes blocked; revoke read add write; sleep 5s; verify writes ok reads blocked + audit_logs has change; restart Friday; verify new grant set still in effect.","NA","NA","NA","NA","NA",".friday/health/new-phases-spec.md §2 Phase NN","Implement"
38,"Wave 6 (pending)","Final mega 40-phase FULL gauntlet end-to-end","PENDING","blocker","All 40 phases (27 existing + 13 newly real-implemented Y/Z/AA/BB/CC/DD/EE/FF/GG/HH/II/KK/LL/MM/NN — JJ may SKIP without TTS) under FULL mode (Phase B 3h, Phase W 1h, Phase R full timing). ETA ~6-8h wall.","NA","NA","NA","NA","tests-overnight/gauntlet.mjs","NA","Run after Wave 5 P3 merges"
39,"Open issue O-1","Phase V SSRF FAILs (not SKIPs) when ZERO providers configured","OPEN","low","With no providers at all (no key in env), agent reply is 'NO MODEL ROUTING CONFIGURED. REGISTER A PROVIDER…' which doesn't match isProviderPreconditionFailure regex set, so V FAILs instead of SKIPping.","Observed during Wave 1 no-key smoke gauntlet (commit-less).","NOT FIXED. Mechanical 1-line addition to tests-overnight/lib/util.mjs isProviderPreconditionFailure: add /no model routing configured/i to the regex set. Falls under contract A=b. Plan: add when Wave 4 touches util.mjs.","NA","Wave 1 no-key smoke run; .friday/health/wave1-fast-nokey.log","tests-overnight/lib/util.mjs isProviderPreconditionFailure",".friday/health/open-issues.md O-1","Add regex when next touching util.mjs"
40,"Open issue O-2 (RESOLVED)","Phase P SKIP under dual-provider verified setup","RESOLVED","medium","Initially appeared as a regression: with both DEEPSEEK_API_KEY+OPENAI_API_KEY in env, Phase P picked DeepSeek as 'real' provider, fake-fail kind=deepseek, fallback to real DeepSeek; chat returned 'temporary connection issue'.","Discovered Wave 1 FAST run #1; rationalized as cooldown bucket sharing — was wrong.","ACTUAL ROOT CAUSE: dist/providers/model/friday-provider-catalog.js was stale (still had openai-responses for DeepSeek); after rebuilding dist (pnpm run build:api), capability doctor properly verified deepseek:text and Phase P PASSed with reply '99' as expected.","NA","Compare /tmp/friday-overnight-test/markers/P.complete.json across runs; 'deepseek text capability probe failed with HTTP 404' was the smoking gun.","Always rebuild dist before running gauntlet against new src changes",".friday/health/open-issues.md O-2","Lesson learned: every src change → pnpm run build:api"
41,"Open issue O-3","managed-skills/stab-skill leftover artifact","OPEN","cosmetic","Phase M imports stab-skill into managed-skills/stab-skill/ (in repo root) every gauntlet run. Untracked but persists across runs.","NA","Add managed-skills/stab-skill/ to .gitignore in any later Wave PR.","NA","ls managed-skills/ shows stab-skill alongside intentional repo skills","NA",".friday/health/open-issues.md O-3","Add to .gitignore"
42,"Open issue O-4","launchctl Friday on port 3141 may confuse human debugging","OPEN","cosmetic","com.friday.hub launchd agent runs separately on port 3141 with its own state-dir; gauntlet uses 3144/3145 so no interference, but log debugging can confuse the two.","NA","Not fixing; only matters for human debugging.","NA","launchctl list | grep friday","NA",".friday/health/open-issues.md O-4","No action"
43,"Open issue (NEW)","Old OPENAI_API_KEY in ~/.zshrc was invalid (ends in j28A)","RESOLVED","medium","Phase P log earlier showed 401 from OpenAI key 'sk-[REDACTED]' — this was the user's stale key in ~/.zshrc line 1, not the new one shared in chat (sk-[REDACTED]).","Diagnosed via Friday server log line 'Incorrect API key provided: sk-[REDACTED]' which masked middle but exposed suffix.","User instructed to either replace ~/.zshrc line 1 or put both keys in .claude/settings.local.json env block. We chose the latter (less restart impact); both keys now in env per Claude Code session inheritance.","NA",".claude/settings.local.json env block","~/.zshrc line 1 (old key still there); .claude/settings.local.json env block (new keys)","NA","User decides whether to clean up ~/.zshrc old key"
44,"Open issue (NEW)","main branch protection missing required CI checks","RESOLVED","high","Without branch protection, gh pr merge --auto merged PRs immediately even if CI hadn't completed (PR #162 merged while build was IN_PROGRESS, getting lucky).","User noted: 'a real green CI status is required before auto merge'.","User manually applied the branch protection rule (10 required status checks: build, test, migrations, security, contracts, ssd-lint, alignment-guard, agent-parity, secrets, quality-gate). Verified via gh api /repos/thesongzhu/Friday/branches/main/protection.","NA","gh api response listing the 10 required checks","NA","NA","Maintained going forward"
45,"Open issue (NEW)","WebSocket T2 protocol mismatch is pre-existing on main since 43bf0eb","RESOLVED","medium","T2 sent {type:'auth'} but Friday's gateway only handles {type:'hello'}. Made test always wait full timeout. Was hidden because outer 30s + inner 10s race usually completed before vitest abort.","Discovered via PR #163 first CI run showing test FAIL at 25012ms exactly.","Fixed in PR #163 commit 3d68f33a (squashed into eb756e16) — see task 20.","commit 3d68f33a (squashed into eb756e16)","CI logs PR #163 run 24976470056","test/e2e/friday-full-e2e.test.ts T2","NA","none"
46,"Process learning","Always rebuild dist after src/ changes","RESOLVED","high","Wave 1 FAST gauntlet first run with src changes used STALE dist; deepseek text capability probe 404'd because dist still had openai-responses. Caused Phase P false SKIP and Phase V false FAIL.","Discovered when querying provider runtimeCapabilities via sqlite3 and finding 'Capability probe failed with HTTP 404'.","Process change: every src change → pnpm run build:api before any gauntlet test. Operating contract addendum.","NA","comparison of failed run vs after-rebuild run","NA",".friday/health/operating-contract.md","Always run build:api"
47,"Process learning","Always read API contracts before writing test phase code","IN_PROGRESS","high","Wave 2 P0 phases (Y/Z/AA) all FAILED on first FAST run because I assumed API response shapes without reading the actual code. Y assumed status='ready' (actual 'ready_for_review'); Z omitted required slug; AA assumed wrong skill version path.","Three independent bugs in three new phase files, each from same root cause: writing without verifying.","Operating contract addendum: before any new phase, read the actual route handler + return type to verify expected response shape. Currently I'm fixing Y/Z/AA in branch claude/wave-2-p0.","NA","Wave 2 FAST run /tmp/friday-overnight-test/evidence/{Y,Z,AA}/step*.json","NA",".friday/health/operating-contract.md (to be added)","Apply read-first discipline going forward"
48,"Documents","Wave 2-5 phase spec (single source of truth)","DONE","P0","Detailed spec for all 40 phases (27 existing + 15 new) with pass/fail criteria, anomaly classifications, FAST_MODE behavior, scheduling notes, and §6 endpoint adjustments resolved during recon (cron, embeddings, voice, MM fake-fail, HH multi-tenant pivot, Y self-evolve via re-generation).","NA","NA","NA",".friday/health/new-phases-spec.md (1500+ lines)",".friday/health/new-phases-spec.md","NA","Will become docs/specs/overnight-gauntlet.md when first new phase lands"
49,"Documents","Operating contract (autonomous mode rules)","DONE","P0","User-defined STOP triggers, force-push policy, spec ambiguity rules, reporting cadence, secret discipline, etc.","NA","NA","NA",".friday/health/operating-contract.md",".friday/health/operating-contract.md","NA","Update with each Wave's lessons"
50,"Documents","Open issues tracker","DONE","P0","Lightweight scratch list of non-blocking observations during each Wave's verification.","NA","NA","NA",".friday/health/open-issues.md",".friday/health/open-issues.md","NA","Update each Wave"
51,"Documents","OVERNIGHT-TASK-SUMMARY.csv (canonical, in repo)","DONE","P0","Source of truth for original 6 TODOs + the 6 closure items from PR #162.","NA","NA","NA","OVERNIGHT-TASK-SUMMARY.csv at HEAD","OVERNIGHT-TASK-SUMMARY.csv","NA","Will be appended in each Wave's PR"
52,"Build artifact","dist/ rebuild on every src/ change","ONGOING","P0","dist/ is gitignored and built locally; getting out-of-sync caused Wave 1 false-positive Phase V FAIL and false-negative Phase P SKIP.","Stale dist gave wrong DeepSeek api routing.","Always run pnpm run build:api after pulling main or editing src/.","NA","dist/cli/friday-cli.js mtime vs src mtime","dist/","NA","Process discipline"
53,"Cost","Estimated spend so far","TRACKED","P0","User-set hard cap $100. Estimating from token usage (no real-time dashboard).","NA","DeepSeek FAST gauntlets ~5; FULL gauntlet 4h35m; OpenAI capability probes + chat phases. Estimated total so far: $5-15. Within cap.","NA","Token-usage estimate from gauntlet logs","NA","NA","Track per Wave"
54,"Open issue (NEW)","Wave 2 P0 phase Y deriveImportBodyFromGenerator unverified","OPEN","medium","Phase Y step 3 may also fail because deriveImportBodyFromGenerator assumes ready.body.data.session.draftPackage.uri or .bundle.uri; actual generator response may use draftSkillId='sha256-generator' string only (no package uri). Approve route fallback also untested.","Generator session shows draftSkillId='sha256-generator' but no draftPackage/bundle field at the assumed paths.","NOT FIXED. After fixing waitForGenerator status (task 22), need to either inspect actual generator response shape and update path, or add a /v1/skills/generator/sessions/:id/approve call that materializes the package.","branch claude/wave-2-p0 (uncommitted)","/tmp/friday-overnight-test/evidence/Y/step2-generator-ready.json","Friday-wave2/tests-overnight/phases/phase-Y-skill-auto-acquisition.mjs deriveImportBodyFromGenerator()","NA","Read /v1/skills/generator/sessions/:id/approve route handler; map response to import payload"