Commit 4382dd3

Merge pull request #59 from m0n0x41d/haft/6.1.0
Haft/6.1.0
2 parents 19faffa + 3e8f55b commit 4382dd3

64 files changed

Lines changed: 5711 additions & 355 deletions

.github/workflows/ci.yml

Lines changed: 6 additions & 0 deletions
@@ -87,6 +87,12 @@ jobs:
             exit 1
           fi
 
+      - name: Sync tracked governance artifacts
+        run: ./haft sync
+
+      - name: Check governance debt
+        run: ./haft check
+
       - name: Upload coverage
         uses: codecov/codecov-action@v5
         with:

CHANGELOG.md

Lines changed: 28 additions & 1 deletion
@@ -4,7 +4,34 @@ All notable changes to this project will be documented in this file.
 
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
 
-## [Unreleased]
+## [6.1.0] — 2026-04-14
+
+### Added
+
+- **`haft check` CLI command** — CI-friendly governance verification. Runs stale scan, drift scan, unassessed decisions, coverage gaps. Exit 0 = clean, exit 1 = findings. `--json` flag for structured output.
+- **Full governance state in `/h-verify`** — scan now surfaces pending problems (backlog/in-progress count), addressed problems without linked decisions, and invariant violations from knowledge graph. Single entry point for "what needs attention."
+- **`.haft/workflow.md` support** — hybrid markdown+YAML project policy file. Parsed at serve/agent startup. Intent + Defaults injected into agent prompts. `haft init` creates commented example.
+- **Problem typing on ProblemCard** — `problem_type` field: optimization, diagnosis, search, synthesis. Accepted on frame, stored in DB, shown in `/h-status` and `/h-problems`.
+- **Derived decision health model** — replaces single "phase" with two independent axes: Maturity (Unassessed / Pending / Shipped) and Freshness (Healthy / Stale / AT RISK). Freshness evaluated only for Shipped decisions. Never stored — computed at query time.
+- **Claim-scoped evidence supersession** — new measurement supersedes only previous measurements for the same `(claim_ref, observable)`, not all measurements on the decision. Prevents unrelated evidence from being retired.
+- **Claim-scoped R_eff** — `R_eff(decision) = min(R_eff(claim_i))` where each claim's R_eff is computed from its own evidence. More precise than decision-level aggregation.
+- **F_eff / G_eff decomposition** — Formality (F0–F3) and Groundedness (CL-derived) exposed as view concerns alongside R_eff for evidence diagnosis.
+- **Deep onboard for legacy projects** — `/h-onboard` now runs module coverage analysis and deep scans blind modules: reads code, identifies responsibilities, invariants, implicit decisions, risks. Supports parallel subagent execution when available.
+
+### Changed
+
+- **"No evidence = Unassessed"** — decisions without evidence are shown separately from healthy decisions, not treated as fresh. UI surfaces coverage gaps.
+- **Verdict vocabulary normalized** — measurement result aliases (`accepted`/`partial`/`failed`) mapped to canonical evidence verdicts (`supports`/`weakens`/`refutes`) at storage boundary.
+- **CL0 + supports = inadmissible** — evidence from opposed context with verdict `supports` is rejected at ingest, not merely penalized.
+- **G1 enforced: one active decision per problem** — `Decide()` rejects if another active DecisionRecord exists for the same problem_ref.
+- **G2: parity plan warnings** — `haft_solution(action="compare")` in standard/deep mode warns if parity plan is empty or unstructured.
+- **G4: subjective dimension warnings** — compare warns on dimensions like "maintainable", "simple", "scalable" — asks to decompose into measurables or tag as observation-only.
+- **Core boundary enforced** — integration tests verify Core packages (`internal/artifact`, `graph`, `fpf`, `reff`, `codebase`) have zero `desktop/` imports.
+
+### Fixed
+
+- **Desktop: oversized task output tails bounded** — prevents UI freeze on large agent outputs.
+- **Knowledge graph integration tests** — FindDecisionsForFile, FindInvariantsForFile, ComputeImpactSet tested on seeded DB with real project data.
 
 ## [6.0.0] — 2026-04-13
 
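The two claim-scoped changelog entries (supersession keyed on `(claim_ref, observable)` and `R_eff(decision) = min(R_eff(claim_i))`) can be sketched roughly as follows. This is a minimal illustration, not Haft's implementation: the `Measurement` type, its `Score` field, and the rule that a claim's R_eff is its weakest active measurement are all assumptions made for the example.

```go
package main

import "fmt"

// Measurement is a hypothetical evidence record; field names are illustrative.
type Measurement struct {
	ClaimRef   string
	Observable string
	Score      float64 // assumed per-measurement reliability in [0, 1]
	Superseded bool
}

// Record appends m and marks as superseded only those earlier measurements
// that share its (ClaimRef, Observable) pair. Other evidence stays active.
func Record(history []Measurement, m Measurement) []Measurement {
	for i := range history {
		if history[i].ClaimRef == m.ClaimRef && history[i].Observable == m.Observable {
			history[i].Superseded = true
		}
	}
	return append(history, m)
}

// DecisionREff returns the minimum per-claim R_eff. A claim's R_eff is
// assumed here to be its weakest active measurement. ok is false when the
// decision has no active evidence at all.
func DecisionREff(history []Measurement) (reff float64, ok bool) {
	perClaim := map[string]float64{}
	for _, m := range history {
		if m.Superseded {
			continue
		}
		if cur, seen := perClaim[m.ClaimRef]; !seen || m.Score < cur {
			perClaim[m.ClaimRef] = m.Score
		}
	}
	for _, v := range perClaim {
		if !ok || v < reff {
			reff, ok = v, true
		}
	}
	return reff, ok
}

func main() {
	var h []Measurement
	h = Record(h, Measurement{ClaimRef: "latency", Observable: "p99_ms", Score: 0.4})
	h = Record(h, Measurement{ClaimRef: "cost", Observable: "usd_month", Score: 0.8})
	// Re-measuring latency retires only the first latency measurement.
	h = Record(h, Measurement{ClaimRef: "latency", Observable: "p99_ms", Score: 0.9})

	reff, ok := DecisionREff(h)
	fmt.Println(reff, ok)
}
```

With decision-level supersession, the second latency measurement would also have retired the cost evidence; keying on the pair keeps it active, so the decision's R_eff is bounded by the cost claim instead.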

README.md

Lines changed: 66 additions & 3 deletions
@@ -2,18 +2,20 @@
 
 *formerly [quint-code](https://github.com/m0n0x41d/quint-code)*
 
-**Engineering decisions that know when they're stale.**
+**True harness engineering for AI-assisted software delivery.**
 
-Frame problems. Compare options fairly. Record decisions as contracts. Know when to revisit.
+Your agents write code fast. Nobody checks if the decisions behind that code are any good — or still valid a month later. Haft does.
 
 ---
 
 ## What is Haft?
 
-Haft is a local-first engineering governor for software projects. It helps engineers frame problems before solving them, compare options honestly, record decisions as contracts with invariants, track evidence with decay, and know when to revisit.
+Haft is the engineering governor that sits between your intentions and your agents' execution. It enforces the discipline that separates "we shipped fast" from "we shipped right": frame the problem before solving it, compare options under parity, record decisions as falsifiable contracts, and know the moment assumptions go stale.
 
 **Think → Run → Govern.**
 
+Not a coding agent. Not a documentation tool. Not a project manager. The handle between the tool and the hand — the part that turns raw capability into directed engineering work.
+
 ### Two primary surfaces
 
 - **Desktop app** — visual cockpit for reasoning state, agent orchestration, and governance dashboard
@@ -74,6 +76,30 @@ The binary is the same — only the MCP config and command/prompt installation l
 
 Existing project? Run `/h-onboard` after init — the agent scans your codebase for existing decisions worth capturing.
 
+## CI
+
+Use `haft check` anywhere you want a pass/fail signal for governance debt:
+
+```yaml
+# .github/workflows/haft-check.yml
+steps:
+  - uses: actions/checkout@v4
+
+  - name: Install haft
+    run: |
+      curl -fsSL https://raw.githubusercontent.com/m0n0x41d/haft/main/install.sh | bash
+      echo "$HOME/.local/bin" >> "$GITHUB_PATH"
+
+  - name: Check governance debt
+    run: haft check
+```
+
+`haft check` scans stale artifacts, drifted decisions, unassessed decisions, and coverage gaps.
+Exit `0` means the project is clean. Governance findings exit `1`. Command or setup errors also
+fail the job with a non-zero exit code, which keeps CI badges red for both unhealthy and broken states.
+
+Need machine-readable output? Run `haft check --json`.
+
 ---
 
 ## How It Works
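For callers outside GitHub Actions, the exit-code contract in the new CI section could be consumed from Go roughly as below. This is a sketch under assumptions: `haft check` and `--json` come from the README, while `runCheck` and `classifyExit` are illustrative helper names. The README does not say which non-zero code setup errors use, so treating `1` as "findings" may misclassify an error that also exits `1`.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// classifyExit maps an exit code onto the README's contract:
// 0 = clean, 1 = governance findings, anything else = command or setup error.
func classifyExit(code int) string {
	switch code {
	case 0:
		return "clean"
	case 1:
		return "findings"
	default:
		return "error"
	}
}

// runCheck runs the command and returns its exit code.
// A command that could not be started at all is reported as -1.
func runCheck(name string, args ...string) int {
	err := exec.Command(name, args...).Run()
	if err == nil {
		return 0
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		return exitErr.ExitCode()
	}
	return -1
}

func main() {
	// Assumes haft is on PATH; if it is not, this reports "error".
	code := runCheck("haft", "check", "--json")
	fmt.Println(classifyExit(code))
}
```

Keeping "findings" and "error" apart lets a pipeline open a governance ticket for the first and page whoever owns the CI setup for the second, even though both fail the job.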
@@ -147,6 +173,43 @@ Features: dashboard with governance findings, problem board, decision detail wit
 
 ---
 
+## Roadmap
+
+### v6.1 — Harden the Contract (shipped)
+
+Decision quality enforcement before automating execution:
+- `haft check` for CI governance verification
+- `/h-verify` surfaces full governance state (problems, invariants, drift — not just decisions)
+- `.haft/workflow.md` — repo-level agent policy, injected into every prompt
+- Problem typing (optimization / diagnosis / search / synthesis)
+- G1/G2/G4 enforcement: one decision per problem, parity warnings, subjective dimension detection
+- CL0+supports rejection, claim-scoped R_eff and evidence supersession
+- Deep `/h-onboard` with module-by-module analysis for legacy projects
+
+### v6.2 — Dashboard + Execution Primitives (next)
+
+The desktop becomes an operator surface, not just a viewer:
+- **Unified Dashboard** — active decisions, governance findings, automations in one view
+- **Implement** — click a decision, agent spawns in worktree with full reasoning context
+- **Adopt** — governance finding (stale/drifted) → agent thread for interactive resolution
+- **Automation triggers** — CI fail, dependency update, scheduled → auto-create ProblemCards
+- **DDR→Task Pipeline** — Implement generates subtasks from decision, runs sequentially with auto-advance
+- **Deep onboard** — `/h-onboard --deep` generates task plan from coverage gaps
+
+### v7 — Desktop Loop MVP
+
+One proved cycle: **Decision → Implement → Verify → Baseline → PR draft**. If verification fails → reopen as ProblemCard, not straight to PR. Local-first PR output.
+
+### v8 — Governor Signals
+
+Background detection loops (stale, drift, dependencies) with dashboard alerts. Autonomous actuation only after trust is earned through detect-only phase.
+
+### Not on the roadmap
+
+Cloud/SaaS. Mobile app. Slack bot. Browser extension. General personal assistant. Competing with Claude Code on code editing. The product is the engineering governor, not another surface.
+
+---
+
 ## Requirements
 
 - Go 1.25+ (for building from source)

desktop/agents.go

Lines changed: 58 additions & 4 deletions
@@ -26,6 +26,7 @@ const (
 
 const (
 	taskOutputMaxLines      = 500
+	taskOutputMaxChars      = 64000
 	taskOutputFlushInterval = 350 * time.Millisecond
 )
 

@@ -53,8 +54,8 @@ type TaskState struct {
 	StartedAt    string `json:"started_at"`
 	CompletedAt  string `json:"completed_at"`
 	ErrorMessage string `json:"error_message"`
-	Output       string `json:"output"` // bounded output tail
-	AutoRun      bool   `json:"auto_run"` // true = agent runs without pausing
+	Output       string `json:"output"`   // bounded output tail
+	AutoRun      bool   `json:"auto_run"` // true = agent runs without pausing
 }
 
 type TaskOutputEvent struct {
@@ -464,14 +465,30 @@ func (b *taskOutputBuffer) Append(chunk string) string {
 		b.lines = append([]string(nil), b.lines[len(b.lines)-b.maxLines:]...)
 	}
 
-	return b.snapshotLocked()
+	snapshot := b.snapshotLocked()
+	normalized := normalizeTaskOutput(snapshot)
+
+	if normalized != snapshot {
+		b.lines = nil
+		b.partial = normalized
+	}
+
+	return normalized
 }
 
 func (b *taskOutputBuffer) String() string {
 	b.mu.Lock()
 	defer b.mu.Unlock()
 
-	return b.snapshotLocked()
+	snapshot := b.snapshotLocked()
+	normalized := normalizeTaskOutput(snapshot)
+
+	if normalized != snapshot {
+		b.lines = nil
+		b.partial = normalized
+	}
+
+	return normalized
 }
 
 func (b *taskOutputBuffer) snapshotLocked() string {
@@ -487,6 +504,43 @@ func (b *taskOutputBuffer) snapshotLocked() string {
 	return strings.Join(parts, "\n")
 }
 
+func normalizeTaskOutput(output string) string {
+	bounded := trimTaskOutputLines(output, taskOutputMaxLines)
+	bounded = trimTaskOutputRunes(bounded, taskOutputMaxChars)
+	return bounded
+}
+
+func trimTaskOutputLines(output string, maxLines int) string {
+	if output == "" || maxLines <= 0 {
+		return output
+	}
+
+	lines := strings.Split(output, "\n")
+
+	if len(lines) <= maxLines {
+		return output
+	}
+
+	start := len(lines) - maxLines
+	tail := lines[start:]
+	return strings.Join(tail, "\n")
+}
+
+func trimTaskOutputRunes(output string, maxRunes int) string {
+	if output == "" || maxRunes <= 0 {
+		return output
+	}
+
+	runes := []rune(output)
+
+	if len(runes) <= maxRunes {
+		return output
+	}
+
+	start := len(runes) - maxRunes
+	return string(runes[start:])
+}
+
 // --- App binding methods ---
 
 // DetectAgents finds installed coding agents.

desktop/agents_test.go

Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
+package main
+
+import (
+	"fmt"
+	"strings"
+	"testing"
+	"unicode/utf8"
+)
+
+func TestTaskOutputBufferKeepsNewestLongSingleLine(t *testing.T) {
+	buffer := newTaskOutputBuffer(taskOutputMaxLines, "")
+	head := "STARTMARKER"
+	tail := strings.Repeat("tail", 2000) + "ENDMARKER"
+	body := strings.Repeat("H", taskOutputMaxChars)
+	longLine := head + body + tail
+
+	got := buffer.Append(longLine)
+
+	if utf8.RuneCountInString(got) > taskOutputMaxChars {
+		t.Fatalf("expected output <= %d runes, got %d", taskOutputMaxChars, utf8.RuneCountInString(got))
+	}
+
+	if strings.Contains(got, "STARTMARKER") {
+		t.Fatalf("expected oldest prefix marker to be trimmed from output")
+	}
+
+	if !strings.HasSuffix(got, "ENDMARKER") {
+		t.Fatalf("expected newest output tail to be preserved, got suffix %q", got[maxInt(len(got)-32, 0):])
+	}
+}
+
+func TestNormalizeTaskOutputKeepsNewestLines(t *testing.T) {
+	lines := make([]string, 0, taskOutputMaxLines+25)
+
+	for i := range taskOutputMaxLines + 25 {
+		lines = append(lines, fmt.Sprintf("line-%03d", i))
+	}
+
+	output := strings.Join(lines, "\n")
+	got := normalizeTaskOutput(output)
+	gotLines := strings.Split(got, "\n")
+
+	if len(gotLines) != taskOutputMaxLines {
+		t.Fatalf("expected %d lines after normalization, got %d", taskOutputMaxLines, len(gotLines))
+	}
+
+	if gotLines[0] != "line-025" {
+		t.Fatalf("expected first retained line line-025, got %q", gotLines[0])
+	}
+
+	if gotLines[len(gotLines)-1] != "line-524" {
+		t.Fatalf("expected last retained line line-524, got %q", gotLines[len(gotLines)-1])
+	}
+}
+
+func maxInt(a int, b int) int {
+	if a > b {
+		return a
+	}
+
+	return b
+}

desktop/app.go

Lines changed: 21 additions & 9 deletions
@@ -145,17 +145,29 @@ func (a *App) GetDashboard() (*DashboardView, error) {
 	stale, _ := a.store.FindStaleArtifacts(a.ctx)
 	notes, _ := a.store.ListActiveByKind(a.ctx, artifact.KindNote, 50)
 	portfolios, _ := a.store.ListActiveByKind(a.ctx, artifact.KindSolutionPortfolio, 100)
+	statusData, err := artifact.FetchStatusData(a.ctx, a.store, "")
+	if err != nil {
+		return nil, err
+	}
+
+	healthyDecisions := mapArtifacts(statusData.HealthyDecisions, toDecisionView, 8)
+	pendingDecisions := mapArtifacts(statusData.PendingDecisions, toDecisionView, 8)
+	unassessedDecisions := mapArtifacts(statusData.UnassessedDecisions, toDecisionView, 8)
+	recentDecisions := mapArtifacts(decisions, toDecisionView, 8)
 
 	return &DashboardView{
-		ProjectName:     a.projectName,
-		ProblemCount:    len(problems),
-		DecisionCount:   len(decisions),
-		PortfolioCount:  len(portfolios),
-		NoteCount:       len(notes),
-		StaleCount:      len(stale),
-		RecentProblems:  mapArtifacts(problems, toProblemView, 8),
-		RecentDecisions: mapArtifacts(decisions, toDecisionView, 8),
-		StaleItems:      mapArtifacts(stale, toArtifactView, 10),
+		ProjectName:         a.projectName,
+		ProblemCount:        len(problems),
+		DecisionCount:       len(decisions),
+		PortfolioCount:      len(portfolios),
+		NoteCount:           len(notes),
+		StaleCount:          len(stale),
+		RecentProblems:      mapArtifacts(problems, toProblemView, 8),
+		RecentDecisions:     safeDecisionViews(recentDecisions),
+		HealthyDecisions:    safeDecisionViews(healthyDecisions),
+		PendingDecisions:    safeDecisionViews(pendingDecisions),
+		UnassessedDecisions: safeDecisionViews(unassessedDecisions),
+		StaleItems:          mapArtifacts(stale, toArtifactView, 10),
 	}, nil
 }
 
