You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
fix: tier-check branch support and draft/extension scoring (#153)
* fix: pass --branch to tier-check CLI in tier-audit skill
The skill was not forwarding the --branch argument to the tier-check
CLI, causing policy signal checks to always run against the repo's
default branch. Files on feature branches showed as 'Not found.'
Now derives the branch from the local checkout if not explicitly
provided, and always passes it to the CLI.
* fix: exclude draft/extension scenarios from tier-scoring conformance rates
SEP-1730 says date-versioned scenarios count toward tier scoring while
draft and extension scenarios are informational. The CLI was including
all scenarios in pass_rate, causing extension-only failures to block
Tier 1.
Changes:
- pass_rate now only counts scenarios with at least one date-versioned
spec version
- Terminal and markdown output split into a tier-scoring matrix
(date-versioned + All*) and an informational section (draft/extension)
that only renders when there are draft/extension scenarios
-**local-path**: absolute path to the SDK checkout (e.g. `~/src/mcp/typescript-sdk`)
44
44
-**conformance-server-url**: URL where the SDK's everything server is already running (e.g. `http://localhost:3000/mcp`)
45
45
-**client-cmd** (optional): command to run the SDK's conformance client (e.g. `npx tsx test/conformance/src/everythingClient.ts`). If not provided, client conformance tests are skipped and noted as a gap in the report.
46
+
-**branch** (optional): Git branch to check on GitHub (e.g. `--branch fweinberger/v1x-governance-docs`). If not provided, derive from the local checkout's current branch: `cd <local-path> && git rev-parse --abbrev-ref HEAD`. This is passed to the tier-check CLI so that policy signal file checks use the correct branch instead of the repo's default branch.
46
47
47
48
The first two arguments are required. If either is missing, ask the user to provide it.
48
49
@@ -59,12 +60,13 @@ The `tier-check` CLI handles all deterministic checks — server conformance, cl
If no client-cmd was detected, omit the `--client-cmd` flag (client conformance will be skipped).
69
+
If no client-cmd was detected, omit the `--client-cmd` flag (client conformance will be skipped). The `--branch` flag should always be included (derived from the local checkout if not explicitly provided).
68
70
69
71
The CLI output includes server conformance pass rate, client conformance pass rate (with per-spec-version breakdown), issue triage compliance, P0 resolution times, label taxonomy, stable release status, policy signal files, and spec tracking gap. Parse the JSON output to feed into Step 4.
70
72
@@ -115,8 +117,8 @@ Combine the deterministic scorecard (from the CLI) with the evaluation results (
115
117
116
118
### Tier 1 requires ALL of:
117
119
118
-
- Server conformance test pass rate == 100%
119
-
- Client conformance test pass rate == 100%
120
+
- Server conformance test pass rate == 100% (date-versioned scenarios only; `draft` and `extension` are informational and not scored)
121
+
- Client conformance test pass rate == 100% (date-versioned scenarios only; `draft` and `extension` are informational and not scored)
120
122
- Issue triage compliance >= 90% within 2 business days
121
123
- All P0 bugs resolved within 7 days
122
124
- Stable release >= 1.0.0 with no pre-release suffix
@@ -127,8 +129,8 @@ Combine the deterministic scorecard (from the CLI) with the evaluation results (
127
129
128
130
### Tier 2 requires ALL of:
129
131
130
-
- Server conformance test pass rate >= 80%
131
-
- Client conformance test pass rate >= 80%
132
+
- Server conformance test pass rate >= 80% (date-versioned scenarios only)
The tier-scoring table only includes date-versioned scenarios. `draft` and `extension` scenarios are shown separately as informational — they do not affect tier advancement.
159
169
160
170
This immediately shows where failures concentrate. Failures clustered in Client: Auth / `2025-11-25` means "new auth features not yet implemented" — a scope gap, not a quality problem. Failures in Server or Client: Core are more concerning.
161
171
@@ -205,15 +215,21 @@ After the subagents finish, output a short executive summary directly to the use
0 commit comments