Skip to content

Commit 0d406d8

Browse files
authored
fix(macos): tri-state duration fields in tray Settings + mandatory QA merge gate (#617)
* fix(macos): tri-state duration fields in tray Settings (spec 074) The native tray Settings 'Tool discovery & health checks' section collapsed the nil<->"" tri-state lossily, producing three bugs: Save always disabled (validation rejected empty optional durations), a hardcoded "2m" placeholder instead of the real defaults, and a false '2 unsaved changes' on first open (valuesEqual("", nil) was false). - validation respects optional (blank optional duration = inherit default) - ConfigField.placeholder; discovery fields show real defaults 30s / 5m - valuesEqual treats nil/NSNull/"" as one blank class - optionalStringBinding/optionalScalarStored: cleared field stored as NSNull -> PATCH sends JSON null (reset-to-default), never an unparseable "" - web parity: same optional fix in fields.ts + blank->null in SettingField.vue - 9 Swift tests + 1 vitest Refs MCP-1214 * ci(qa): mandatory native test + settings-parity + qa-gate merge gate Makes feature verification a required pre-merge check, closing the gap that let MCP-1214 ship (native-only tray bug that Web-UI-only QA never exercised). - native-tests.yml: required checks swift-test (native form logic) and settings-parity (web<->native duration-field catalog drift) - check-settings-parity.py: canary; also fixed a 2nd divergence it found (call_tool_timeout native placeholder) - post-qa-gate-status.sh: SHA-keyed qa-gate commit status the QATester posts - mcpproxy-qa skill: native tray = mandatory surface; interactive cannot_verify is a BLOCK, not a low-risk pass; swift-test fallback when capture is blocked - docs/qa-merge-gate.md: design + ordered branch-protection activation Keeps the gh pr merge --admin escape hatch (enforce_admins stays off). Refs MCP-1214 * ci(qa-gate): make native-tests required-safe (MCP-1219) Remove the top-level paths: filter from native-tests.yml so the workflow runs on every PR, and gate swift-test/settings-parity on a dorny/paths-filter 'changes' job via a job-level if:. On non-native PRs the two jobs are skipped, which reports their required contexts as green — whereas a workflow skipped by a top-level paths: filter produces no status and would block every non-native PR once the contexts are marked required. Job names (swift-test, settings-parity) are unchanged so the required-context names stay stable. Updates docs/qa-merge-gate.md to document the required-safe design and require a non-native-PR verification before adding the contexts to branch protection.
1 parent 5d99f7e commit 0d406d8

10 files changed

Lines changed: 615 additions & 8 deletions

File tree

.github/workflows/native-tests.yml

Lines changed: 103 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,103 @@
1+
# Native macOS tray test gate (MCP-1214).
2+
#
3+
# The macOS tray Settings form is a separate hand-port of the Web UI catalog,
4+
# so web-only QA cannot catch native-only behavior bugs (validation, dirty
5+
# detection, placeholders, bindings). This workflow makes that surface a
6+
# first-class, deterministic merge gate:
7+
#
8+
# - swift-test : runs the native unit tests (GUI-free logic), which
9+
# directly cover the spec-074 duration-field bugs.
10+
# - settings-parity : fails when the web (fields.ts) and native
11+
# (SettingsCatalog.swift) duration-field catalogs drift.
12+
#
13+
# Both job names are stable so they can be added to branch protection as
14+
# required status checks. See docs/qa-merge-gate.md.
15+
#
16+
# REQUIRED-SAFE DESIGN (MCP-1219)
17+
# -------------------------------
18+
# These jobs are meant to be REQUIRED status checks on `main`. A required check
19+
# that never produces a status blocks every PR that lacks it. GitHub produces
20+
# NO status for a workflow skipped by a top-level `paths:` filter — the check
21+
# stays "Expected — Waiting" forever and blocks the PR. So this workflow must
22+
# trigger on EVERY pull request, then decide internally whether the real work
23+
# runs:
24+
#
25+
# - No top-level `paths:` filter -> the workflow always runs.
26+
# - The `changes` job detects whether native / settings files were touched.
27+
# - swift-test / settings-parity are gated with a job-level `if:`. On a PR
28+
# that touches none of those paths they are SKIPPED, and a skipped job
29+
# reports its required context as satisfied (green) — unlike a skipped
30+
# workflow, which reports nothing and blocks.
31+
#
32+
# Net effect: a native PR runs the real tests; a non-native PR (e.g. a deps
33+
# bump) shows both contexts green without spending a macOS runner. Verify this
34+
# on a non-native PR BEFORE adding the contexts to branch protection
35+
# (docs/qa-merge-gate.md "Activation").
36+
name: Native macOS Tests
37+
38+
on:
39+
pull_request:
40+
push:
41+
branches: [main, next]
42+
43+
permissions:
44+
contents: read
45+
46+
jobs:
47+
changes:
48+
name: detect-native-changes
49+
runs-on: ubuntu-latest
50+
timeout-minutes: 5
51+
outputs:
52+
native: ${{ steps.filter.outputs.native }}
53+
steps:
54+
- uses: actions/checkout@v4
55+
- uses: dorny/paths-filter@v3
56+
id: filter
57+
with:
58+
filters: |
59+
native:
60+
- 'native/macos/**'
61+
- 'frontend/src/views/settings/**'
62+
- 'frontend/src/components/settings/**'
63+
- 'scripts/check-settings-parity.py'
64+
- '.github/workflows/native-tests.yml'
65+
66+
swift-test:
67+
name: swift-test
68+
needs: changes
69+
if: needs.changes.outputs.native == 'true'
70+
runs-on: macos-latest
71+
timeout-minutes: 20
72+
steps:
73+
- uses: actions/checkout@v4
74+
- name: Show Swift toolchain
75+
run: swift --version
76+
- name: Run native unit tests
77+
working-directory: native/macos/MCPProxy
78+
# NOTE: the skip-set below are pre-existing/environmental failures
79+
# tracked for repair (AutoStart UserDefaults first-run, SSE-parser
80+
# edge cases, a JSONEncoder behavior canary). They are unrelated to the
81+
# tray form logic. Remove skips as each is fixed so coverage grows.
82+
# Tracked in the native-test-suite-green follow-up.
83+
run: |
84+
swift test \
85+
--skip AutoStartTests \
86+
--skip testDefaultJSONEncoderDropsOptionalNilFromMap \
87+
--skip testSSEEventDecodeInvalidDataThrows \
88+
--skip testFieldWithColonButNoValue \
89+
--skip testFieldWithNoColon
90+
91+
settings-parity:
92+
name: settings-parity
93+
needs: changes
94+
if: needs.changes.outputs.native == 'true'
95+
runs-on: ubuntu-latest
96+
timeout-minutes: 5
97+
steps:
98+
- uses: actions/checkout@v4
99+
- uses: actions/setup-python@v5
100+
with:
101+
python-version: '3.12'
102+
- name: Web <-> native settings catalog parity
103+
run: python3 scripts/check-settings-parity.py

docs/qa-merge-gate.md

Lines changed: 96 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,96 @@
1+
# Mandatory QA merge gate
2+
3+
Makes feature verification a **required, mechanical** check before a PR can
4+
merge to `main` — closing the class of gap that let MCP-1214 ship (a native
5+
macOS tray bug that Web-UI-only QA never exercised).
6+
7+
The `gh pr merge --admin` escape hatch is intentionally retained
8+
(`enforce_admins` stays `false`) for genuine emergencies; normal merges must be
9+
green.
10+
11+
## Required checks
12+
13+
| Check (status context) | Where it runs | Catches |
14+
|---|---|---|
15+
| `swift-test` | `.github/workflows/native-tests.yml` (macOS) | Native tray form logic — validation, dirty detection, tri-state durations. GUI-free, deterministic. |
16+
| `settings-parity` | `.github/workflows/native-tests.yml` (Linux) | Web (`fields.ts`) ↔ native (`SettingsCatalog.swift`) duration-field drift (placeholder / optional). |
17+
| `qa-gate` | Paperclip QATester → `scripts/post-qa-gate-status.sh` | Full feature QA. `success` only when QATester PASSes at the PR's **current head SHA**. |
18+
19+
`swift-test` and `settings-parity` are ordinary GitHub Actions jobs, but
20+
`native-tests.yml` is deliberately **required-safe**: the workflow has no
21+
top-level `paths:` filter, so it runs on *every* PR. A `changes` job
22+
(`dorny/paths-filter`) detects whether native / settings files were touched,
23+
and the two real jobs are gated with a job-level `if:`. On a PR that touches
24+
none of those paths the jobs are **skipped**, and a skipped job reports its
25+
required context as satisfied (green). This matters because GitHub produces
26+
**no status at all** for a workflow skipped by a top-level `paths:` filter — a
27+
required context that never reports stays "Expected — Waiting" and blocks every
28+
non-native PR forever. See the `REQUIRED-SAFE DESIGN` header in
29+
`native-tests.yml`.
30+
31+
`qa-gate` is a **commit status** the QATester posts at the end of its run
32+
(keyed to the head SHA). Because it is SHA-keyed, any new push lands on a SHA
33+
with no `qa-gate` status → the check returns to pending → QA must re-bless the
34+
new head. This enforces the spec-075 rule ("PASS valid only while PR head ==
35+
qa_head_sha") in the merge button itself.
36+
37+
## Activation (run after the workflow + scripts are on `main`)
38+
39+
> Order matters: a required check that has **never produced a status** shows as
40+
> pending on every open PR and blocks normal merges immediately. Land
41+
> `native-tests.yml` and the scripts on `main` first, then **verify on a
42+
> non-native PR** (e.g. a dependency bump that touches none of the filtered
43+
> paths) that `swift-test` and `settings-parity` both report green (skipped) and
44+
> do **not** block it. Only after that confirmation, add the contexts to branch
45+
> protection.
46+
47+
Current required checks (do not drop them — the API call **replaces** the set):
48+
49+
```
50+
Lint, Unit Tests (ubuntu-latest, 1.25), Build (ubuntu-latest),
51+
Build (macos-latest), Build (windows-latest), Build Frontend,
52+
Validate PR title, Verify OpenAPI Artifacts
53+
```
54+
55+
Add the three new contexts (keeps `enforce_admins: false`):
56+
57+
```bash
58+
gh api -X PATCH repos/:owner/:repo/branches/main/protection/required_status_checks \
59+
-f strict=false \
60+
-f 'contexts[]=Lint' \
61+
-f 'contexts[]=Unit Tests (ubuntu-latest, 1.25)' \
62+
-f 'contexts[]=Build (ubuntu-latest)' \
63+
-f 'contexts[]=Build (macos-latest)' \
64+
-f 'contexts[]=Build (windows-latest)' \
65+
-f 'contexts[]=Build Frontend' \
66+
-f 'contexts[]=Validate PR title' \
67+
-f 'contexts[]=Verify OpenAPI Artifacts' \
68+
-f 'contexts[]=swift-test' \
69+
-f 'contexts[]=settings-parity' \
70+
-f 'contexts[]=qa-gate'
71+
```
72+
73+
Stage `qa-gate` last — only after the QATester is posting it — so open PRs are
74+
not blocked on a status that nobody emits yet. Until then, add just `swift-test`
75+
and `settings-parity` — they report on every PR (green/skipped on non-native
76+
PRs) thanks to the required-safe design above, so they will not strand open PRs.
77+
78+
## QATester contract
79+
80+
The `mcpproxy-qa` skill ("Merge Gate" + "Native macOS Tray Testing" sections)
81+
requires QATester to:
82+
83+
1. Treat every surface implied by the diff as mandatory — `native/macos/**`
84+
means the native tray (swift test green + behavioral assertions), never
85+
waived as "mirrors the frontend".
86+
2. Treat an interactive-surface `cannot_verify` as a **BLOCK**, not a low-risk
87+
pass (the MCP-1214 root cause).
88+
3. Post the gate status for the exact head SHA at the end of the run:
89+
`scripts/post-qa-gate-status.sh "$QA_HEAD_SHA" success|failure "..."`.
90+
91+
## Known follow-up
92+
93+
`native-tests.yml` skips a handful of pre-existing/environmental Swift test
94+
failures (AutoStart UserDefaults first-run, SSE-parser edge cases, a
95+
JSONEncoder behavior canary) so the gate is green today. Green those and remove
96+
the `--skip` flags to widen coverage.

frontend/src/components/settings/SettingField.vue

Lines changed: 12 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -130,7 +130,7 @@
130130
:value="modelValue ?? ''"
131131
:placeholder="field.placeholder"
132132
:data-test="`setting-text-${field.key}`"
133-
@input="emitVal(($event.target as HTMLInputElement).value)"
133+
@input="emitText(($event.target as HTMLInputElement).value)"
134134
/>
135135
<span v-if="validationError" class="text-error text-xs mt-1" :data-test="`setting-error-${field.key}`">{{ validationError }}</span>
136136
</div>
@@ -202,6 +202,17 @@ function emitVal(v: any) {
202202
emit('update:modelValue', v)
203203
}
204204
205+
// Text/duration input. A blank optional duration is "unset" — emit null
206+
// (reset to the default) rather than "" which the backend can't parse as a
207+
// duration. This mirrors the tri-state contract (nil = default).
208+
function emitText(raw: string) {
209+
if (props.field.control === 'duration' && props.field.optional && raw.trim() === '') {
210+
emitVal(null)
211+
return
212+
}
213+
emitVal(raw)
214+
}
215+
205216
function onNumber(raw: string) {
206217
if (raw === '') {
207218
emitVal('')

frontend/src/views/settings/fields.ts

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -109,7 +109,9 @@ export function validateField(field: SettingField, value: unknown): string | nul
109109
}
110110
if (field.control === 'duration') {
111111
const s = String(value ?? '').trim()
112-
if (s === '') return 'Enter a duration, e.g. 2m'
112+
// An optional duration left blank means "inherit the default" (tri-state
113+
// nil) and is valid; only a required duration must be non-empty.
114+
if (s === '') return field.optional ? null : 'Enter a duration, e.g. 2m'
113115
if (!DURATION_RE.test(s)) return 'Use a duration like 2m, 90s, or 1h30m'
114116
}
115117
if (field.valueKind && (field.control === 'text' || field.control === 'secret')) {
Lines changed: 30 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,30 @@
1+
import { describe, it, expect } from 'vitest'
2+
import { validateField, type SettingField } from '../../src/views/settings/fields'
3+
4+
// Mirrors the native tray's SettingsDiscoveryFieldsTests: an optional duration
5+
// left blank means "inherit the default" (tri-state nil) and must validate,
6+
// otherwise the Save button is stuck disabled.
7+
describe('validateField — optional duration (spec 074)', () => {
8+
const optional: SettingField = { key: 'tool_discovery_interval', label: 'X', control: 'duration', optional: true }
9+
const required: SettingField = { key: 'x', label: 'X', control: 'duration' }
10+
11+
it('treats a blank optional duration as valid (inherit default)', () => {
12+
expect(validateField(optional, '')).toBeNull()
13+
expect(validateField(optional, null)).toBeNull()
14+
expect(validateField(optional, undefined)).toBeNull()
15+
})
16+
17+
it('still rejects a blank required duration', () => {
18+
expect(validateField(required, '')).not.toBeNull()
19+
})
20+
21+
it('rejects a malformed duration regardless of optional', () => {
22+
expect(validateField(optional, 'abc')).not.toBeNull()
23+
})
24+
25+
it('accepts valid durations including 0s (disabled)', () => {
26+
expect(validateField(optional, '30s')).toBeNull()
27+
expect(validateField(optional, '1h30m')).toBeNull()
28+
expect(validateField(optional, '0s')).toBeNull()
29+
})
30+
})

native/macos/MCPProxy/MCPProxy/Settings/ConfigSettingsView.swift

Lines changed: 30 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -118,6 +118,18 @@ final class ConfigStore: ObservableObject {
118118
)
119119
}
120120

121+
/// Binding for an *optional* scalar field (tri-state durations). Reads
122+
/// nil/NSNull as an empty field; writes a blank value back as "unset"
123+
/// (nil → NSNull → JSON null = reset to default) instead of an empty
124+
/// string, so the field neither reads as dirty when absent nor sends an
125+
/// unparseable "" to the backend.
126+
func optionalStringBinding(_ key: String) -> Binding<String> {
127+
Binding(
128+
get: { coerceString(self.value(key)) },
129+
set: { self.setValue(key, optionalScalarStored($0)) }
130+
)
131+
}
132+
121133
func doubleBinding(_ key: String) -> Binding<Double> {
122134
Binding(
123135
get: { coerceDouble(self.value(key)) ?? 0 },
@@ -154,8 +166,18 @@ func coerceDouble(_ v: Any?) -> Double? {
154166
default: return nil
155167
}
156168
}
169+
/// True for the values that all mean "no value": nil (absent key), NSNull
170+
/// (explicit null) and an empty/whitespace-only string (the text-binding
171+
/// round-trip of a blank field). Treating these as one class keeps an
172+
/// untouched optional field from reading as an unsaved change.
173+
func isBlankValue(_ v: Any?) -> Bool {
174+
if v == nil || v is NSNull { return true }
175+
if let s = v as? String { return s.trimmingCharacters(in: .whitespaces).isEmpty }
176+
return false
177+
}
178+
157179
func valuesEqual(_ a: Any?, _ b: Any?) -> Bool {
158-
if a == nil && b == nil { return true }
180+
if isBlankValue(a) && isBlankValue(b) { return true }
159181
guard let a = a as? NSObject, let b = b as? NSObject else { return false }
160182
return a.isEqual(b)
161183
}
@@ -322,7 +344,13 @@ struct ConfigFieldRow: View {
322344
.multilineTextAlignment(.trailing).frame(width: 100)
323345
.textFieldStyle(.roundedBorder)
324346
case .text, .duration:
325-
TextField(field.control == .duration ? "2m" : "", text: store.stringBinding(field.key))
347+
// Duration fields are tri-state: an optional one stores a blank
348+
// value as "unset" (reset to default) via optionalStringBinding.
349+
// The placeholder is the field's real default (e.g. 30s / 5m).
350+
TextField(field.placeholder ?? "",
351+
text: (field.control == .duration && field.optional)
352+
? store.optionalStringBinding(field.key)
353+
: store.stringBinding(field.key))
326354
.frame(width: 200).textFieldStyle(.roundedBorder).font(.system(.body, design: .monospaced))
327355
case .secret:
328356
HStack(spacing: 4) {

native/macos/MCPProxy/MCPProxy/Settings/SettingsCatalog.swift

Lines changed: 23 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -26,6 +26,10 @@ struct ConfigField: Identifiable {
2626
var docs: String? = nil // doc path on docs.mcpproxy.app
2727
var valueKind: ConfigValueKind? = nil
2828
var optional: Bool = false
29+
// Greyed hint shown when the field is empty. For optional duration fields
30+
// this is the real default (e.g. "30s"), so a blank field reads as
31+
// "inherit the default" rather than a generic example.
32+
var placeholder: String? = nil
2933
// Danger confirm: present a confirm dialog before saving. For toggles,
3034
// only when the new value == dangerConfirmValue (nil = always confirm).
3135
var dangerMessage: String? = nil
@@ -158,7 +162,8 @@ enum SettingsCatalog {
158162
key: "call_tool_timeout",
159163
label: "Tool call timeout",
160164
help: "How long to wait for a single tool call before giving up. e.g. 2m, 90s, 30s.",
161-
control: .duration
165+
control: .duration,
166+
placeholder: "2m"
162167
),
163168
ConfigField(
164169
key: "logging.level",
@@ -262,8 +267,8 @@ enum SettingsCatalog {
262267
title: "Tool discovery & health checks",
263268
help: "How often mcpproxy probes upstream servers for liveness and re-discovers their tools. Lower these to reduce background traffic to chatty servers.",
264269
fields: [
265-
ConfigField(key: "health_check_interval", label: "Health-check interval", help: "How often to send a lightweight liveness ping to each connected server. \"0s\" disables the periodic probe (a dead server is then detected lazily on the next tool call). Range: 5s\u{2013}1h. Default 30s. Does not apply to Docker-isolated servers \u{2014} their liveness is monitored at the container level.", control: .duration, optional: true),
266-
ConfigField(key: "tool_discovery_interval", label: "Tool-discovery interval", help: "How often to re-list every server\u{2019}s tools to rebuild the search index. \"0s\" disables the periodic sweep \u{2014} tool changes are then picked up only at connect time and via tools/list_changed push notifications. Range: 30s\u{2013}24h. Default 5m.", control: .duration, optional: true),
270+
ConfigField(key: "health_check_interval", label: "Health-check interval", help: "How often to send a lightweight liveness ping to each connected server. \"0s\" disables the periodic probe (a dead server is then detected lazily on the next tool call). Range: 5s\u{2013}1h. Default 30s. Does not apply to Docker-isolated servers \u{2014} their liveness is monitored at the container level.", control: .duration, optional: true, placeholder: "30s"),
271+
ConfigField(key: "tool_discovery_interval", label: "Tool-discovery interval", help: "How often to re-list every server\u{2019}s tools to rebuild the search index. \"0s\" disables the periodic sweep \u{2014} tool changes are then picked up only at connect time and via tools/list_changed push notifications. Range: 30s\u{2013}24h. Default 5m.", control: .duration, optional: true, placeholder: "5m"),
267272
]
268273
),
269274
ConfigSection(
@@ -337,6 +342,17 @@ func configSet(_ obj: inout [String: Any], _ path: String, _ value: Any?) {
337342
assign(&obj, keys[...])
338343
}
339344

345+
/// Maps a text-field string for an *optional* scalar field (e.g. a tri-state
346+
/// duration) to its stored value: a blank string means "unset" — returned as
347+
/// nil so `configSet` writes NSNull → the PATCH body carries a literal JSON
348+
/// `null`, which resets the field to its built-in default. Any other string is
349+
/// stored verbatim. This keeps a blank optional field from being sent as ""
350+
/// (which the backend can't parse as a duration) and from reading as dirty
351+
/// against an absent key.
352+
func optionalScalarStored(_ s: String) -> Any? {
353+
s.trimmingCharacters(in: .whitespaces).isEmpty ? nil : s
354+
}
355+
340356
/// Assembles a nested object containing ONLY the given dot-path keys, read from
341357
/// `source`. This is the partial payload sent to PATCH /config.
342358
func buildPartial(_ source: [String: Any], _ keys: [String]) -> [String: Any] {
@@ -432,7 +448,10 @@ func validateConfigField(_ field: ConfigField, _ value: Any?) -> String? {
432448

433449
if field.control == .duration {
434450
let s = stringValue(value)
435-
if s.isEmpty { return "Enter a duration, e.g. 2m" }
451+
// An optional duration left blank means "inherit the default"
452+
// (tri-state nil): valid, and saved as JSON null. Only a required
453+
// duration must be non-empty.
454+
if s.isEmpty { return field.optional ? nil : "Enter a duration, e.g. 2m" }
436455
if !matches(durationRegex, s) { return "Use a duration like 2m, 90s, or 1h30m" }
437456
return nil
438457
}

0 commit comments

Comments
 (0)