Add runtime settings and discovery surfaces by Keith-CY · Pull Request #663 · Keith-CY/melix

Keith-CY · 2026-05-11T07:35:29Z

Summary

Add machine-readable runtime settings and discovery surfaces for CLI and the local HTTP gateway.
Persist user runtime settings under MELIX_HOME (or ~/.melix), add project settings overrides, and report setting sources and probe timings.
Expose discovery payloads for info, capabilities, instructions, schema, config metadata, local paths, update receipts, and same-family Qwen3.5 alias suggestions.
Add local HTTP discovery endpoints at /.well-known/melix.json, /api/capabilities, /api/instructions, and /api/config-metadata.

Closes #641.

Plan or Spec

docs/plans/2026-05-11-runtime-settings-discovery-surfaces.md

Commands Run

swift test --filter 'MelixCLIParserTests/parsesRuntimeSettingsAndDiscoveryCommands|MelixCLIParserTests/rejectsMalformedRuntimeSettingsAndDiscoveryCommands|MelixCLIRunnerTests/settingsShowResolvesPrecedenceAndReportsSourceMetadata|MelixCLIRunnerTests/settingsSetValidateAndResetMutateOnlyUserSettingsFile|MelixCLIRunnerTests/settingsValidationReportsInvalidDocumentsKeysAndValues|MelixCLIRunnerTests/settingsMutationCommandsRejectUnknownKeysInvalidValuesAndMalformedStores|MelixCLIRunnerTests/infoDiscoveryReadsLocalUpdateChannelReceiptsWithoutNetwork|MelixCLIRunnerTests/discoveryCommandsExposeMachineReadablePayloads'
# Passed: 8 Swift Testing tests.

swift test --package-path services/control-plane-swift --filter 'OpenAIHandlerTests/getDiscoveryEndpointsExposeMachineReadableLocalRuntimeContracts|OpenAIHandlerTests/runtimeDiscoveryContractsExposeStableAliasesLinksMetadataAndOnboardingEndpoints|OpenAIHandlerTests/getCapabilitiesDiscoveryRendersAllModelResidencyStates'
# Passed: 3 Swift Testing tests.

swift test --enable-code-coverage --filter 'MelixCLIParserTests|MelixCLIRunnerTests/settingsShowResolvesPrecedenceAndReportsSourceMetadata|MelixCLIRunnerTests/settingsSetValidateAndResetMutateOnlyUserSettingsFile|MelixCLIRunnerTests/settingsValidationReportsInvalidDocumentsKeysAndValues|MelixCLIRunnerTests/settingsMutationCommandsRejectUnknownKeysInvalidValuesAndMalformedStores|MelixCLIRunnerTests/infoDiscoveryReadsLocalUpdateChannelReceiptsWithoutNetwork|MelixCLIRunnerTests/discoveryCommandsExposeMachineReadablePayloads'
# Passed: 86 Swift Testing tests.

swift test --package-path services/control-plane-swift --enable-code-coverage --filter 'OpenAIHandlerTests/getDiscoveryEndpointsExposeMachineReadableLocalRuntimeContracts|OpenAIHandlerTests/runtimeDiscoveryContractsExposeStableAliasesLinksMetadataAndOnboardingEndpoints|OpenAIHandlerTests/getCapabilitiesDiscoveryRendersAllModelResidencyStates|OpenAIHandlerTests/getHealthReportsOkWhenAllRoutesAreReadyAndPinnedModelsCountAsReady|OpenAIHandlerTests/getHealthReportsMissingRouteClientsAsFalseWhenARegistryIsPresent'
# Passed: 5 Swift Testing tests.

git diff --check
# Passed.

python3 scripts/swift_changed_line_coverage.py --binary .build/arm64-apple-macosx/debug/melixPackageTests.xctest/Contents/MacOS/melixPackageTests --profdata .build/arm64-apple-macosx/debug/codecov/default.profdata --diff-from origin/main Sources/MelixCLICore/MelixCLI.swift Sources/MelixCLICore/MelixCLICommandCodec.swift Sources/MelixCLICore/MelixHome.swift Sources/MelixCLICore/MelixRuntimeDiscovery.swift tests/MelixCLITests/MelixCLIParserTests.swift tests/MelixCLITests/MelixCLIRunnerTests.swift
# Passed: TOTAL 95.51% (893/935).

python3 scripts/swift_changed_line_coverage.py --binary services/control-plane-swift/.build/arm64-apple-macosx/debug/MelixControlPlanePackageTests.xctest/Contents/MacOS/MelixControlPlanePackageTests --profdata services/control-plane-swift/.build/arm64-apple-macosx/debug/codecov/default.profdata --diff-from origin/main services/control-plane-swift/Sources/HTTPGateway/APIOnboardingSnapshotSource.swift services/control-plane-swift/Sources/HTTPGateway/OpenAI/OpenAIHandler.swift services/control-plane-swift/Sources/HTTPGateway/RuntimeDiscoveryPayloads.swift services/control-plane-swift/Sources/Support/RuntimeDiscoveryContracts.swift services/control-plane-swift/Tests/HTTPGatewayTests/OpenAIHandlerTests.swift
# Passed: TOTAL 99.82% (550/551).

Coverage and Metrics

CLI changed-line coverage: 95.51% (893/935).
Control-plane changed-line coverage: 99.82% (550/551).
Added CLI probe metrics:
- settings_resolve_ms
- settings_write_ms
- settings_validate_ms
- discovery_build_ms
Added HTTP discovery latency metrics:
- operator.discovery_well_known_latency_ms
- operator.discovery_capabilities_latency_ms
- operator.discovery_instructions_latency_ms
- operator.discovery_config_metadata_latency_ms

Known Gaps

Full make bootstrap, make proto, make py-test, and make integration-test were not run; this change is scoped to Swift CLI/control-plane discovery surfaces and does not change protobuf schemas or Python code.

Evidence Checklist

Relevant plan or spec is identified.
Behavior changes are reflected in the relevant docs.
Protocol changes include regenerated generated artifacts.
Dependency changes include updated lockfiles.
Relevant tests were run.
A metrics report is included, or N/A is stated explicitly with the reason.
Deferred work and known gaps are stated explicitly.

gemini-code-assist

Code Review

This pull request introduces a comprehensive machine-readable contract for runtime settings and discovery surfaces across the Melix CLI and HTTP gateway. It implements a hierarchical settings store with a defined precedence (CLI flags, environment variables, project-level, and user-level settings) and adds several discovery endpoints, including info, capabilities, instructions, and configuration metadata. The review feedback identifies several areas for improvement: hardening the manual CLI argument parsing logic to prevent incorrect token consumption, using more reliable methods for determining the current working directory, preventing silent truncation during integer type coercion, refining version string detection in pyproject.toml, and utilizing monotonic clocks for accurate latency measurements.
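The precedence order described above (CLI flags, then environment variables, then project settings, then user settings, then defaults) can be sketched as a first-match lookup over ordered layers. This is only an illustration: `SettingSource`, `ResolvedSetting`, and `resolveSetting` are hypothetical names, not types from the PR, and the real store may resolve values differently.

```swift
// Hypothetical sketch; none of these names are taken from the PR.
enum SettingSource: String {
    case cliFlag = "cli_flag"
    case environment
    case project
    case user
    case defaultValue = "default"
}

struct ResolvedSetting {
    let value: String
    let source: SettingSource
}

/// Returns the first value found for `key`, checking layers from highest
/// to lowest precedence, together with the layer that supplied it.
func resolveSetting(
    key: String,
    cliFlags: [String: String],
    environment: [String: String],
    project: [String: String],
    user: [String: String],
    defaults: [String: String]
) -> ResolvedSetting? {
    let layers: [(SettingSource, [String: String])] = [
        (.cliFlag, cliFlags),
        (.environment, environment),
        (.project, project),
        (.user, user),
        (.defaultValue, defaults),
    ]
    for (source, values) in layers {
        if let value = values[key] {
            return ResolvedSetting(value: value, source: source)
        }
    }
    return nil
}
```

Returning the source alongside the value is what makes it possible to report where each setting came from.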

gemini-code-assist · 2026-05-11T07:40:41Z

+                } else if token.hasPrefix("--") {
+                    optionArguments.append(token)
+                    let valueIndex = index + 1
+                    guard valueIndex < tail.count else {
+                        throw MelixCLIError.missingValue(token)
+                    }
+                    optionArguments.append(tail[valueIndex])
+                    index += 1


This manual flag separation logic is fragile. It assumes that any flag starting with -- (except for the explicitly handled --json) always takes exactly one argument. If an unknown boolean flag is provided, or if a flag is misspelled (e.g., --jsno), this logic will incorrectly consume the next token as its value, which often results in confusing 'missing required KEY/VALUE' errors because a positional argument was stolen. Consider using a more robust argument parsing strategy or explicitly checking against a set of known flags that require values.
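One way to act on this suggestion is to check each token against explicit sets of known flags, so that a misspelled flag is rejected instead of consuming the next token. The sketch below is an assumption about how that could look: `SplitArguments`, `ArgumentSplitError`, `splitArguments`, and the `--scope` flag used in the usage note are all hypothetical and do not come from the PR.

```swift
// Hypothetical sketch; the PR's parser and error type are not shown here.
enum ArgumentSplitError: Error, Equatable {
    case unknownFlag(String)
    case missingValue(String)
}

struct SplitArguments: Equatable {
    var positionals: [String] = []
    var options: [String: String] = [:]
    var switches: Set<String> = []
}

/// Splits a command tail without guessing flag arity: only flags listed in
/// `valueFlags` consume the following token, and unknown flags are rejected.
func splitArguments(
    _ tail: [String],
    valueFlags: Set<String>,
    booleanFlags: Set<String>
) throws -> SplitArguments {
    var result = SplitArguments()
    var index = 0
    while index < tail.count {
        let token = tail[index]
        if booleanFlags.contains(token) {
            result.switches.insert(token)
        } else if valueFlags.contains(token) {
            let valueIndex = index + 1
            guard valueIndex < tail.count else {
                throw ArgumentSplitError.missingValue(token)
            }
            result.options[token] = tail[valueIndex]
            index += 1  // skip the value that was just consumed
        } else if token.hasPrefix("--") {
            throw ArgumentSplitError.unknownFlag(token)
        } else {
            result.positionals.append(token)
        }
        index += 1
    }
    return result
}
```

With `valueFlags: ["--scope"]` and `booleanFlags: ["--json"]`, the tail `["--jsno", "key"]` fails with `unknownFlag("--jsno")` rather than swallowing `key` as a value.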

gemini-code-assist · 2026-05-11T07:40:41Z

+    }
+
+    private func projectRootURL() -> URL {
+        let rawPath = environment["PWD"]?.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""


Using the PWD environment variable to determine the current working directory can be unreliable, as it may not be synchronized with the process's actual working directory if it has changed. FileManager.default.currentDirectoryPath is the standard and more reliable source of truth for the process's current directory in Swift.

Suggested change

-        let rawPath = environment["PWD"]?.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
+        let rawPath = fileManager.currentDirectoryPath
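A possible middle ground, sketched here as an assumption rather than as the PR's behavior, is to treat `PWD` only as a hint and accept it when it resolves to the same directory as the process's actual working directory. `resolveProjectRoot` is a hypothetical helper name.

```swift
import Foundation

/// Hypothetical helper: prefers the process's real working directory and
/// honors PWD only when it names that same directory (for example, a
/// symlinked spelling of it).
func resolveProjectRoot(
    environment: [String: String],
    fileManager: FileManager = .default
) -> URL {
    let actual = URL(fileURLWithPath: fileManager.currentDirectoryPath, isDirectory: true)
    let hint = environment["PWD"]?.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
    guard !hint.isEmpty else { return actual }

    let hinted = URL(fileURLWithPath: hint, isDirectory: true)
    // A stale PWD will not resolve to the real cwd, so it is ignored.
    let sameDirectory =
        hinted.resolvingSymlinksInPath().path == actual.resolvingSymlinksInPath().path
    return sameDirectory ? hinted : actual
}
```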

gemini-code-assist · 2026-05-11T07:40:41Z

+    private func coerce(_ rawValue: Any, definition: MelixRuntimeSettingMetadata) throws -> Any {
+        switch definition.valueType {
+        case "int":
+            if let number = rawValue as? NSNumber {
+                return NSNumber(value: number.intValue)
+            }
+            if let value = rawValue as? Int {
+                return NSNumber(value: value)
+            }
+            if let string = rawValue as? String, let parsed = Int(string) {
+                return NSNumber(value: parsed)
+            }


The coercion logic for int types silently truncates floating-point numbers (e.g., 2.9 becomes 2). This can lead to unexpected behavior if a user provides a non-integer value for a setting that strictly expects an integer. It is safer to validate that the number is actually an integer before coercing it.

Suggested change

     private func coerce(_ rawValue: Any, definition: MelixRuntimeSettingMetadata) throws -> Any {
         switch definition.valueType {
         case "int":
             if let number = rawValue as? NSNumber {
+                let doubleValue = number.doubleValue
+                guard doubleValue == floor(doubleValue) else {
+                    break
+                }
                 return NSNumber(value: number.intValue)
             }
             if let value = rawValue as? Int {
                 return NSNumber(value: value)
             }
             if let string = rawValue as? String, let parsed = Int(string) {
                 return NSNumber(value: parsed)
             }
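The same idea can be expressed without `NSNumber` by using `Int(exactly:)`, which returns `nil` for values that are not whole numbers. The sketch below is standalone and simplified: `coerceToInt` and `CoercionError` are hypothetical names, and it does not reproduce the PR's `coerce(_:definition:)` signature or its metadata type.

```swift
// Hypothetical sketch of strict integer coercion; not the PR's implementation.
enum CoercionError: Error {
    case notAnInteger(String)
}

/// Accepts Int, whole-number Double, and integer-formatted String values.
/// Fractional values such as 2.9 are rejected instead of being truncated.
func coerceToInt(_ rawValue: Any) throws -> Int {
    switch rawValue {
    case let value as Int:
        return value
    case let value as Double:
        // Int(exactly:) is nil for fractional, NaN, and out-of-range values.
        guard let exact = Int(exactly: value) else {
            throw CoercionError.notAnInteger(String(value))
        }
        return exact
    case let value as String:
        guard let parsed = Int(value) else {
            throw CoercionError.notAnInteger(value)
        }
        return parsed
    default:
        throw CoercionError.notAnInteger(String(describing: rawValue))
    }
}
```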

gemini-code-assist · 2026-05-11T07:40:41Z

+        if let text = try? String(contentsOf: pyprojectURL, encoding: .utf8) {
+            for line in text.split(separator: "\n") {
+                let trimmed = line.trimmingCharacters(in: .whitespacesAndNewlines)
+                guard trimmed.hasPrefix("version") else {


The check trimmed.hasPrefix("version") is too broad and will match keys like version-control or versioning in pyproject.toml. It should be refined to ensure it only matches the specific version key, typically by checking for a following space or equals sign.

Suggested change

-                guard trimmed.hasPrefix("version") else {
+                guard trimmed.hasPrefix("version ") || trimmed.hasPrefix("version=") else {
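An alternative to prefix checks is to split each line at the first `=` and compare the trimmed key for equality, which also handles tabs or multiple spaces before the equals sign. This is a rough sketch with a hypothetical function name, `parseVersion(fromPyproject:)`. It is not a TOML parser: it ignores table headers and trailing comments, so it returns the first `version` key in any table.

```swift
import Foundation

/// Hypothetical helper: returns the value of the first line whose key is
/// exactly `version`, with surrounding quotes removed.
func parseVersion(fromPyproject text: String) -> String? {
    for line in text.split(whereSeparator: \.isNewline) {
        guard let equals = line.firstIndex(of: "=") else { continue }
        let key = line[..<equals].trimmingCharacters(in: .whitespaces)
        // Exact key match, so `version-control` and `versioning` are skipped.
        guard key == "version" else { continue }
        let rawValue = line[line.index(after: equals)...]
            .trimmingCharacters(in: .whitespaces)
        return rawValue.trimmingCharacters(in: CharacterSet(charactersIn: "\"'"))
    }
    return nil
}
```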

gemini-code-assist · 2026-05-11T07:40:41Z

    }

+    private func handleDiscoveryWellKnown() async throws -> HTTPResponse {
+        let startedAt = Date()


Using Date() for latency measurements is discouraged because wall-clock time is not monotonic and can jump due to system clock adjustments (e.g., NTP synchronization). For accurate duration measurements, use ContinuousClock (available in Swift 5.7+) or DispatchTime.
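As a minimal sketch of the `DispatchTime` option mentioned above, the helper below times a closure with `DispatchTime.now().uptimeNanoseconds`, which is not affected by wall-clock adjustments. `measureMilliseconds` is a hypothetical name, and the sketch is synchronous for brevity, whereas the handler in the diff is `async`.

```swift
import Dispatch

/// Hypothetical helper: runs `work` and reports elapsed milliseconds using
/// a monotonic uptime counter instead of wall-clock time.
func measureMilliseconds<T>(_ work: () throws -> T) rethrows -> (result: T, elapsedMs: Double) {
    let start = DispatchTime.now().uptimeNanoseconds
    let result = try work()
    let end = DispatchTime.now().uptimeNanoseconds
    // Nanoseconds to milliseconds.
    return (result, Double(end - start) / 1_000_000)
}
```

A handler could then wrap its payload construction in the closure and record `elapsedMs` as the latency metric.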

github-actions · 2026-05-11T07:52:12Z

Melix PR Scoped Performance Report

Status: ok
Changed files: 13
Selected probes: 1
Direct/gated probes: 1
Regressions: 0
Context regressions: 0
Verification failures: 0

Changed Files

Sources/MelixCLICore/MelixCLI.swift
Sources/MelixCLICore/MelixCLICommandCodec.swift
Sources/MelixCLICore/MelixHome.swift
Sources/MelixCLICore/MelixRuntimeDiscovery.swift
apps/macos-menubar/Tests/MenuBarTests/Phase8LoRAWindowSmokeTests.swift
docs/plans/2026-05-11-runtime-settings-discovery-surfaces.md
services/control-plane-swift/Sources/HTTPGateway/APIOnboardingSnapshotSource.swift
services/control-plane-swift/Sources/HTTPGateway/OpenAI/OpenAIHandler.swift
services/control-plane-swift/Sources/HTTPGateway/RuntimeDiscoveryPayloads.swift
services/control-plane-swift/Sources/Support/RuntimeDiscoveryContracts.swift
services/control-plane-swift/Tests/HTTPGatewayTests/OpenAIHandlerTests.swift
tests/MelixCLITests/MelixCLIParserTests.swift
tests/MelixCLITests/MelixCLIRunnerTests.swift

Swift CLI JSON envelope encoding

Status: ok
Gate: direct
Targeted tests: pass
Coverage: pass

Metric	Base	Head	Delta	Status
elapsed_ms_mean	372072.538	42654.221	-329418.317 (-88.54%)	improvement

Zigfreidish

Review: Add runtime settings and discovery surfaces

The overall design is solid — a curated registry with explicit precedence (CLI flag → env → project → user → default), stable schema versions, and thorough test coverage for parsing, mutation, and HTTP discovery. Two correctness issues need to be addressed before merging.

Issue 1 — Hardcoded version in HTTP discovery (`RuntimeDiscoveryPayloads.swift`)

wellKnownPayload() returns "version": "0.0.0-dev" unconditionally, while the CLI infoPayload() calls installedVersion() which reads pyproject.toml. Consumers of /.well-known/melix.json will always see 0.0.0-dev regardless of the actual installed version. Either expose installedVersion() to HTTPRuntimeDiscoveryPayloads (or pass a version string at construction time), or omit the version key from the HTTP well-known payload if it can't be resolved reliably in that context.
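The "pass a version string at construction time" option could look roughly like the sketch below. `WellKnownPayloadBuilder` and the `name` key are hypothetical stand-ins, since the actual shape of the gateway's payload type is not shown in this review. Only `wellKnownPayload()` and the `version` key are taken from the discussion.

```swift
/// Hypothetical stand-in for the gateway's payload builder.
struct WellKnownPayloadBuilder {
    /// Resolved once by the caller (for example, by the same logic the CLI
    /// uses); nil when the version cannot be determined in this context.
    let installedVersion: String?

    func wellKnownPayload() -> [String: String] {
        var payload = ["name": "melix"]  // illustrative key, not from the PR
        if let version = installedVersion {
            payload["version"] = version
        }
        // When the version is unknown the key is omitted rather than
        // reporting a placeholder such as "0.0.0-dev".
        return payload
    }
}
```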

Issue 2 — Dead ternary branches in the runner (`MelixCLI.swift`)

Several case blocks in the runner have ternary expressions where both branches are identical, so the json flag has no effect on the non-JSON path:

// settingsShow
return options.json ? try prettyJSON(payload) : try prettyJSON(payload)

// info, capabilities, instructions, schema, configMetadata — same pattern

settingsSet, settingsValidate, and settingsReset do have distinct branches (human-readable text vs JSON), so the intent is clearly there. The discovery commands need either a human-readable fallback or the guards in the parsers (which already enforce --json for most commands) should be relied upon and the dead branch removed.

Minor observations (non-blocking)

writeSettingsDocument first-run: melixHome.writeAtomically(_:to:) will fail if ~/.melix/ doesn't exist yet on a fresh install. If MelixHome initialization guarantees the directory exists this is fine — worth confirming.
projectRootURL() fallback: Falls back to FileManager.default.currentDirectoryPath when PWD is unset. This is standard practice but worth noting that in sandboxed or non-TTY contexts currentDirectoryPath may not match the user's working directory.
_parse_relaxed_object_arguments comma limitation (also noted in #868): values containing commas will be misparsed. Acceptable for a fallback path; a doc comment or test documenting the known limitation would help.
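For the first-run observation, one defensive pattern is to create the parent directory before the atomic write. The sketch below uses plain Foundation calls and a hypothetical function name, `writeSettingsAtomically`. It does not show how `melixHome.writeAtomically(_:to:)` is implemented, and it may be unnecessary if `MelixHome` already guarantees the directory exists.

```swift
import Foundation

/// Hypothetical helper: ensures the parent directory exists, then writes
/// the file atomically.
func writeSettingsAtomically(
    _ data: Data,
    to fileURL: URL,
    fileManager: FileManager = .default
) throws {
    let directory = fileURL.deletingLastPathComponent()
    // With intermediate directories enabled this succeeds when the
    // directory already exists.
    try fileManager.createDirectory(at: directory, withIntermediateDirectories: true)
    try data.write(to: fileURL, options: .atomic)
}
```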

What's good

Settings precedence logic and source metadata are clean and deterministic.
validate() is non-throwing and aggregates all errors rather than failing fast — good UX for misconfigured environments.
Model alias discovery correctly passes through local paths and full owner/model IDs without clobbering them.
HTTP discovery endpoints are correctly added to the authorizationRoute health bypass so they don't require auth.
Test coverage is comprehensive across parser, runner, and HTTP handler layers.

Please fix the two issues above and this is ready to merge.

Generated by Claude Code

Zigfreidish

The architecture is solid — stable schema versions, clear precedence chain (CLI flag → env → project → user → default), atomic writes, and thorough test coverage (95%+ changed-line coverage on both packages). Two things to address before merge:

Dead-code ternaries (MelixCLI.swift): every new runner case has options.json ? prettyJSON(payload) : prettyJSON(payload) — both branches are the same call. See inline comment.
Hardcoded version in HTTP well-known payload (RuntimeDiscoveryPayloads.swift): the gateway always returns "0.0.0-dev" while the CLI correctly reads pyproject.toml. Any client that checks the version via /.well-known/melix.json will always see a dev version. See inline comment.

Generated by Claude Code

Zigfreidish · 2026-05-11T21:06:29Z

Both branches of the ternary are identical — plain-text path is dead code.

Every new discovery/settings runner case does:

return options.json ? try prettyJSON(payload) : try prettyJSON(payload)

Both arms call prettyJSON, so the options.json flag has no effect on output format. This is misleading since it implies a human-readable non-JSON path exists. Since the CLI parsers already require --json for most of these commands (throwing a usage error otherwise), the simplest fix is to drop the ternary and call try prettyJSON(payload) unconditionally. For the commands where --json is optional (settingsSet, settingsValidate, settingsReset), consider wiring up a concise text rendering in the else branch, or document that JSON-only output is the intended contract.
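If a text rendering is wanted for the commands where `--json` is optional, the two branches could diverge along the lines of the sketch below. The `prettyJSON` here is a stand-in built on `JSONSerialization` and is not the PR's helper. `plainText` and `render` are hypothetical names, and the payload is simplified to `[String: String]`.

```swift
import Foundation

/// Stand-in for the PR's prettyJSON helper (assumed behavior).
func prettyJSON(_ payload: [String: String]) throws -> String {
    let data = try JSONSerialization.data(
        withJSONObject: payload,
        options: [.prettyPrinted, .sortedKeys]
    )
    return String(decoding: data, as: UTF8.self)
}

/// Hypothetical concise text rendering: one `key: value` line per entry.
func plainText(_ payload: [String: String]) -> String {
    payload.keys.sorted()
        .map { "\($0): \(payload[$0] ?? "")" }
        .joined(separator: "\n")
}

/// The json flag now selects between two different renderers.
func render(_ payload: [String: String], json: Bool) throws -> String {
    if json {
        return try prettyJSON(payload)
    }
    return plainText(payload)
}
```

For commands that already require `--json` at parse time, calling `prettyJSON` unconditionally remains the simpler fix.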

Generated by Claude Code

Zigfreidish · 2026-05-11T21:06:41Z

version is hardcoded as "0.0.0-dev" in the HTTP well-known payload.

"version": "0.0.0-dev",

The CLI MelixRuntimeDiscoveryBuilder.infoPayload() reads the real version from pyproject.toml via installedVersion(). The HTTP gateway always reports 0.0.0-dev, making /.well-known/melix.json unreliable for any client that checks the installed version.

MelixRuntimeDiscoveryBuilder already encapsulates this logic; the simplest fix is to expose its infoPayload() from the HTTP layer too (or factor installedVersion() into MelixRuntimeDiscoveryContracts so both callers share it).

Generated by Claude Code

gemini-code-assist Bot reviewed May 11, 2026

Zigfreidish requested changes May 11, 2026

Keith-CY and others added 4 commits May 12, 2026 23:21

Add runtime settings discovery surfaces

3ded76d

Address runtime discovery review feedback

5615dd9

Stabilize phase 8 LoRA smoke export fixture

75329a0

Isolate phase 8 LoRA smoke evaluation export

842c40c

Zigfreidish force-pushed the codex/issue-641-runtime-discovery branch from 19a5ddf to 842c40c on May 12, 2026 15:01

Keith-CY wants to merge 4 commits into main from codex/issue-641-runtime-discovery
