Skip to content

feat(config): layered config tool filter — config_denied over user toggles#468

Merged
Dumbris merged 12 commits into
smart-mcp-proxy:mainfrom
nlaurance:feat/config-tool-allowlist
May 18, 2026
Merged

feat(config): layered config tool filter — config_denied over user toggles#468
Dumbris merged 12 commits into
smart-mcp-proxy:mainfrom
nlaurance:feat/config-tool-allowlist

Conversation

@nlaurance
Copy link
Copy Markdown
Contributor

@nlaurance nlaurance commented May 14, 2026

Summary

  • Replaces the BBolt-mutating applyConfigToolFilter (which overwrote user-owned Disabled flags on every reconnect) with a pure evaluation-time config layer
  • enabled_tools / disabled_tools in mcp_config.json are now enforced without touching user preferences, emitting spurious audit events, or silently re-enabling tools the user manually disabled

What changed

Backend

  • ServerConfig.IsToolAllowedByConfig(toolName) — pure config helper
  • Runtime.IsToolConfigDenied(serverName, toolName) — evaluation-time check, no BBolt writes
  • isToolCallable (MCP enforcement chokepoint) checks config layer before BBolt record.Disabled — config-denied tools are hard off across all paths (call_tool_*, retrieve_tools, upstream_servers tool counts)
  • SetAllToolsEnabled skips config-denied tools so bulk Enable All never sets a misleading Disabled=false on a hard-off tool
  • GET /api/v1/servers/{id}/tools exposes config_denied: true per tool
  • POST /api/v1/servers/{id}/tools/{tool}/enabled returns HTTP 409 when trying to enable a config-denied tool
  • UpstreamRecord now persists EnabledTools / DisabledTools through BBolt round-trips (bug fix)

Frontend

  • Tools with config_denied: true show a locked by config badge and a read-only label instead of a toggle
  • Quarantine panel hides the Approve button for config-denied pending/changed tools (shows a lock badge instead)
  • Tool interface in api.ts now properly types disabled, config_denied, approval_status, etc.

Effective visibility model

callable = configAllows(tool) AND userEnabled(tool)

Config can only restrict further — it can never force a tool back on against the user.

Test Plan

  • Unit tests: go test ./internal/config/... ./internal/runtime/... ./internal/server/... ./internal/httpapi/... ./internal/storage/... — all pass
  • Frontend: cd frontend && npm run build — clean build
  • Config with enabled_tools: ["list_issues"]create_issue unreachable via MCP, shows locked in UI
  • Config with disabled_tools: ["delete_repo"]delete_repo unreachable, locked in UI
  • User disabling a tool in the UI survives server reconnect (no longer reset by config)
  • POST /api/v1/servers/{id}/tools/delete_repo/enabled with {"enabled": true} → 409 Conflict
  • Pending/changed quarantine tool that is also config-denied shows lock badge, not Approve button

🤖 Generated with Claude Code

@Dumbris
Copy link
Copy Markdown
Member

Dumbris commented May 15, 2026

Thanks @nlaurance — clean implementation and genuinely thorough tests. I support the idea of config-based tool filtering — declaring tool visibility in mcp_config.json is the right capability for fresh installs and gitops setups, and the allowlist's fail-closed posture against tool-surface growth is exactly in our threat model.

My concern is purely the interaction model with the per-tool enable/disable that recently landed (#463 Web UI, #447, REST/CLI SetToolEnabled). applyConfigToolFilter writes config intent into the same ToolApprovalRecord.Disabled BBolt flag those paths own, on every applyDifferentialToolUpdate (i.e. every reconnect). Concretely this means:

  • There's no durable provenanceToolApprovalRecord has no source field, so after the first write a config-disabled tool is indistinguishable from a user-disabled one. updatedBy="config" is dropped.
  • A tool the user manually disabled in the UI that's in enabled_tools gets silently re-enabled on every reconnectTestApplyConfigToolFilter_EnabledTools_ReEnablesTool pins this as intended, but that's the behavior I'd push back on.
  • It emits a tool_disabled activity event on every reconnect, polluting the audit trail.
  • A config-denied tool keeps its pending/changed quarantine status, so upstream inspect/approve still invites the user to approve a tool config will keep hidden.

Definition I'd like us to agree on

For me, config-based filtering should mean hard-off for unwanted tools, and for that to be solid it has to be transparent, predictable, and surfaced in the UI and CLI — never a silent toggle that snaps back. Precisely:

  • Effective visibility = configAllows(tool) AND userEnabled(tool). Config can only restrict further; it can never force a tool back on against the user.
  • A tool denied by config (in disabled_tools, or absent from a non-empty enabled_tools allowlist) is HARD OFF.
  • Config-denied tools are shown in the Web UI and CLI as locked + badged ("disabled by mcp_config.json") — toggle is read-only, and the server rejects an override attempt from a stale client.
  • A config-denied tool that is also pending/changed in quarantine shows the config lock, not an "approve me" affordance.
  • Config-allowed tools remain normally user-toggleable.

Spec use cases

  1. disabled_tools: ["delete_repo"]delete_repo HARD OFF, UI shows it locked+badged, upstream inspect shows SOURCE=config. User cannot enable it.
  2. enabled_tools: ["list_issues"], server also exposes create_issuecreate_issue is HARD OFF (implicit deny), locked+badged. list_issues is a normal toggle the user may still disable.
  3. User disables list_issues (in enabled_tools) via the UI → it stays disabled across reconnects; config does not re-enable it.
  4. Operator has an allowlist set and the upstream server adds new tools → new tools are denied by default, no manual action needed.
  5. A pending tool that's also config-denied → shown as config-locked, not surfaced for approval.

Implementation-wise this means config should override the stored DB enable state at evaluation time rather than overwriting it: evaluate the config layer in isToolCallable (the single chokepoint that already covers all enforcement paths + code_execution), keep record.Disabled exclusively user-owned, expose a derived config_managed field on the tools API so the UI/CLI can lock+badge, and stop emitting per-reconnect audit events.

Does this definition work for you? If we're aligned on it, I'm happy to pair on this PR and convert the current behavior to override the DB option after config read along these lines — the config fields + mutual-exclusion validation you've already written are solid and stay as-is.

@nlaurance nlaurance changed the title feat(config): enabled_tools/disabled_tools per-server allowlist/denylist in mcp_config.json feat(config): layered config tool filter — config_denied over user toggles May 18, 2026
@nlaurance
Copy link
Copy Markdown
Contributor Author

now rebased with changes suggested

nlaurance and others added 11 commits May 18, 2026 10:50
…enylist

Adds two mutually exclusive fields to ServerConfig that let operators
declare tool visibility statically in mcp_config.json rather than
having to call the API or CLI after every fresh install.

  enabled_tools: ["list_issues", "get_issue"]   // allowlist — only these visible
  disabled_tools: ["delete_repo", "force_push"]  // denylist  — hide these, allow rest

Config validation rejects a server that has both fields set.

On every applyDifferentialToolUpdate (server connect / tool refresh),
applyConfigToolFilter walks the in-memory config, computes the desired
enabled/disabled state for each discovered tool, and calls
setToolEnabledNoEmit to persist it in BBolt. All existing enforcement
paths (isToolCallable, retrieve_tools pre-ranking, call_tool_*) pick
up the change automatically with no further modifications.

Five unit tests cover: allowlist disables unlisted tools, allowlist
re-enables a tool moved back into the list, denylist disables listed
tools, no-op when neither field is set, and end-to-end integration
through applyDifferentialToolUpdate.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…lConfigDenied

Replace the BBolt-writing applyConfigToolFilter function (which overwrote
user preferences on every reconnect, emitted spurious audit events, and had
no provenance) with an evaluation-time IsToolConfigDenied method that:

- Reads config at call time, never writes to BBolt
- Preserves user-set tool preferences and audit trail
- Enables separation of concerns: config vs user intent

Key changes:
- Delete applyConfigToolFilter from tool_quarantine.go (63 lines removed)
- Add IsToolConfigDenied(serverName, toolName string) bool to Runtime
- Remove applyConfigToolFilter call from lifecycle.go
- Rewrite tool_config_filter_test.go: 5 new tests for IsToolConfigDenied
  * AllowList: tools not listed are denied
  * DenyList: listed tools are denied
  * NoFilter: all tools allowed when config has no filter
  * UnknownServer: returns false for missing servers
  * UserDisabledPreserved: BBolt state is independent from config layer

All 198 runtime tests pass. No behavior change to actual tool visibility—
the config layer is now just evaluated at call time instead of persisted.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add IsToolConfigDenied delegation on *Server and insert a config-layer
check in isToolCallable so tools denied by enabled_tools/disabled_tools
in the server config are hard-off at MCP call time, evaluated at
runtime without touching BBolt storage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…fig-denied tools

- Add ConfigDenied bool field to contracts.Tool (json: config_denied,omitempty)
- Enrich config_denied in handleGetServerTools via IsToolConfigDenied interface
- Return HTTP 409 in handleSetToolEnabled when req.Enabled is true for a config-denied tool
- Remove debug fmt.Printf lines from enrichment loop; use logger.Debug instead

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extended the Tool interface to include runtime fields that the backend
sends and Vue components use: server_name, schema, usage, last_used,
approval_status, disabled, and config_denied. This allows proper typing
of these fields in the frontend instead of using unsafe casts.

Simplified type assertions in isToolConfigDenied and isToolEnabled
functions to use the properly-typed Tool interface directly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@nlaurance nlaurance force-pushed the feat/config-tool-allowlist branch from d349489 to 3f24ce3 Compare May 18, 2026 09:52
…mcp-proxy#468 smart-mcp-proxy#1-smart-mcp-proxy#4)

smart-mcp-proxy#1 (bug): config-denied tools returned the generic 'Tool is disabled
and not callable' over MCP — identical to a user-toggled tool, but the
remediation differs (edit mcp_config.json vs a UI toggle that 409s).
blockedToolMessage() now branches on IsToolConfigDenied and tells the
agent it is operator policy, not user-overridable. Split into a pure
blockedToolMessageFor(bool) with a dedicated unit test.

smart-mcp-proxy#2: config-lock badges badge-error (red/alarming) -> badge-neutral + 🔒;
a policy lock is not an error.
smart-mcp-proxy#3: 'Enable All' toast now reports 'N tools remain locked by config'
(client-side from config_denied; no API contract change).
smart-mcp-proxy#4: unified copy — stray 'config locked' label -> '🔒 locked by config'.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Member

@Dumbris Dumbris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Layered config tool filter is the right design (config restricts, never force-on; user prefs preserved across reconnect). Added the 4 UX-review fixes in 7aa683b: status-aware TOOL_BLOCKED message (config-policy vs user-disable), neutral lock badges, Enable-All skip feedback, copy consistency. All CI green; local config/runtime/storage/httpapi + server blocked-tool unit tests pass. Follow-up (agent-discoverable disabled tools) tracked as spec 049.

@Dumbris Dumbris merged commit 3ffa41e into smart-mcp-proxy:main May 18, 2026
23 checks passed
Dumbris added a commit that referenced this pull request May 18, 2026
…er notes)

Related #468

check-size is pre-existing/advisory (CLAUDE.md was 39347 chars at base,
already >25k; #468 merged through the same failing check). This commit
minimizes 049's footprint regardless.
Dumbris added a commit that referenced this pull request May 18, 2026
* docs(spec): agent-discoverable disabled tools design

Follow-up to PR #468. Opt-in include_disabled on retrieve_tools +
conditional counts on upstream_servers, 5-state classifier, reactive
discovery nudges. Token-cost-zero until exercised.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(049): speckit spec for agent-discoverable disabled tools

Converts the brainstormed design (docs/superpowers/specs/2026-05-18-...)
into canonical speckit format. Opt-in include_disabled discovery,
5-state status + remediation map, conditional per-server counts,
reactive nudges. No enforcement/storage change.

Related #468

## Changes
- specs/049-agent-discoverable-disabled-tools/spec.md
- specs/049-agent-discoverable-disabled-tools/checklists/requirements.md

## Testing
- N/A (spec only); requirements checklist all items pass

* docs(049): speckit plan + phase-0/1 artifacts

Related #468

## Changes
- plan.md (constitution check PASS, no violations)
- research.md (decisions + codebase facts; no open unknowns)
- data-model.md (5-state status enum, additive response shapes)
- contracts/mcp-deltas.md (retrieve_tools + upstream_servers deltas)
- quickstart.md (curl + live-MCP verification recipe)
- CLAUDE.md active-technologies updated by speckit agent-context

## Testing
- N/A (planning docs only)

* docs(049): speckit tasks (18, TDD, by user story)

Related #468

## Changes
- tasks.md: Setup, Foundational (classifier+types), US1 MVP, US2, US3, Polish

## Testing
- N/A (task plan only)

* feat(049): foundational classifier + additive response types

Related #468

## Changes
- contracts: DisabledToolStatus consts, LockedToolEntry, ServerToolCounts
- runtime: ClassifyDisabledTool (pure, 5-state precedence, read-only)
- tests: precedence/unknown/pending table + 1k benchmark (6.6 ns/op)

## Testing
- go test ./internal/runtime -run TestClassifyDisabledTool: PASS
- BenchmarkClassifyDisabledTool: 6.6 ns/op (<<100ms budget)

* feat(049): US1 — opt-in include_disabled tool discovery

Related #468

## Changes
- retrieve_tools: include_disabled param + one-line schema hint
- handler: split callable/disabled, agent-scope before classify,
  cap min(limit,10), once-per-response remediation map
- p.classifyDisabledTool (storage-sourced, mirrors runtime classifier)
- in-memory include_disabled usage counter (no persistence)

## Testing
- TestDisabledDiscovery_{DefaultPathUnchanged,OptIn,CapAtTen,UsageCounter}: PASS
- regressions (ExcludesDisabled, CallBlockedTool, BlockedToolMessageFor): PASS

* feat(049): US2 — discovery pointer + zero-result nudge

Related #468

## Changes
- blockedToolMessageFor: append include_disabled:true pointer (both branches)
- retrieve_tools: when 0 callable + droppedCount>0 + flag off, emit a
  one-line 'notice' count nudge (no inline entries)

## Testing
- TestBlockedToolMessage_DiscoveryPointer, TestDisabledDiscovery_ZeroResultNudge: PASS

* feat(049): US3 — conditional per-server tool counts

Related #468

## Changes
- upstream_servers list/get: emit ServerToolCounts only when a
  non-callable count > 0 (all-callable servers gain 0 bytes)
- classifyServerToolStatus: single storage-sourced truth (runtime-
  independent); classifyDisabledTool now delegates to it (no drift)

## Testing
- TestServerToolCounts_Conditional + full server/runtime suites: PASS

* docs(049): built-in tools note for include_disabled + server counts

Related #468

* test(049): live curl+MCP verification results + task closeout

Related #468

## Changes
- quickstart.md: verification-results table (§1-§5 PASS), documents the
  pre-existing #468 config->runtime disabled_tools gap (out of 049 scope)
- tasks.md: all phases checked off

## Testing
- runtime/server/contracts unit suites: PASS; verify-oas: PASS
- live MCP+curl: default unchanged, include_disabled, counts, blocked msg

* docs(049): trim CLAUDE.md additions (speckit auto-append cruft + terser notes)

Related #468

check-size is pre-existing/advisory (CLAUDE.md was 39347 chars at base,
already >25k; #468 merged through the same failing check). This commit
minimizes 049's footprint regardless.

---------

Co-authored-by: Claude Code <noreply@anthropic.com>
Dumbris added a commit that referenced this pull request May 18, 2026
…477)

The merged #476 classifyServerToolStatus read config-denial only from the
storage ServerConfig copy. Config-file disabled_tools on stdio servers
live in the live runtime config and are not always mirrored to storage,
so disabled_by_config classification + upstream_servers counts were wrong
at runtime for that case (unit tests passed because the no-runtime
harness uses the storage fallback path).

Prefer p.mainServer.runtime.IsToolConfigDenied (same authority
isToolCallable/blockedToolMessage use); fall back to storage
IsToolAllowedByConfig only when no runtime is wired.

Adds a runtime regression test pinning that ClassifyDisabledTool reads
live config (the authority the server layer now delegates to).

Related #468
Dumbris added a commit that referenced this pull request May 18, 2026
Pairs with the Runtime.Close Docker-verify fix. The fast unit step
now passes -short (activating the testing.Short guards) with a 4m
race timeout for headroom; the heavy property/timing tests run
unguarded in the main-only stress-tests job so coverage is preserved.

Related #468
Dumbris added a commit that referenced this pull request May 18, 2026
…waste (#478)

* fix(ci): skip Docker-cleanup verification when isolation unused

Root cause of the chronic E2E red (failing on every main commit for
weeks; v0.31.0/.1 shipped through it): Runtime.Close() and
Server.Stop() always ran a Docker container-cleanup verification —
a docker ps probe + a 15x1s poll loop + 2s sleep (~17s) — even when
no Docker isolation was ever used. Every runtime-constructing unit
test paid ~17s in t.Cleanup; across ~dozens of tests under -race the
internal/runtime package hung past 12 min, blowing CI's
-race -timeout 2m step.

Fix: add Manager.UsesDockerIsolation() (same predicate as the Docker
recovery monitor) and gate both cleanup-verification blocks on it.
No isolation => no managed containers => nothing to verify. Production
behavior with Docker isolation is unchanged.

Also add testing.Short() guards to the heaviest property/timing tests
(TestRapidQuarantineStateMachine ~270s, TestRapidInvariant ~70s, and
five ~18s timing tests) so the fast unit step stays lean; they still
run in the main-only stress-tests job.

Result (local, exact CI condition -short -race -timeout 2m):
- internal/runtime: hang >12m  -> ok 20s
- internal/server:  slow       -> ok 18s
- internal/upstream/contracts: ok

Related #468

* ci(e2e): -short the fast unit step; heavy tests move to stress job

Pairs with the Runtime.Close Docker-verify fix. The fast unit step
now passes -short (activating the testing.Short guards) with a 4m
race timeout for headroom; the heavy property/timing tests run
unguarded in the main-only stress-tests job so coverage is preserved.

Related #468
Dumbris added a commit that referenced this pull request May 18, 2026
The first revision string-interpolated t.TempDir() into a JSON literal;
on windows-amd64 the backslash path produced invalid JSON, failing the
test (caught by the Build Binaries windows job). Marshal the config via
encoding/json so the path is escaped on every OS.

Related #468
Dumbris added a commit that referenced this pull request May 18, 2026
…ression guard) (#479)

* fix(configsvc): deep-copy tool-filter slices in Snapshot.Clone

Investigation of the reported 'config-file disabled_tools does not reach
runtime' concern found NO reproducible code bug: every boundary (file
parse, storage Save/Get/List, and the full path
parse->New->LoadConfiguredServers->SaveConfiguration->IsToolConfigDenied)
provably preserves enabled_tools/disabled_tools. The original manual
observation was a test-environment artifact (reused /tmp data-dir while
SaveConfiguration rewrites the config file + hot-reload watcher).

The investigation did surface one real latent defect: Snapshot.Clone()
deep-copied Args/Headers/Env/OAuth/Isolation but shallow-aliased the
spec-049 EnabledTools/DisabledTools slices via 'clonedSrv := *srv'. Not
data loss (reads see the aliased value) but an immutability violation:
mutating a cloned config's filter would corrupt the shared backing array.

- Deep-copy EnabledTools/DisabledTools in Snapshot.Clone
- TestSnapshotClone_DeepCopiesToolFilters (aliasing regression)
- TestConfigFileToolFilter_ReachesRuntime (permanent full-path guard)
- correct the spec-049 quickstart note (no #468 gap exists)

Related #468

* test(049): make TestConfigFileToolFilter_ReachesRuntime cross-platform

The first revision string-interpolated t.TempDir() into a JSON literal;
on windows-amd64 the backslash path produced invalid JSON, failing the
test (caught by the Build Binaries windows job). Marshal the config via
encoding/json so the path is escaped on every OS.

Related #468
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants