bug: Common config sync overwrites custom Codex config.toml — MCP servers regressed to npx mcp-remote #3824

@hrygo

Summary

CC Switch v3.16.1's provider live sync performs a full-file rewrite of ~/.codex/config.toml, which destroys the [mcp_servers] section. Codex Desktop's external agent migration then re-adds missing MCP servers using npx mcp-remote, spawning unmanaged zombie processes.

Environment

  • CC Switch version: 3.16.1
  • OS: macOS 15.5 (Darwin 25.5.0)
  • Codex Desktop: 26.602.40724
  • Affected config: ~/.codex/config.toml

Root Cause (Source Code Verified)

Traced through src-tauri/src/services/ and src-tauri/src/mcp/codex.rs:

Step 1: Provider live sync — full-file rewrite

sync_current_to_live() → sync_codex_live() (config.rs:144) → write_codex_provider_live_with_catalog() (codex_config.rs:881) → write_codex_live_atomic() (codex_config.rs:84).

write_codex_live_atomic() writes the provider's settings_config.config (a TOML string stored in the DB providers table) as the entire file content (codex_config.rs:120):

fn write_codex_live_atomic(auth: &Value, config_text_opt: Option<&str>) -> Result<(), AppError> {
    // ...
    write_text_file(&config_path, &cfg_text)?;  // FULL REWRITE — no merge
    Ok(())
}

This config text does NOT contain [mcp_servers], so MCP servers are destroyed.

Step 2: MCP sync removes disabled servers

McpService::sync_all_enabled() (mcp.rs:180) iterates over all MCP servers in CC Switch's DB. For servers with enabled_codex=0, it calls remove_server_from_codex() (codex.rs:406), which removes the server entry from [mcp_servers] in config.toml.

Even if servers survived step 1, they would be removed here because enabled_codex=0 for all servers. The enabled_codex=0 state is expected because CC Switch's validation (validation.rs) only accepts stdio, http, and sse; streamable-http is rejected.

Note: The old sync_enabled_to_codex() function (codex.rs:283) is dead code since v3.7.0 — imported but never called. The comment at config.rs:168 confirms: "MCP sync in v3.7.0 is done through McpService, no longer called here."
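
To make the Step 2 decision concrete, here is a minimal sketch of how the per-server outcome could be modeled if streamable-http were accepted and unsupported types were left alone. This is not CC Switch's actual code: plan_sync, SyncAction, and SUPPORTED_TYPES are hypothetical names, and "websocket" in the usage note is only a stand-in for some unsupported type.

```rust
/// Hypothetical whitelist: the three types the issue says validation.rs
/// accepts today, plus the proposed streamable-http addition.
const SUPPORTED_TYPES: [&str; 4] = ["stdio", "http", "sse", "streamable-http"];

/// What the sync should do with one server's entry in config.toml.
#[derive(Debug, PartialEq)]
enum SyncAction {
    /// Write or update the entry under [mcp_servers].
    Write,
    /// Remove the entry (server is known but disabled for Codex).
    Remove,
    /// Do not touch the entry (type is not something we manage).
    LeaveUntouched,
}

/// Decide the action for a single server (assumed inputs: its type string
/// and whether it is enabled for Codex).
fn plan_sync(server_type: &str, enabled_codex: bool) -> SyncAction {
    // Unsupported types are skipped entirely instead of being deleted.
    if !SUPPORTED_TYPES.contains(&server_type) {
        return SyncAction::LeaveUntouched;
    }
    if enabled_codex {
        SyncAction::Write
    } else {
        SyncAction::Remove
    }
}
```

With this shape, plan_sync("streamable-http", true) yields Write, while an unrecognized type such as "websocket" yields LeaveUntouched whatever its enabled flag says, so a hand-written entry is never deleted just because validation cannot classify it.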

Step 3: Codex App external migration fills the gap

After CC Switch's sync clears [mcp_servers], Codex Desktop detects the missing MCP servers and re-adds them via its externalAgentConfig/import API, using the original npx mcp-remote format from its migration data.

Impact chain

CC Switch provider sync → [mcp_servers] destroyed → Codex migration → npx mcp-remote → zombie processes
  • Each Codex session spawns 3+ pairs of npx mcp-remote + node mcp-remote processes
  • These processes are never cleaned up — 54 zombie processes observed in a single session
  • Custom fields (model_catalog_json, disabled_tools) are also lost in the full rewrite

Before (user's config)

model_catalog_json = "/Users/user/.codex/cc-switch-model-catalog.json"
disabled_tools = ["view_image"]

[model_providers.zai]
base_url = "http://127.0.0.1:18765/v1"

[mcp_servers.web-search-prime]
type = "streamable-http"
url = "http://127.0.0.1:18765/web-search-prime"

After (CC Switch overwrite)

# model_catalog_json — GONE
# disabled_tools — GONE
# [mcp_servers] — DESTROYED by full rewrite

# Then Codex migration re-adds as:
[mcp_servers.web-search-prime]
command = "npx"
args = ["-y", "mcp-remote", "https://open.bigmodel.cn/api/mcp/web_search_prime/mcp", ...]

Why commonConfigEnabled: false Doesn't Help

The commonConfigEnabled flag controls whether the common config snippet (shared across providers) is applied. But the provider's OWN settings_config.config is always written unconditionally via write_codex_live_atomic() — this is the path that destroys [mcp_servers].

Workaround

Set the macOS user-immutable flag with chflags uchg so that nothing can write to the file:

chflags uchg ~/.codex/config.toml   # Lock
chflags nouchg ~/.codex/config.toml # Unlock for editing

Suggested Fix

  1. Support streamable-http MCP type: Add streamable-http to the validation whitelist (validation.rs) alongside stdio, http, sse. This would allow sync_single_server_to_codex() to correctly write streamable-http entries.

  2. Use targeted TOML merge instead of full rewrite: In write_codex_live_atomic(), read the existing config.toml and merge only the managed fields, preserving unrecognized sections like [mcp_servers], [plugins], [marketplaces], etc.

  3. Skip MCP removal when server type is unsupported: If a server has an unsupported type (e.g., streamable-http), don't remove it from config.toml — leave it untouched instead of deleting it.
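
As a rough illustration of fix 2, the sketch below merges the provider's config text into the existing file instead of replacing it. It is a deliberately naive, std-only, line-based approach and not the actual write_codex_live_atomic() logic: split_tables, key_of, and merge_preserving_unmanaged are hypothetical names, and it assumes that "managed" means any top-level key or table that appears in the provider text. It does not handle multi-line arrays or strings, so a real implementation would more likely use a format-preserving TOML editor such as the toml_edit crate.

```rust
/// Split TOML text into top-level lines and (header, block) tables.
/// Naive assumption: every table header is a line that starts with '['.
fn split_tables(text: &str) -> (Vec<String>, Vec<(String, String)>) {
    let mut top: Vec<String> = Vec::new();
    let mut tables: Vec<(String, String)> = Vec::new();
    for line in text.lines() {
        let trimmed = line.trim();
        if trimmed.starts_with('[') {
            // Start a new table block with its header line.
            tables.push((trimmed.to_string(), format!("{line}\n")));
        } else if let Some((_, block)) = tables.last_mut() {
            // Line belongs to the most recent table.
            block.push_str(line);
            block.push('\n');
        } else {
            // Line appears before any table: a top-level key, comment, or blank.
            top.push(line.to_string());
        }
    }
    (top, tables)
}

/// Extract the key of a single-line `key = value` pair, if any.
fn key_of(line: &str) -> Option<&str> {
    let trimmed = line.trim();
    if trimmed.is_empty() || trimmed.starts_with('#') {
        return None;
    }
    trimmed.split_once('=').map(|(key, _)| key.trim())
}

/// Merge `managed` (the provider's config text) over `existing`, keeping
/// every top-level key and table that `managed` does not define itself.
fn merge_preserving_unmanaged(existing: &str, managed: &str) -> String {
    let (old_top, old_tables) = split_tables(existing);
    let (new_top, new_tables) = split_tables(managed);
    let mut out = String::new();

    // 1. Managed top-level lines win.
    for line in &new_top {
        out.push_str(line);
        out.push('\n');
    }
    // 2. Keep existing top-level keys the managed text does not set.
    for line in &old_top {
        if let Some(key) = key_of(line) {
            if !new_top.iter().any(|l| key_of(l) == Some(key)) {
                out.push_str(line);
                out.push('\n');
            }
        }
    }
    // 3. Managed tables replace same-named existing tables.
    for (_, block) in &new_tables {
        out.push_str(block);
    }
    // 4. Keep existing tables the managed text does not define,
    //    e.g. [mcp_servers.*], [plugins], [marketplaces].
    for (header, block) in &old_tables {
        if !new_tables.iter().any(|(h, _)| h == header) {
            out.push_str(block);
        }
    }
    out
}
```

Applied to the "Before" config above with a provider text that only defines [model_providers.zai], the result keeps model_catalog_json, disabled_tools, and [mcp_servers.web-search-prime] while still updating the provider table. One trade-off of this rule is that tables from a previously active provider would also survive, so a real fix would need an explicit notion of which sections CC Switch owns.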
