Canonry MCP Stdio Adapter

Canonry is CLI/API-first. MCP exists to make that same public surface easier to use from MCP clients such as Claude Desktop, Codex, and custom agent shells that prefer a tool catalog over shell commands or raw HTTP.

MCP is useful here because many agent clients can discover typed tools, validate arguments, and call them without asking the user to compose curl or canonry ... --format json invocations. It is not more authoritative than the API or CLI. canonry-mcp is an adapter over createApiClient() only, so it must not expose capabilities that do not already exist through Canonry's public API/CLI.

New public API/CLI capabilities should get MCP parity by default. If a capability is intentionally not exposed as an MCP tool, classify its OpenAPI operation as deferred or excluded-protocol in packages/canonry/src/mcp/openapi-classification.ts and include the reason there. Credential, bearer-token, browser-session, and other high-risk operations may be deferred, but they should be explicit exceptions rather than silent omissions.

Install

Install Canonry normally:

npm install -g @ainyc/canonry

The package exposes one MCP executable:

canonry-mcp

canonry-mcp itself stays out of the main CLI to keep stdio clean: telemetry, help text, or stray logs would corrupt the protocol. The main CLI does ship two helpers, install and config, that only read or write client config files:

canonry mcp install --client claude-desktop
canonry mcp install --client cursor --read-only
canonry mcp config  --client codex            # print snippet for clients without auto-install

install merges a canonry MCP server entry into the client's config, creating the file if needed and backing up the original to <config>.canonry.bak. It is idempotent: re-running with the same flags is a no-op. config prints the snippet to stdout for copy-paste or for clients without auto-install (currently Codex CLI, since it uses TOML). Both helpers accept --name <server> to install under a custom key, --read-only to scope to the 69 read API tools, and --format json for machine-readable output. install additionally accepts --dry-run.
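The idempotent merge described above can be sketched as a pure function. This is a minimal illustration, not Canonry's implementation: the type names and mergeServerEntry are hypothetical, and the sketch leaves the backup and file I/O to the caller.

```typescript
// Hypothetical types: the real config shape is owned by each MCP client.
type McpServerEntry = { command: string; args: string[] };
interface ClientConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

// Returns the merged config plus a flag saying whether anything changed.
// An unchanged result means the caller can skip the backup and the write.
function mergeServerEntry(
  config: ClientConfig,
  name: string,
  entry: McpServerEntry,
): { config: ClientConfig; changed: boolean } {
  const existing = config.mcpServers?.[name];
  const same =
    existing !== undefined &&
    existing.command === entry.command &&
    existing.args.length === entry.args.length &&
    existing.args.every((arg, i) => arg === entry.args[i]);
  if (same) return { config, changed: false };
  return {
    // Unrelated keys and other servers are preserved as-is.
    config: { ...config, mcpServers: { ...config.mcpServers, [name]: entry } },
    changed: true,
  };
}
```

Returning a changed flag rather than always rewriting the file is one way to make a second run with the same flags a true no-op.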

Auth

canonry-mcp inherits the normal local config at ~/.canonry/config.yaml through createApiClient().

For a local server, use the same config created by canonry init and run canonry serve. For a remote API, set apiUrl and apiKey in ~/.canonry/config.yaml. MCP adds no OAuth flow, token storage, or alternate auth path.

Client Config

Claude Desktop:

{
  "mcpServers": {
    "canonry": {
      "command": "canonry-mcp",
      "args": []
    }
  }
}

Read-only mode:

{
  "mcpServers": {
    "canonry": {
      "command": "canonry-mcp",
      "args": ["--read-only"]
    }
  }
}

Codex-style TOML:

[mcp_servers.canonry]
command = "canonry-mcp"
args = []

Tool Surface

v1 is curated for client usability: 114 API tools (77 read in --read-only) plus two meta-tools (canonry_help, canonry_load_toolkit). It covers:

  • projects, plus project-overview and search composites
  • citation/mention trend analytics (canonry_analytics_metrics)
  • cited-source rankings (canonry_analytics_sources: the full ranked + per-provider + classified cited-domain surface)
  • aggregated per-query mention/citation stats with sample size (canonry_visibility_stats: confidence-aware proportions, optional per-provider)
  • config apply, runs, snapshots, insights, health
  • query generation and replacement, legacy keyword aliases, competitor add/remove, schedules, settings
  • GSC reads and GA reads
  • GBP local-AEO reads (incl. canonry_gbp_attributes, the owner-set Business Profile attributes across categories, and canonry_gbp_places, the Places rendered-listing cross-reference)
  • server-side traffic ingestion (Cloud Run / WordPress / Vercel connect/sync + async backfill + crawler/AI-referral rollup reads)
  • OpenAI ads paid-surface reads + sync (canonry_ads_status / _campaigns / _insights / _summary / _sync: campaign snapshots with context hints, daily spend rollups in integer micros)
  • the doctor health-check (Google/GA auth diagnostics)
  • Technical AEO site audits (canonry_technical_aeo_run / _score / _pages / _trend: sitemap-driven per-page audit + site score)
  • run trigger/cancel, schedule updates, insight dismiss
  • content gap/target/source analysis, the winnabilityClass gate (canonry_content_map: per-domain cited-surface classifications), and structured brief synthesis (canonry_content_brief, gated to ownable targets)
  • source-aware backlinks (canonry_backlinks_domains reads either Common Crawl or Bing Webmaster via source; canonry_backlinks_sources reports per-source availability)
  • durable Aero memory (list/set/forget), agent transcript clear, agent webhook attach/detach
  • the tracked-basket discovery pipeline (start a session, list sessions, inspect probes, harvest the model's issued search-query fan-out as gated candidate seeds, preview promotion candidates, promote cited + aspirational findings from a completed session into the tracked basket)

canonry_apply_config accepts one config-as-code project document per call. For multi-document YAML or multiple project files, agents should call the tool once per project document. canonry_queries_generate returns suggestions only; persist accepted suggestions with canonry_queries_add or replace the tracked set with canonry_queries_replace. The canonry_keywords_* tools remain as legacy aliases over the same query store for older clients.
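A client-side loop for multi-document YAML might look like the sketch below. It rests on several assumptions: the CallTool signature stands in for whatever your MCP client exposes, the config argument key is a guess rather than the tool's documented schema, and the splitter is deliberately naive.

```typescript
// Naive splitter: treats a line that is exactly `---` as a document separator.
// It ignores YAML edge cases (block scalars containing `---`, `...` end markers),
// so a real client should prefer a YAML parser's multi-document API.
function splitYamlDocuments(source: string): string[] {
  return source
    .split(/^---[ \t]*$/m)
    .map((doc) => doc.trim())
    .filter((doc) => doc.length > 0);
}

// Hypothetical tool-call signature; substitute your MCP client's own method.
type CallTool = (name: string, args: Record<string, unknown>) => Promise<unknown>;

// Applies each project document with its own tool call and returns the count.
async function applyEachProject(callTool: CallTool, source: string): Promise<number> {
  const docs = splitYamlDocuments(source);
  for (const doc of docs) {
    // One canonry_apply_config call per project document, in order.
    // The `config` argument name is an assumption, not the tool's documented schema.
    await callTool("canonry_apply_config", { config: doc });
  }
  return docs.length;
}
```

Calls are awaited one at a time so that a failure on one project document stops the loop before later documents are applied.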

Deferred from v1: Aero ask SSE, OAuth callbacks, raw screenshots, project delete, snapshot generation, broad admin/provider writes, Google/Bing/GA connect/sync/inspect/indexing writes, WordPress writes, CDP screenshot, generic notifications, backlinks, raw OpenAPI, and raw HTTP escape hatches.

Some write tools compose existing API calls rather than using a native atomic endpoint. The agent webhook attach/detach tools are best-effort under concurrent calls until the public API grows narrower attach/detach operations for that domain.

canonry_project_upsert and canonry_apply_config use PUT semantics: fields omitted from the request are reset to their defaults, so pass the full intended project shape. For multi-project configs, loop on the client side with one canonry_apply_config call per project document.
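Because of the PUT semantics, a client that wants to change one field would typically read the current project first and send the merged result. The sketch below illustrates that pattern only; ProjectShape and its field names are hypothetical and do not reflect the actual project schema.

```typescript
// Hypothetical project shape, used only to illustrate PUT semantics.
type ProjectShape = {
  name: string;
  domain: string;
  queries: string[];
  competitors: string[];
};

// Start from the current project, overlay the intended changes, and send the
// whole result. Sending only `patch` would reset every omitted field.
function buildFullUpsert(current: ProjectShape, patch: Partial<ProjectShape>): ProjectShape {
  return { ...current, ...patch };
}
```

For example, overlaying { queries: [...] } on a fetched project keeps its existing competitors in the request body instead of dropping them.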

Progressive Tool Discovery

The full 80-tool API catalog costs roughly 17k tokens of definitions every session. Most sessions touch a handful of tools, so canonry-mcp defaults to a small core tier (~10 tools, ~3k tokens) and registers the rest on demand via notifications/tools/list_changed.

Core tier (always loaded):

  • canonry_help — list available toolkits and which are loaded
  • canonry_load_toolkit — register a toolkit's tools for the rest of the session
  • canonry_projects_list, canonry_project_get
  • canonry_project_overview — composite read for "how is project X doing?"
  • canonry_search — composite text search across snapshots and insights
  • canonry_doctor — run health checks (Google/GA auth, redirect URI, scopes, providers); filter by check id or wildcard
  • canonry_settings_get
  • canonry_apply_config, canonry_run_trigger, canonry_run_cancel
  • canonry_agent_webhook_attach

Toolkits (loaded on demand):

| Toolkit | What's in it | When to load |
| --- | --- | --- |
| monitoring | runs list/latest/get, project history, timeline, snapshots list/diff, insights list/get, health latest/history, content targets/sources/gaps, canonry_report (aggregated AEO report bundle) | Investigating regressions, comparing runs, reviewing insights/health, surfacing content opportunities, generating client-facing reports |
| setup | project export/upsert, queries list/add/remove/replace/generate, legacy keyword aliases, competitors list/add/remove, schedule get/set/delete, insight dismiss, backlinks domains | Onboarding a project, editing queries/competitors/schedules, reviewing backlink coverage |
| gsc | google connections list, GSC performance, inspections, coverage, coverage history, sitemaps, deindexed | Indexing, coverage, sitemap analysis from Google Search Console |
| ga | GA status, traffic, coverage, AI/social referral history, social/attribution trends, session history | Traffic, referral, attribution data from Google Analytics 4 |
| ads | OpenAI ads (ChatGPT ads) connection status, campaign/ad-group snapshots incl. context hints, daily paid-performance rollups (spend in integer micros, derived ctr/cpc), composite summary, ads-sync trigger | The project runs ChatGPT ads and you need paid performance or campaign structure |
| traffic | List sources, source detail (24h totals + latest run), windowed crawler/AI-referral events, Cloud Run / WordPress / Vercel connect, sync, async backfill (replaces hourly rollups in a --days window with current classifier output) | Confirming server-log evidence of crawler hits or AI-referral sessions (e.g. GPTBot, ChatGPT-User), wiring up / syncing a Cloud Run, WordPress, or Vercel traffic source, or one-shot reclassifying historical logs after a classifier change |
| agent | Aero memory list/set/forget, agent clear, agent webhook detach | Reading or writing project-scoped Aero notes, clearing a stuck conversation, removing an agent webhook |
| discovery | Start a discovery session (canonry_discover_run_start), list sessions, inspect a session's probes + buckets, harvest the model's issued search-query fan-out as gated candidate seeds (canonry_discover_harvest), preview promotion candidates, and promote a completed session's cited + aspirational queries plus recurring competitors into the tracked basket (canonry_discover_promote) | Expanding or auditing a project's tracked-query basket, auditing competitive surface, adopting discovered findings into the project |

Loading a toolkit is idempotent and persists for the rest of the session; there is no unload. canonry_load_toolkit returns { status: 'loaded' | 'already-loaded' | 'empty', name, tools }. The server coalesces all enable/disable side effects into one notifications/tools/list_changed per call, fired just before the response, so a single call refreshes the client's catalog once regardless of how many tools the toolkit contains.
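A client can model the three status values as a union and branch on them. In this sketch the type and function names are hypothetical, and tools is assumed to be a list of tool names, which the text above does not specify.

```typescript
// Result shape as described above; `tools` is assumed to be a list of tool names.
type LoadToolkitResult = {
  status: "loaded" | "already-loaded" | "empty";
  name: string;
  tools: string[];
};

// Summarize what a load call did so the caller knows whether to expect new tools.
function describeLoad(result: LoadToolkitResult): string {
  switch (result.status) {
    case "loaded":
      return `${result.name}: ${result.tools.length} tool(s) newly registered`;
    case "already-loaded":
      return `${result.name}: already registered, nothing changed`;
    case "empty":
      return `${result.name}: no tools available under the current scope`;
  }
}
```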

Wait for the response before pipelining

canonry_load_toolkit runs the enable side effect synchronously inside the call's handler, but the newly registered tools only become callable after the response is returned to the client. Always await the response before issuing a tools/call for a tool that the toolkit just enabled. Pipelining the two requests on the same connection (sending tools/call for canonry_insights_list immediately after canonry_load_toolkit without awaiting the load response) can race the registration and produce MCP error -32602: Tool ... disabled. Sequenced clients (Claude Desktop, Cursor, Codex) already wait by default; only batch test harnesses or custom clients risk this.
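For custom clients, the safe ordering can be sketched as follows. ToolClient is a hypothetical minimal interface, not a real SDK type, and the name argument key passed to canonry_load_toolkit is an assumption about its input schema.

```typescript
// Hypothetical minimal client interface; real MCP clients expose their own API.
interface ToolClient {
  callTool(name: string, args: Record<string, unknown>): Promise<unknown>;
}

// Load a toolkit, wait for its response, and only then call a tool from it.
async function loadThenCall(
  client: ToolClient,
  toolkit: string,
  tool: string,
  args: Record<string, unknown>,
): Promise<unknown> {
  // Awaiting here is the important part: the tool is callable only after
  // the load response has come back.
  // The `name` argument key is an assumption about the tool's input schema.
  await client.callTool("canonry_load_toolkit", { name: toolkit });
  return client.callTool(tool, args);
}
```

The racy variant would start both calls before awaiting either, for instance by passing them to Promise.all together.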

Eager mode

Power-user environments (scripts, Aero, telemetry harnesses) that want the flat 78-tool catalog at startup, including the two meta-tools, can opt back in with --eager (or CANONRY_MCP_EAGER=1):

{
  "mcpServers": {
    "canonry": { "command": "canonry-mcp", "args": ["--eager"] }
  }
}

--eager and --read-only compose: canonry-mcp --eager --read-only registers every read tool eagerly.
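How the two switches combine can be pictured with a small resolver. Only --eager, --read-only, and CANONRY_MCP_EAGER=1 come from the text above; the function, its type, and the parsing approach are illustrative assumptions rather than the adapter's actual startup code.

```typescript
type StartupMode = { eager: boolean; readOnly: boolean };

// The two settings are independent, so every combination is valid.
// `env` is passed in explicitly to keep the function pure and testable.
function resolveStartupMode(
  argv: string[],
  env: Record<string, string | undefined>,
): StartupMode {
  return {
    eager: argv.includes("--eager") || env["CANONRY_MCP_EAGER"] === "1",
    readOnly: argv.includes("--read-only"),
  };
}
```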

Read-only scope and toolkits

--read-only filters out write tools before the catalog is built, so toolkits with no read tools appear as empty from canonry_load_toolkit. Mixed toolkits load with whatever survives the filter — the agent toolkit, for example, drops its writes (canonry_memory_set, canonry_memory_forget, canonry_agent_clear, canonry_agent_webhook_detach) and exposes only canonry_memory_list under read-only scope.
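The filter-then-group behavior can be illustrated with the agent toolkit named above. The CatalogTool type and its access field are hypothetical; only the tool names and their read/write split come from the text.

```typescript
// Hypothetical catalog entry; `access` marks whether a tool mutates state.
type CatalogTool = { name: string; toolkit: string; access: "read" | "write" };

// Select one toolkit's tools, dropping writes when read-only scope is active.
function toolsForToolkit(catalog: CatalogTool[], toolkit: string, readOnly: boolean): string[] {
  return catalog
    .filter((tool) => tool.toolkit === toolkit)
    .filter((tool) => !readOnly || tool.access === "read")
    .map((tool) => tool.name);
}

// Agent toolkit as listed above: one read tool and four writes.
const agentCatalog: CatalogTool[] = [
  { name: "canonry_memory_list", toolkit: "agent", access: "read" },
  { name: "canonry_memory_set", toolkit: "agent", access: "write" },
  { name: "canonry_memory_forget", toolkit: "agent", access: "write" },
  { name: "canonry_agent_clear", toolkit: "agent", access: "write" },
  { name: "canonry_agent_webhook_detach", toolkit: "agent", access: "write" },
];
```

Under this sketch, a toolkit whose tools are all writes yields an empty list in read-only scope, which corresponds to the empty status.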

Read-only API keys (auto-detection)

A read-only API key (canonry key create --read-only, scopes ['read']) is rejected by the API on every write HTTP method (403 FORBIDDEN). To avoid advertising tools that would 403 at call time, canonry-mcp probes GET /keys/self at startup and, when its configured key is read-only, auto-restricts the catalog to read tools — exactly as if --read-only had been passed — and prints a one-line notice on stderr. The probe is best-effort: if the API is unreachable or the server predates the endpoint, the adapter keeps the requested scope. A read-only key can only ever narrow the catalog, never widen it; passing --read-only explicitly skips the probe.
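The narrowing rules above can be written as a small decision function. This is a sketch under assumptions: probe is a hypothetical callback wrapping GET /keys/self, and the three KeyProbe outcomes are an illustrative simplification of whatever the endpoint actually returns.

```typescript
type Scope = "full" | "read-only";
// Outcome of the startup probe; "unknown" covers an unreachable API or a
// server without the endpoint.
type KeyProbe = "read-only" | "read-write" | "unknown";

// `probe` is a hypothetical callback wrapping GET /keys/self.
async function resolveScope(
  requestedReadOnly: boolean,
  probe: () => Promise<KeyProbe>,
): Promise<Scope> {
  // An explicit --read-only skips the probe entirely.
  if (requestedReadOnly) return "read-only";
  let result: KeyProbe;
  try {
    result = await probe();
  } catch {
    result = "unknown"; // best-effort: keep the requested scope on failure
  }
  // The probe can narrow the catalog but never widen it.
  return result === "read-only" ? "read-only" : "full";
}
```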

Safety Rules

MCP uses stdio, so any normal stdout write breaks the protocol. Code under packages/canonry/src/mcp/ must not use console.log, process.stdout.write, CLI dispatch, telemetry, logger imports, DB imports, route imports, or job-runner imports. Tool handlers call createApiClient() only.

Tool input schemas are Zod schemas tied to packages/contracts and exposed as JSON Schema for MCP clients. Canonry API/client errors and Zod input-validation errors return MCP tool results with isError: true and a structured { "error": { "code", "message", "details" } } envelope (VALIDATION_ERROR for bad input, with details.issues listing the per-field problems). Malformed JSON-RPC and unknown tools remain MCP protocol errors.
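As an illustration of the envelope, the sketch below builds a validation-error tool result. Several details are assumptions: the envelope is serialized as a text content item, each issue is shown as a path/message pair, and the message wording is invented; only isError, the envelope keys, VALIDATION_ERROR, and details.issues come from the text above.

```typescript
type ErrorEnvelope = { error: { code: string; message: string; details?: unknown } };
// Minimal slice of an MCP tool result: text content plus the isError flag.
type ToolResult = { content: { type: "text"; text: string }[]; isError: true };

// Hypothetical per-field issue shape; the real one follows Zod's issue format.
type FieldIssue = { path: string; message: string };

function validationErrorResult(issues: FieldIssue[]): ToolResult {
  const envelope: ErrorEnvelope = {
    error: {
      code: "VALIDATION_ERROR",
      message: "Invalid tool input", // illustrative wording
      details: { issues },
    },
  };
  // Assumption: the envelope travels as JSON inside a text content item.
  return { content: [{ type: "text", text: JSON.stringify(envelope) }], isError: true };
}
```

A client that sees isError: true can parse the text payload and branch on error.code without treating the failure as a protocol-level error.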