data-api: follow auth context via /.well-known/entire-api.json #1377

Merged
toothbrush merged 15 commits into main from cor-389-data-api-context-aware on Jun 9, 2026

@toothbrush toothbrush commented Jun 5, 2026

https://entire.io/gh/entireio/cli/trails/527

activity / search / trail / dispatch now pick the right login automatically when you point only at a data-API host:

ENTIRE_API_BASE_URL=https://partial.to entire activity   # no ENTIRE_AUTH_BASE_URL needed

The CLI reads the host's /.well-known/entire-api.json to learn which login servers it trusts (and which audience to exchange for), then selects the matching auth context — same cluster semantics git clone already uses (active-wins-if-eligible → sole → explicit choice). This fixes the cross-core case: active context is a prod entire.io login, but the command targets partial.to.

Falls back to today's static token resolution when the host doesn't advertise discovery (404 / unreachable / 503), so nothing breaks before the server side ships.

Server side: entirehq/entire.io#2277.

Changes

  • clusterdiscovery: generalized to serve both entire-cluster.json and entire-api.json (shared selectContext); DiscoverAPI / ResolveContextForAPI.
  • auth: NewRefreshingResourceProvider (per-context, audience-aware RFC 8693 exchange) + ResolveDataAPIToken (discovery + fallback). Wired into NewAuthenticatedAPIClient, dispatch lookupResourceToken, search resolveSearchToken.
  • Doc: docs/architecture/upstream-host-resolution.md.

Note: audience = the data host origin (https://entire.io / https://partial.to), not an opaque string — confirmed against entire.io's ENTIRE_CORE_JWT_AUDIENCE and entiredb's api-access exchange.
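
To make "audience = origin" concrete, here is a small sketch that reduces a data-API URL to its origin (scheme plus host). The helper name originOf and the request path are hypothetical; how the CLI actually computes the value is not shown in this PR description.

```go
package main

import (
	"fmt"
	"net/url"
)

// originOf returns scheme://host[:port] for an absolute URL, dropping any
// path, query, or fragment. Hypothetical helper for illustration.
func originOf(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	if u.Scheme == "" || u.Host == "" {
		return "", fmt.Errorf("not an absolute URL: %q", raw)
	}
	return u.Scheme + "://" + u.Host, nil
}

func main() {
	aud, err := originOf("https://partial.to/api/v1/activity")
	fmt.Println(aud, err) // https://partial.to <nil>
}
```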

Known follow-ups (not addressed here)

Two non-refresh edge cases remain in this slice — flagged so they aren't read as "done":

  1. Discovery-unavailable fallback doesn't refresh and is multi-core-naive. When a host doesn't advertise /.well-known/entire-api.json, ResolveDataAPIToken drops to the singleton TokenForResource (issuer pinned to ENTIRE_AUTH_BASE_URL). If the active context lives on a different core, the exchange targets the wrong core and won't refresh, even with a usable refresh token. Only hit on un-rolled-out deployments ("never worse than before"); the long-term fix is to route the fallback through the active context's refreshing provider.
  2. Legacy-bridge asymmetry. ResolveDataAPIToken calls MigrateLegacyLoginContext; ResolveControlPlaneTarget does not — so a legacy-only login gets bridged+refreshed on a data command but falls to the non-refreshing static path on a control-plane command. Practical impact ≈ nil (legacy logins predate offline_access), but the two entry points should converge.

RepoScopedToken (still a direct, non-refreshing STS exchange) and the dead resolveAuthHostToken are tracked separately in COR-395.

🤖 Generated with Claude Code


Note

High Risk
Changes authentication for all data-API CLI entry points (token exchange, context selection, and discovery fallback), which are security-critical and affect multi-core staging vs prod workflows.

Overview
Data API auth now follows the target host, not only the active context and ENTIRE_AUTH_BASE_URL. Commands that dial ENTIRE_API_BASE_URL (activity, search, trail, dispatch, and NewAuthenticatedAPIClient) call auth.ResolveDataAPIToken instead of TokenForResource on an origin-only URL.

That path fetches /.well-known/entire-api.json, selects a local context with the same rules as git cluster discovery (active-if-eligible → sole eligible → explicit choice), and mints a bearer via NewRefreshingResourceProvider (per-context c.CoreURL, RFC 8693 exchange for the document’s audience). If discovery is missing or unusable (ErrDiscoveryUnavailable), behavior falls back to the old singleton TokenForResource path; real selection failures still surface to the user.

clusterdiscovery adds DiscoverAPI / ResolveContextForAPI, shares selectContext and a hardened fetchWellKnownJSON (HTTPS-only, no redirects) with cluster discovery, and updates login hints. refresh.go centralizes newContextTokenManager and reauthError so ErrNotLoggedIn still unwraps on the discovery path. Architecture docs mark the web/data API resolution slice as done.

Reviewed by Cursor Bugbot for commit f883714.

activity/search/trail/dispatch now discover which login server the data
API host trusts (and which audience to exchange for) from its
/.well-known/entire-api.json, then pick the matching auth context and
exchange that context's token — the same cluster semantics git already
uses. So `ENTIRE_API_BASE_URL=https://partial.to entire activity`
authenticates as the partial.to login without also setting
ENTIRE_AUTH_BASE_URL.

Falls back to today's static token resolution when the host doesn't
advertise discovery (404/unreachable/503), so nothing breaks pre-deploy.

- clusterdiscovery: generalized to serve both entire-cluster.json and
  entire-api.json (shared selectContext); DiscoverAPI/ResolveContextForAPI
- auth: NewRefreshingResourceProvider (per-context, audience-aware
  exchange) + ResolveDataAPIToken (discovery + fallback)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 42efb4b17beb
Copilot AI review requested due to automatic review settings June 5, 2026 02:47
Comment thread cmd/entire/cli/auth/refresh.go Outdated

Copilot AI left a comment

Pull request overview

This PR adds data-API “auth context discovery” so commands that target a data host (activity/search/trail/dispatch + the authenticated API client) can automatically pick the correct login context by fetching /.well-known/entire-api.json from the data host, then exchanging for the advertised audience. It preserves existing behavior by falling back to the current static token resolution when discovery is unavailable.

Changes:

  • Generalize cluster discovery selection logic to also support API-host discovery (/.well-known/entire-api.json) and reuse the same context-selection semantics.
  • Introduce data-API token resolution (auth.ResolveDataAPIToken) and a per-context, audience-aware token exchange provider (auth.NewRefreshingResourceProvider), wiring it into search/dispatch and the authenticated API client.
  • Update architecture documentation to reflect the new discovery mechanism and audience semantics.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| internal/entireclient/clusterdiscovery/resolve.go | Generalizes context selection messaging to work for both clusters and API hosts. |
| internal/entireclient/clusterdiscovery/discovery.go | Extracts shared well-known JSON fetch helper and refactors cluster discovery to use it. |
| internal/entireclient/clusterdiscovery/api_discovery.go | Adds API-host discovery parsing and context resolution using trusted_issuers + audience. |
| internal/entireclient/clusterdiscovery/api_discovery_test.go | Adds tests for API discovery behavior, error folding, and context selection semantics. |
| docs/architecture/upstream-host-resolution.md | Documents the new /.well-known/entire-api.json flow and audience exchange behavior. |
| cmd/entire/cli/search_cmd.go | Switches search token resolution to the new discovery-aware resolver. |
| cmd/entire/cli/dispatch/mode_local.go | Switches dispatch token lookup wiring to use the discovery-aware resolver. |
| cmd/entire/cli/auth/refresh.go | Adds NewRefreshingResourceProvider to exchange a context’s login JWT for a resource audience. |
| cmd/entire/cli/auth/data_api.go | Implements ResolveDataAPIToken (discovery + selection + exchange + fallback). |
| cmd/entire/cli/auth/data_api_test.go | Adds tests covering fallback behavior, surfacing selection errors, and exchange parameters. |
| cmd/entire/cli/api_client.go | Updates the authenticated API client to use ResolveDataAPIToken instead of static resolution. |

Comment thread internal/entireclient/clusterdiscovery/discovery.go Outdated
Comment thread cmd/entire/cli/auth/refresh.go Outdated
toothbrush and others added 3 commits June 5, 2026 11:47
NewRefreshingLoginProvider and NewRefreshingResourceProvider shared ~30
near-identical lines (validation, tokenmanager.New, the reauth error
switch). Extract newContextTokenManager + contextReauthError; the two
providers now differ only in Refresh() vs Token(req) and the residual
error wording.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 6f9136693ab0
The "where do I log in" hints now just say `entire login` (plus
`entire auth use` to switch between existing logins) instead of
`ENTIRE_AUTH_BASE_URL=<url> entire login`. The env-var override stays a
power-user mechanism; fully sunsetting it (+ `entire login --server`) is
tracked in COR-393.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 93d30f60bf97
…very redirects

Two reviewer findings:

- ErrNotLoggedIn lost after discovery (cursor + Copilot): contextReauthError
  returned a plain string, so callers that branch on errors.Is(err,
  ErrNotLoggedIn) (NewAuthenticatedAPIClient/search/dispatch) fell through to
  their generic error — a regression vs the pre-discovery TokenForResource
  path. Wrap the sentinel via a reauthError type that keeps the friendly
  context-named message while unwrapping to the tokenmanager sentinel.

- Redirect-following in fetchWellKnownJSON (Copilot): a trust-root fetch must
  not follow a 3xx to another origin/plaintext. Refuse redirects on a
  shallow-copied client (so the caller's redirect policy is untouched). Low
  real exploitability — the token is never sent to the redirect target — but
  cheap hardening that covers the cluster path too.

Tests: provider error unwraps to ErrNotLoggedIn; cross-origin redirect (to a
server serving a valid doc) is refused rather than followed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: d24466d6862b
@toothbrush commented:

@cursor review

@cursor cursor Bot left a comment

✅ Bugbot reviewed your changes and found no new issues!

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

Comment thread internal/entireclient/clusterdiscovery/discovery.go
Comment thread internal/entireclient/clusterdiscovery/discovery.go Outdated
Comment thread cmd/entire/cli/auth/refresh.go
Comment thread cmd/entire/cli/auth/refresh.go
toothbrush and others added 7 commits June 6, 2026 16:03
The CLI never fetches JWKS — that's a server-side verification concern —
so modelling the field only created a name/shape coupling to the server.
entire.io#2281 renames it to jwks_uris (plural); rather than chase that,
drop the field entirely. Go ignores unknown JSON fields on decode, so the
server can evolve it freely. Test body now sends jwks_uris to prove we
tolerate it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: cb5ddec166d9
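
The tolerance described in the commit above is standard encoding/json behaviour: fields with no matching struct field are skipped unless a Decoder has DisallowUnknownFields set. A short sketch, with a hypothetical apiDiscoveryDoc type that models only trusted_issuers and a made-up document body:

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// apiDiscoveryDoc models only what the client needs. Hypothetical type.
type apiDiscoveryDoc struct {
	TrustedIssuers []string `json:"trusted_issuers"`
}

// parseDiscoveryDoc decodes the document and requires at least one issuer.
func parseDiscoveryDoc(body []byte) (apiDiscoveryDoc, error) {
	var doc apiDiscoveryDoc
	if err := json.Unmarshal(body, &doc); err != nil {
		return apiDiscoveryDoc{}, err
	}
	if len(doc.TrustedIssuers) == 0 {
		return apiDiscoveryDoc{}, errors.New("discovery document has no trusted_issuers")
	}
	return doc, nil
}

func main() {
	// jwks_uris is not modelled, so it is ignored rather than rejected.
	body := []byte(`{
		"trusted_issuers": ["https://login.example"],
		"jwks_uris": ["https://login.example/jwks.json"]
	}`)
	doc, err := parseDiscoveryDoc(body)
	fmt.Println(doc.TrustedIssuers, err) // [https://login.example] <nil>
}
```
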
…rmetic

Audit of outbound API calls found `newRecapClient` still resolving its
bearer via static `auth.TokenForResource` against the data host
(`api.BaseURL()`), bypassing the discovery + context-selection path the
other data-API commands use. Switch it to `auth.ResolveDataAPIToken` so
recap follows the active auth context like activity/search/dispatch.

The audit also surfaced that the activity unit tests were no longer
hermetic: now that activity/recap/search go through ResolveDataAPIToken,
their resolution does a live `/.well-known/entire-api.json` fetch against
the configured data host — which bypasses SetManagerForTest and hit the
real entire.io once #2277 deployed. Add an explicit discovery seam
(auth.SetResolveContextForAPIForTest / DiscoveryUnavailableForTest) and
use it in the two runActivity tests so they exercise the static fallback
through the singleton test manager with no network.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 3d80b3e46e06
status/logout resolved the active context's bearer with a raw keyring read
(LoginTokenForContext), so an expired-but-refreshable session reported
"re-login" — the exact false negative COR-389 fixed for control-plane
commands, leaving `entire activity` (silently refreshes) inconsistent with
`entire auth status` (told you to re-login) for the same token.

resolveStatusTarget now resolves the active context through a refreshing
provider (auth.RefreshedLoginToken), falling back to the stored token when
refresh fails so a genuinely dead session (ErrReauthRequired → expired
token → /me 401) still surfaces "no longer valid". logout benefits too: the
refreshed bearer authenticates the revoke call instead of failing on an
expired token.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 2806fe7e05e7
Discovery was re-fetched on every activity/search/trail/dispatch/recap
invocation. Cache it the same way the git path caches entire-cluster.json:
reuse the generic cache primitives (modifyCacheFile/loadCacheFile/...) for a
new api_discovery.json sibling, 24h TTL, with stale-fallback when a re-fetch
fails. The entry carries `audience` alongside issuer/trusted_issuers — the one
field that distinguishes a resource API from a git cluster.

- discovery: APIDiscoveryCache / APIDiscoveryEntry (mirrors ClusterCoresCache)
- clusterdiscovery: resolveAPIDoc (mirrors resolveClusterCores); ResolveContextForAPI
  gains a cacheDir param and goes cache-then-/.well-known
- auth: ResolveDataAPIToken passes discovery.DefaultCacheDir(); test seam +
  DiscoveryUnavailableForTest grow the cacheDir param

Behavioural upside: a transient discovery outage now reuses last-known-good
trust roots instead of dropping to the static fallback — aligning with COR-393's
mandatory-discovery direction. Cold failure (no cache entry) still falls back.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 99ecee35ca11
aud == base URI holds on both data-API environments (entire.io / partial.to),
and the token manager already defaults the RFC 8693 audience to the resource
origin it's dialing. So the CLI never needs the advertised `audience`: it
derives it from the host. That removes the one field distinguishing the API
discovery cache from the git cluster's cores cache.

Fold accordingly:
- discovery: api_discovery.json reuses ClusterCoresCache via LoadAPICores/
  ModifyAPICores (deleted the bespoke APIDiscoveryCache/Entry); cores cache now
  serves both clusters and data APIs, two files.
- clusterdiscovery: APIResponse slims to {trusted_issuers}; DiscoverAPI requires
  only trusted_issuers; resolveAPITrustedIssuers mirrors resolveClusterCores;
  ResolveContextForAPI returns just the context (no doc).
- auth: NewRefreshingResourceProvider drops the audience param (token manager
  defaults aud to the resource origin); ResolveDataAPIToken + seams updated.

Trades away "server changes audience without a CLI release", which is fine:
aud == base URI is a hard requirement in both envs. Doc updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 75020e07c077
resolveAuthHostToken had no production callers (tests only) — drop it and
its dedicated tests. Keep the shared token-manager test helpers
(authMemStore / saveCoreToken / newResolveTestManager) that the data-API
resolution tests in activity_cmd_test.go depend on.

Rewrite the RepoScopedToken doc comment: it still does a direct,
non-refreshing STS exchange, and its old "device flow stores no refresh
token" rationale is now stale (login requests offline_access and persists
a refresh token). Mark it as slated for the COR-395 rework.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: cb812912363c
@toothbrush toothbrush marked this pull request as ready for review June 6, 2026 08:39
@toothbrush toothbrush requested a review from a team as a code owner June 6, 2026 08:39
@toothbrush toothbrush force-pushed the cor-389-data-api-context-aware branch from 7f72def to d471460 June 9, 2026 02:27
@toothbrush toothbrush force-pushed the cor-389-data-api-context-aware branch from 898bc02 to 16df5b3 June 9, 2026 02:32
d471460 squashed the advertised login-server list and `entire auth use`
hint out of renderLoginHint (until the multi-login UX is ready) but left
two tests asserting the old richer message, breaking test-core CI:
TestRenderLoginHint and TestResolve_NoEligibleContextReturnsLoginHint.

Update both to assert the new message (and that the URLs / auth-use hint
are intentionally absent), and refresh the stale RenderLoginHint doc
comment that still claimed "one indented URL per line".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: c088e375b2f5
@toothbrush toothbrush enabled auto-merge June 9, 2026 03:59
@toothbrush toothbrush merged commit 7ef77ea into main Jun 9, 2026
9 checks passed
@toothbrush toothbrush deleted the cor-389-data-api-context-aware branch June 9, 2026 04:00