data-api: follow auth context via /.well-known/entire-api.json #1377
activity/search/trail/dispatch now discover which login server the data API host trusts (and which audience to exchange for) from its /.well-known/entire-api.json, then pick the matching auth context and exchange that context's token: the same cluster semantics git already uses. So `ENTIRE_API_BASE_URL=https://partial.to entire activity` authenticates as the partial.to login without also setting ENTIRE_AUTH_BASE_URL.

Falls back to today's static token resolution when the host doesn't advertise discovery (404/unreachable/503), so nothing breaks pre-deploy.

- clusterdiscovery: generalized to serve both entire-cluster.json and entire-api.json (shared selectContext); DiscoverAPI/ResolveContextForAPI
- auth: NewRefreshingResourceProvider (per-context, audience-aware exchange) + ResolveDataAPIToken (discovery + fallback)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 42efb4b17beb
Pull request overview
This PR adds data-API “auth context discovery” so commands that target a data host (activity/search/trail/dispatch + the authenticated API client) can automatically pick the correct login context by fetching /.well-known/entire-api.json from the data host, then exchanging for the advertised audience. It preserves existing behavior by falling back to the current static token resolution when discovery is unavailable.
Changes:
- Generalize cluster discovery selection logic to also support API-host discovery (/.well-known/entire-api.json) and reuse the same context-selection semantics.
- Introduce data-API token resolution (auth.ResolveDataAPIToken) and a per-context, audience-aware token exchange provider (auth.NewRefreshingResourceProvider), wiring it into search/dispatch and the authenticated API client.
- Update architecture documentation to reflect the new discovery mechanism and audience semantics.
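The discovery-then-fallback behaviour summarized above could be sketched roughly as follows. This is a minimal illustration, not the PR's code: `resolveToken`, its function-typed parameters, and `errNoEligibleContext` are hypothetical names, and `ErrDiscoveryUnavailable` is redefined locally as a stand-in for the real sentinel.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrDiscoveryUnavailable is a local stand-in for the sentinel of the same
// name; the real definition lives in the CLI and may differ.
var ErrDiscoveryUnavailable = errors.New("discovery unavailable")

// errNoEligibleContext is a hypothetical selection failure.
var errNoEligibleContext = errors.New("no eligible login context")

// resolveToken tries the discovery path first. It falls back to the static
// path only when discovery itself is unavailable; any other error (for
// example a selection failure) is returned to the caller unchanged.
func resolveToken(viaDiscovery, static func() (string, error)) (string, error) {
	tok, err := viaDiscovery()
	if errors.Is(err, ErrDiscoveryUnavailable) {
		return static()
	}
	return tok, err
}

func main() {
	static := func() (string, error) { return "static-token", nil }

	// Host does not advertise discovery: the static path is used.
	unavailable := func() (string, error) { return "", ErrDiscoveryUnavailable }
	tok, err := resolveToken(unavailable, static)
	fmt.Println(tok, err)

	// Discovery worked but no context matched: the error surfaces.
	noContext := func() (string, error) { return "", errNoEligibleContext }
	tok, err = resolveToken(noContext, static)
	fmt.Println(tok, err)
}
```

Keeping the fallback keyed on one sentinel is what lets a genuine "no matching login" error reach the user instead of being masked by a static token.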
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| internal/entireclient/clusterdiscovery/resolve.go | Generalizes context selection messaging to work for both clusters and API hosts. |
| internal/entireclient/clusterdiscovery/discovery.go | Extracts shared well-known JSON fetch helper and refactors cluster discovery to use it. |
| internal/entireclient/clusterdiscovery/api_discovery.go | Adds API-host discovery parsing and context resolution using trusted_issuers + audience. |
| internal/entireclient/clusterdiscovery/api_discovery_test.go | Adds tests for API discovery behavior, error folding, and context selection semantics. |
| docs/architecture/upstream-host-resolution.md | Documents the new /.well-known/entire-api.json flow and audience exchange behavior. |
| cmd/entire/cli/search_cmd.go | Switches search token resolution to the new discovery-aware resolver. |
| cmd/entire/cli/dispatch/mode_local.go | Switches dispatch token lookup wiring to use the discovery-aware resolver. |
| cmd/entire/cli/auth/refresh.go | Adds NewRefreshingResourceProvider to exchange a context’s login JWT for a resource audience. |
| cmd/entire/cli/auth/data_api.go | Implements ResolveDataAPIToken (discovery + selection + exchange + fallback). |
| cmd/entire/cli/auth/data_api_test.go | Adds tests covering fallback behavior, surfacing selection errors, and exchange parameters. |
| cmd/entire/cli/api_client.go | Updates the authenticated API client to use ResolveDataAPIToken instead of static resolution. |
NewRefreshingLoginProvider and NewRefreshingResourceProvider shared ~30 near-identical lines (validation, tokenmanager.New, the reauth error switch). Extract newContextTokenManager + contextReauthError; the two providers now differ only in Refresh() vs Token(req) and the residual error wording.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 6f9136693ab0
The "where do I log in" hints now just say `entire login` (plus `entire auth use` to switch between existing logins) instead of `ENTIRE_AUTH_BASE_URL=<url> entire login`. The env-var override stays a power-user mechanism; fully sunsetting it (+ `entire login --server`) is tracked in COR-393.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 93d30f60bf97
…very redirects

Two reviewer findings:

- ErrNotLoggedIn lost after discovery (cursor + Copilot): contextReauthError returned a plain string, so callers that branch on errors.Is(err, ErrNotLoggedIn) (NewAuthenticatedAPIClient/search/dispatch) fell through to their generic error, a regression vs the pre-discovery TokenForResource path. Wrap the sentinel via a reauthError type that keeps the friendly context-named message while unwrapping to the tokenmanager sentinel.
- Redirect-following in fetchWellKnownJSON (Copilot): a trust-root fetch must not follow a 3xx to another origin/plaintext. Refuse redirects on a shallow-copied client (so the caller's redirect policy is untouched). Low real exploitability (the token is never sent to the redirect target), but cheap hardening that covers the cluster path too.

Tests: provider error unwraps to ErrNotLoggedIn; cross-origin redirect (to a server serving a valid doc) is refused rather than followed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: d24466d6862b
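One way to refuse redirects on a shallow-copied client, as the second finding describes, is sketched below. `withoutRedirects`, `fetchStatus`, and `demo` are hypothetical helpers, and returning `http.ErrUseLastResponse` from `CheckRedirect` is one standard-library option; the actual fetchWellKnownJSON may do this differently.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// withoutRedirects returns a shallow copy of base that refuses to follow
// redirects. base itself, including its redirect policy, is left untouched.
func withoutRedirects(base *http.Client) *http.Client {
	clone := *base
	clone.CheckRedirect = func(*http.Request, []*http.Request) error {
		return http.ErrUseLastResponse // hand back the 3xx instead of following it
	}
	return &clone
}

// fetchStatus issues a GET and reports the status code the client ended on.
func fetchStatus(client *http.Client, url string) (int, error) {
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// demo starts one server that serves a plausible document and a second
// server that redirects to it, then fetches the redirector with a default
// client and with the hardened copy.
func demo() (followed, refused int, err error) {
	target := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, `{"trusted_issuers":["https://example.test"]}`)
	}))
	defer target.Close()
	redirector := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, target.URL, http.StatusFound)
	}))
	defer redirector.Close()

	base := &http.Client{}
	if followed, err = fetchStatus(base, redirector.URL); err != nil {
		return 0, 0, err
	}
	refused, err = fetchStatus(withoutRedirects(base), redirector.URL)
	return followed, refused, err
}

func main() {
	followed, refused, err := demo()
	fmt.Println(followed, refused, err)
}
```

With this policy the caller sees the 3xx status itself, so a discovery fetch can treat any non-200 response as "discovery unavailable" rather than trusting a document served from another origin.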
@cursor review
✅ Bugbot reviewed your changes and found no new issues!
Reviewed by Cursor Bugbot for commit f883714.
The CLI never fetches JWKS (that's a server-side verification concern), so modelling the field only created a name/shape coupling to the server. entire.io#2281 renames it to jwks_uris (plural); rather than chase that, drop the field entirely. Go ignores unknown JSON fields on decode, so the server can evolve it freely. Test body now sends jwks_uris to prove we tolerate it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: cb5ddec166d9
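The "Go ignores unknown JSON fields" point can be seen in a small sketch. The `apiDoc` struct and `parseAPIDoc` function are hypothetical, trimmed stand-ins for the real discovery types, and the URLs in the sample body are illustrative only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// apiDoc is a hypothetical, trimmed model of the discovery document. It
// deliberately has no JWKS field.
type apiDoc struct {
	TrustedIssuers []string `json:"trusted_issuers"`
}

// parseAPIDoc decodes a discovery body. json.Unmarshal silently skips keys
// that have no matching struct field, so extra server-side fields are fine.
func parseAPIDoc(body []byte) (apiDoc, error) {
	var doc apiDoc
	if err := json.Unmarshal(body, &doc); err != nil {
		return apiDoc{}, err
	}
	if len(doc.TrustedIssuers) == 0 {
		return apiDoc{}, fmt.Errorf("discovery document lists no trusted_issuers")
	}
	return doc, nil
}

func main() {
	// The jwks_uris key is not modelled above and is ignored on decode.
	body := []byte(`{
		"trusted_issuers": ["https://partial.to"],
		"jwks_uris": ["https://partial.to/jwks.json"]
	}`)
	doc, err := parseAPIDoc(body)
	fmt.Println(doc.TrustedIssuers, err)
}
```

Note that this tolerance is the default only; a decoder configured with `DisallowUnknownFields` would reject the same body.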
…rmetic

Audit of outbound API calls found `newRecapClient` still resolving its bearer via static `auth.TokenForResource` against the data host (`api.BaseURL()`), bypassing the discovery + context-selection path the other data-API commands use. Switch it to `auth.ResolveDataAPIToken` so recap follows the active auth context like activity/search/dispatch.

The audit also surfaced that the activity unit tests were no longer hermetic: now that activity/recap/search go through ResolveDataAPIToken, their resolution does a live `/.well-known/entire-api.json` fetch against the configured data host, which bypasses SetManagerForTest and hit the real entire.io once #2277 deployed. Add an explicit discovery seam (auth.SetResolveContextForAPIForTest / DiscoveryUnavailableForTest) and use it in the two runActivity tests so they exercise the static fallback through the singleton test manager with no network.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 3d80b3e46e06
status/logout resolved the active context's bearer with a raw keyring read (LoginTokenForContext), so an expired-but-refreshable session reported "re-login". That is the exact false negative COR-389 fixed for control-plane commands, and it left `entire activity` (silently refreshes) inconsistent with `entire auth status` (told you to re-login) for the same token.

resolveStatusTarget now resolves the active context through a refreshing provider (auth.RefreshedLoginToken), falling back to the stored token when refresh fails so a genuinely dead session (ErrReauthRequired → expired token → /me 401) still surfaces "no longer valid". logout benefits too: the refreshed bearer authenticates the revoke call instead of failing on an expired token.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 2806fe7e05e7
Discovery was re-fetched on every activity/search/trail/dispatch/recap invocation. Cache it the same way the git path caches entire-cluster.json: reuse the generic cache primitives (modifyCacheFile/loadCacheFile/...) for a new api_discovery.json sibling, 24h TTL, with stale-fallback when a re-fetch fails. The entry carries `audience` alongside issuer/trusted_issuers, the one field that distinguishes a resource API from a git cluster.

- discovery: APIDiscoveryCache / APIDiscoveryEntry (mirrors ClusterCoresCache)
- clusterdiscovery: resolveAPIDoc (mirrors resolveClusterCores); ResolveContextForAPI gains a cacheDir param and goes cache-then-/.well-known
- auth: ResolveDataAPIToken passes discovery.DefaultCacheDir(); test seam + DiscoveryUnavailableForTest grow the cacheDir param

Behavioural upside: a transient discovery outage now reuses last-known-good trust roots instead of dropping to the static fallback, aligning with COR-393's mandatory-discovery direction. Cold failure (no cache entry) still falls back.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 99ecee35ca11
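The cache-then-fetch rule with stale fallback might look roughly like this. It is an in-memory sketch under stated assumptions: `cacheEntry`, `resolveTrustedIssuers`, and `errUnreachable` are hypothetical names, and the real code persists entries to api_discovery.json through the shared cache primitives rather than passing a pointer around.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ttl matches the 24h lifetime described in the commit message.
const ttl = 24 * time.Hour

// cacheEntry is a hypothetical in-memory stand-in for one cached document.
type cacheEntry struct {
	TrustedIssuers []string
	FetchedAt      time.Time
}

// errUnreachable is a hypothetical fetch failure used for illustration.
var errUnreachable = errors.New("discovery host unreachable")

// resolveTrustedIssuers returns a fresh cache hit without fetching, refreshes
// an expired or missing entry, and reuses a stale entry when the refresh
// fails. Only a cold failure (no entry at all) returns the fetch error.
func resolveTrustedIssuers(cached *cacheEntry, now time.Time, fetch func() ([]string, error)) (*cacheEntry, error) {
	if cached != nil && now.Sub(cached.FetchedAt) < ttl {
		return cached, nil // still fresh
	}
	issuers, err := fetch()
	if err == nil {
		return &cacheEntry{TrustedIssuers: issuers, FetchedAt: now}, nil
	}
	if cached != nil {
		return cached, nil // stale fallback: last-known-good trust roots
	}
	return nil, err // cold failure: caller drops to the static path
}

func main() {
	now := time.Now()
	stale := &cacheEntry{
		TrustedIssuers: []string{"https://partial.to"},
		FetchedAt:      now.Add(-48 * time.Hour),
	}
	failing := func() ([]string, error) { return nil, errUnreachable }

	got, err := resolveTrustedIssuers(stale, now, failing)
	fmt.Println(got.TrustedIssuers, err)

	_, err = resolveTrustedIssuers(nil, now, failing)
	fmt.Println(err)
}
```

Passing `now` and `fetch` in as parameters keeps the sketch testable without a clock or network, which is in the same spirit as the discovery test seam added earlier in this PR.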
aud == base URI holds on both data-API environments (entire.io / partial.to),
and the token manager already defaults the RFC 8693 audience to the resource
origin it's dialing. So the CLI never needs the advertised `audience`: it
derives it from the host. That removes the one field distinguishing the API
discovery cache from the git cluster's cores cache.
Fold accordingly:
- discovery: api_discovery.json reuses ClusterCoresCache via LoadAPICores/
ModifyAPICores (deleted the bespoke APIDiscoveryCache/Entry); cores cache now
serves both clusters and data APIs, two files.
- clusterdiscovery: APIResponse slims to {trusted_issuers}; DiscoverAPI requires
only trusted_issuers; resolveAPITrustedIssuers mirrors resolveClusterCores;
ResolveContextForAPI returns just the context (no doc).
- auth: NewRefreshingResourceProvider drops the audience param (token manager
defaults aud to the resource origin); ResolveDataAPIToken + seams updated.
Trades away "server changes audience without a CLI release", which is fine: aud == base
URI is a hard requirement in both envs. Doc updated.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 75020e07c077
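Deriving the audience from the host being dialed, rather than from an advertised field, amounts to taking the URL's origin. The sketch below assumes a hypothetical helper `resourceOrigin`; it does not show how the token manager computes its default, and the request path in `main` is made up.

```go
package main

import (
	"fmt"
	"net/url"
)

// resourceOrigin reduces an absolute URL to scheme://host, the value this
// commit treats as the token-exchange audience (aud == base URI).
func resourceOrigin(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	if u.Scheme == "" || u.Host == "" {
		return "", fmt.Errorf("not an absolute URL: %q", rawURL)
	}
	return u.Scheme + "://" + u.Host, nil
}

func main() {
	// Path and query are dropped; only the origin remains.
	aud, err := resourceOrigin("https://partial.to/api/v1/activity?limit=10")
	fmt.Println(aud, err)
}
```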
resolveAuthHostToken had no production callers (tests only), so drop it and its dedicated tests. Keep the shared token-manager test helpers (authMemStore / saveCoreToken / newResolveTestManager) that the data-API resolution tests in activity_cmd_test.go depend on.

Rewrite the RepoScopedToken doc comment: it still does a direct, non-refreshing STS exchange, and its old "device flow stores no refresh token" rationale is now stale (login requests offline_access and persists a refresh token). Mark it as slated for the COR-395 rework.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: cb812912363c
Entire-Checkpoint: ba57be5b7bdb
7f72def to d471460
898bc02 to 16df5b3
d471460 squashed the advertised login-server list and `entire auth use` hint out of renderLoginHint (until the multi-login UX is ready) but left two tests asserting the old richer message, breaking test-core CI: TestRenderLoginHint and TestResolve_NoEligibleContextReturnsLoginHint. Update both to assert the new message (and that the URLs / auth-use hint are intentionally absent), and refresh the stale RenderLoginHint doc comment that still claimed "one indented URL per line".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: c088e375b2f5
https://entire.io/gh/entireio/cli/trails/527
`activity`/`search`/`trail`/`dispatch` now pick the right login automatically when you point only at a data-API host, for example `ENTIRE_API_BASE_URL=https://partial.to entire activity`.

The CLI reads the host's `/.well-known/entire-api.json` to learn which login servers it trusts (and which audience to exchange for), then selects the matching auth context, with the same cluster semantics `git clone` already uses (active-wins-if-eligible → sole → explicit choice). This fixes the cross-core case: active context is a prod `entire.io` login, but the command targets `partial.to`.

Falls back to today's static token resolution when the host doesn't advertise discovery (404 / unreachable / 503), so nothing breaks before the server side ships.
Server side: entirehq/entire.io#2277.
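The selection order (active-wins-if-eligible → sole → explicit choice) could be sketched as below. This is an assumption-laden illustration, not the shared selectContext: the `Context` struct, the two error values, and exact string matching of `CoreURL` against trusted issuers are all hypothetical simplifications.

```go
package main

import (
	"errors"
	"fmt"
)

// Context is a hypothetical stand-in for a stored login context.
type Context struct {
	Name    string
	CoreURL string // login server this context authenticated against
	Active  bool
}

var (
	errNoEligible = errors.New("no eligible login context; run `entire login`")
	errAmbiguous  = errors.New("several eligible login contexts; choose one explicitly")
)

// selectContext keeps only contexts whose login server the host trusts, then
// applies: active context wins if eligible, else a sole eligible context,
// else the caller must choose.
func selectContext(contexts []Context, trustedIssuers []string) (Context, error) {
	trusted := make(map[string]bool, len(trustedIssuers))
	for _, iss := range trustedIssuers {
		trusted[iss] = true
	}
	var eligible []Context
	for _, c := range contexts {
		if trusted[c.CoreURL] {
			eligible = append(eligible, c)
		}
	}
	for _, c := range eligible {
		if c.Active {
			return c, nil
		}
	}
	switch len(eligible) {
	case 0:
		return Context{}, errNoEligible
	case 1:
		return eligible[0], nil
	default:
		return Context{}, errAmbiguous
	}
}

func main() {
	// The cross-core case: the active login is prod, the target trusts staging.
	contexts := []Context{
		{Name: "prod", CoreURL: "https://entire.io", Active: true},
		{Name: "staging", CoreURL: "https://partial.to"},
	}
	c, err := selectContext(contexts, []string{"https://partial.to"})
	fmt.Println(c.Name, err)
}
```

In the cross-core example the active prod context is not eligible, so the sole eligible staging context is chosen without any extra environment variable.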
Changes
- `clusterdiscovery`: generalized to serve both `entire-cluster.json` and `entire-api.json` (shared `selectContext`); `DiscoverAPI` / `ResolveContextForAPI`.
- `auth`: `NewRefreshingResourceProvider` (per-context, audience-aware RFC 8693 exchange) + `ResolveDataAPIToken` (discovery + fallback). Wired into `NewAuthenticatedAPIClient`, dispatch `lookupResourceToken`, search `resolveSearchToken`.
- `docs/architecture/upstream-host-resolution.md`.

Note: audience = the data host origin (`https://entire.io` / `https://partial.to`), not an opaque string; confirmed against entire.io's `ENTIRE_CORE_JWT_AUDIENCE` and entiredb's api-access exchange.

Known follow-ups (not addressed here)

Two non-refresh edge cases remain in this slice, flagged so they aren't read as "done":

- When the host doesn't advertise `/.well-known/entire-api.json`, `ResolveDataAPIToken` drops to the singleton `TokenForResource` (issuer pinned to `ENTIRE_AUTH_BASE_URL`). If the active context lives on a different core, the exchange targets the wrong core and won't refresh, even with a usable refresh token. Only hit on un-rolled-out deployments ("never worse than before"); the long-term fix is to route the fallback through the active context's refreshing provider.
- `ResolveDataAPIToken` calls `MigrateLegacyLoginContext`; `ResolveControlPlaneTarget` does not, so a legacy-only login gets bridged+refreshed on a data command but falls to the non-refreshing static path on a control-plane command. Practical impact ≈ nil (legacy logins predate `offline_access`), but the two entry points should converge.

`RepoScopedToken` (still a direct, non-refreshing STS exchange) and the dead `resolveAuthHostToken` are tracked separately in COR-395.

🤖 Generated with Claude Code
Note
High Risk
Changes authentication for all data-API CLI entry points (token exchange, context selection, and discovery fallback), which are security-critical and affect multi-core staging vs prod workflows.
Overview
Data API auth now follows the target host, not only the active context and `ENTIRE_AUTH_BASE_URL`. Commands that dial `ENTIRE_API_BASE_URL` (`activity`, `search`, `trail`, `dispatch`, and `NewAuthenticatedAPIClient`) call `auth.ResolveDataAPIToken` instead of `TokenForResource` on an origin-only URL.

That path fetches `/.well-known/entire-api.json`, selects a local context with the same rules as git cluster discovery (active-if-eligible → sole eligible → explicit choice), and mints a bearer via `NewRefreshingResourceProvider` (per-context `c.CoreURL`, RFC 8693 exchange for the document's `audience`). If discovery is missing or unusable (`ErrDiscoveryUnavailable`), behavior falls back to the old singleton `TokenForResource` path; real selection failures still surface to the user.

`clusterdiscovery` adds `DiscoverAPI` / `ResolveContextForAPI`, shares `selectContext` and a hardened `fetchWellKnownJSON` (HTTPS-only, no redirects) with cluster discovery, and updates login hints. `refresh.go` centralizes `newContextTokenManager` and `reauthError` so `ErrNotLoggedIn` still unwraps on the discovery path. Architecture docs mark the web/data API resolution slice as done.

Reviewed by Cursor Bugbot for commit f883714.
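An error type that keeps a friendly message while still unwrapping to a sentinel, as described for `reauthError`, might be shaped like this. The field names, the message wording, the `isNotLoggedIn` helper, and the local `ErrNotLoggedIn` are hypothetical stand-ins, not the contents of refresh.go.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotLoggedIn is a local stand-in for the tokenmanager sentinel.
var ErrNotLoggedIn = errors.New("not logged in")

// reauthError carries a context-specific message for the user and exposes
// the underlying sentinel through Unwrap, so errors.Is keeps working.
type reauthError struct {
	msg   string
	cause error
}

func (e *reauthError) Error() string { return e.msg }
func (e *reauthError) Unwrap() error { return e.cause }

// contextReauthError builds the friendly error; the wording is illustrative.
func contextReauthError(contextName string, cause error) error {
	return &reauthError{
		msg:   fmt.Sprintf("not logged in to context %q; run `entire login`", contextName),
		cause: cause,
	}
}

// isNotLoggedIn mirrors how a caller would branch on the sentinel.
func isNotLoggedIn(err error) bool {
	return errors.Is(err, ErrNotLoggedIn)
}

func main() {
	err := contextReauthError("staging", ErrNotLoggedIn)
	fmt.Println(err)
	fmt.Println(isNotLoggedIn(err))
}
```

A plain `errors.New` or `fmt.Errorf` without `%w` would print the same message but fail the `errors.Is` check, which is the regression the review comments pointed out.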