// SPDX-FileCopyrightText: Copyright 2025 Stacklok, Inc.
// SPDX-License-Identifier: Apache-2.0

// Package dcr is the shared RFC 7591 Dynamic Client Registration client used
// by every consumer in the codebase that needs to register a downstream
// OAuth 2.x client at runtime. The package owns the stateful concerns of the
// flow — credential cache, in-process singleflight deduplication, scope-set
// canonicalisation, token-endpoint auth-method selection (with the RFC 7636 /
// OAuth 2.1 S256 PKCE gate), RFC 7591 §3.2.1 expiry-driven cache invalidation,
// the bearer-token transport with redirect refusal, and panic recovery around
// the registration body. Stateless RFC 7591 wire-shape primitives live in
// pkg/oauthproto.
//
// # Concurrency
//
// The package maintains a process-global singleflight keyed on the tuple
// (issuer, redirectURI, scopesHash) so concurrent ResolveCredentials calls
// across all consumers in a single process coalesce when their cache keys
// match. Consumers that share any of those three values will share a flight
// — the deduplication is a feature for the embedded authserver but means
// callers cannot assume per-call-site flight isolation. See the dcrFlight
// doc comment below for the rationale.
//
// # Current API coupling — sub-issue 4a only
//
// As of issue #5145 sub-issue 4a (the slice that lifted this code out of
// pkg/authserver/runner), the public functions on this package take
// embedded-authserver types — *authserver.OAuth2UpstreamRunConfig and
// *upstream.OAuth2Config (with *authserver.DCRUpstreamConfig reached
// transitively via OAuth2UpstreamRunConfig.DCRConfig) — directly on
// their signatures. This matches the embedded authserver's existing
// internal shapes verbatim and was the cheapest move-only change.
//
// The CLI flow migration in sub-issue 4b will introduce the second
// consumer (pkg/auth/discovery::PerformOAuthFlow) and is the right
// trigger for replacing those parameters with a profile-neutral input
// type — designing the neutral shape now, with only one consumer in
// hand, would be speculative. Until 4b lands, callers outside the
// embedded authserver MUST adapt their inputs to the authserver types
// at the call site, and the "profile-agnostic" framing in this package's
// charter is a target state, not the current state of the API.
//
// See issue #5145 for the design discussion that motivated lifting this out
// of pkg/authserver/runner.
package dcr

import (
	"bytes"
	"context"
	"errors"
	"fmt"
	"log/slog"
	"net/http"
	"net/url"
	"os"
	"regexp"
	"runtime/debug"
	"slices"
	"strings"
	"time"

	"golang.org/x/sync/singleflight"

	"github.com/stacklok/toolhive/pkg/authserver"
	"github.com/stacklok/toolhive/pkg/authserver/storage"
	"github.com/stacklok/toolhive/pkg/authserver/upstream"
	"github.com/stacklok/toolhive/pkg/networking"
	"github.com/stacklok/toolhive/pkg/oauthproto"
)
// dcrFlight coalesces concurrent ResolveCredentials calls that share the
// same Key. Two goroutines hitting the resolver for the same upstream and
// scope set will both miss the cache, so without coalescing they would both
// call RegisterClientDynamically and the loser's registration would become
// orphaned at the upstream IdP — an operator-visible cleanup task and
// possibly a transient startup failure if the upstream rate-limits
// concurrent registrations. Followers wait for the leader's result and
// observe the same Resolution.
//
// Lifetime: process-wide. This intentionally contrasts with the
// CredentialStore the embedded authserver constructs and injects into
// ResolveCredentials, which is per-instance for the memory backend and
// shared across replicas for Redis. The asymmetry is load-bearing: the
// singleflight only deduplicates the in-flight network call, while the
// cache deduplicates the resolution itself across calls. Process-wide
// flight means concurrent EmbeddedAuthServer instances in the same
// process targeting the same upstream still get deduplicated; the
// injected cache decides whether the resolution is fresh enough to
// reuse. A Redis-backed store still wants this in-process gate so a
// single replica does not double-register against itself.
//
// Cross-consumer caveat (matters once issue #5145 sub-issue 4b (#5219)
// lands the CLI flow as the second consumer): because dcrFlight is
// package-global, two consumers that happen to construct identical Keys
// (same issuer, same redirect URI, same scopes hash) will share a single
// in-flight registration even if they semantically want different
// client profiles. The current call sites do not collide — the embedded
// authserver's redirect URI lives on the AS origin, the CLI flow's lives
// on a loopback — but a future consumer that defaults its redirect URI
// into either of those spaces would silently coalesce.
//
// TODO(#5219): the flight key must gain a consumer-identifier component
// when 4b wires the second consumer. Today the colliding-Key risk is
// theoretical; once two profiles share this group it becomes a
// correctness hazard the resolver itself must defend against. Track
// resolution against the 4b PR's design discussion.
var dcrFlight singleflight.Group
// flightKeyOf canonicalises a Key into the singleflight string used by
// dcrFlight. The "\n" separator is safe because newline is not a valid byte
// in any of the three components: URI reference characters in Issuer and
// RedirectURI (RFC 3986 §2), and hex digits in ScopesHash (the form
// storage.ScopesHash always emits). Exposed as a function so tests and
// future inspection helpers can compute the exact key the resolver would
// route through dcrFlight without re-implementing the concatenation.
func flightKeyOf(key Key) string {
	return key.Issuer + "\n" + key.RedirectURI + "\n" + key.ScopesHash
}
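
/*
Illustrative sketch (not part of this package): the separator argument above
can be checked with a small stand-alone program. It compares a naive
concatenation with a newline-joined key for two tuples whose components differ
but whose concatenation is identical. The helper names joinNaive and
joinNewline and the example values are assumptions made for this sketch only.

```go
package main

import (
	"fmt"
	"strings"
)

// joinNaive concatenates the components with no separator, so component
// boundaries are lost.
func joinNaive(parts ...string) string { return strings.Join(parts, "") }

// joinNewline joins the components with "\n". As long as no component can
// contain a newline, distinct tuples always yield distinct keys.
func joinNewline(parts ...string) string { return strings.Join(parts, "\n") }

func main() {
	a := []string{"https://as.example", "/cb", "ab12"}
	b := []string{"https://as.example/", "cb", "ab12"}

	fmt.Println(joinNaive(a...) == joinNaive(b...))     // true: the tuples collide
	fmt.Println(joinNewline(a...) == joinNewline(b...)) // false: the tuples stay distinct
}
```

The same reasoning is why the separator must be a byte that none of the
components can carry.
*/
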
// defaultUpstreamRedirectPath is the redirect path derived from the issuer
// origin when the caller's run-config does not supply an explicit RedirectURI.
// Matches the authserver's public callback route.
const defaultUpstreamRedirectPath = "/oauth/callback"
// authMethodPreference is the preferred order of token_endpoint_auth_methods,
// most preferred first. The resolver intersects this list with the server's
// advertised methods and picks the first match.
//
// Rationale: private_key_jwt is cryptographically strongest (asymmetric, no
// shared secret on the wire). client_secret_basic and client_secret_post are
// equally secure in transit but basic is marginally preferred because the
// credentials do not appear in request-body logs. "none" is the fallback for
// public PKCE clients.
var authMethodPreference = []string{
	"private_key_jwt",
	"client_secret_basic",
	"client_secret_post",
	"none",
}
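
/*
Illustrative sketch (not part of this package): one way to intersect a
preference list with the methods a server advertises and take the first match.
pickAuthMethod and preference are hypothetical names; the sketch deliberately
leaves out the PKCE gate and the default applied when the server advertises no
methods at all, both of which the real selector handles elsewhere.

```go
package main

import (
	"fmt"
	"slices"
)

// preference lists methods from most to least preferred.
var preference = []string{
	"private_key_jwt",
	"client_secret_basic",
	"client_secret_post",
	"none",
}

// pickAuthMethod returns the first entry of preference that also appears in
// supported. The order of supported is irrelevant; only our preference
// order decides the winner.
func pickAuthMethod(supported []string) (string, bool) {
	for _, m := range preference {
		if slices.Contains(supported, m) {
			return m, true
		}
	}
	return "", false
}

func main() {
	m, ok := pickAuthMethod([]string{"client_secret_post", "client_secret_basic"})
	fmt.Println(m, ok) // prints: client_secret_basic true
}
```

Iterating over the preference list rather than the server's list is what makes
the result independent of the order in which the server advertises methods.
*/
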
// Resolution captures the full RFC 7591 + RFC 7592 response for a
// successful Dynamic Client Registration, together with the endpoints the
// upstream advertises so the caller need not re-discover them.
//
// The struct is the unit of storage in CredentialStore and the unit of
// application via ConsumeResolution.
//
// MUST update both converters (resolutionToCredentials and
// credentialsToResolution in store.go) when adding, renaming, or
// removing a field here. The two converters are the seam between this
// dcr-package type and the persisted *storage.DCRCredentials shape; a
// field added here without a paired converter update will silently fail
// to round-trip across an authserver restart, the exact "parallel types
// drift" failure mode .claude/rules/go-style.md warns about. The
// round-trip behaviour is pinned by TestResolutionCredentialsRoundTrip
// in store_test.go.
type Resolution struct {
	// ClientID is the RFC 7591 "client_id" returned by the authorization
	// server.
	ClientID string

	// ClientSecret is the RFC 7591 "client_secret" returned by the
	// authorization server. Empty for public PKCE clients.
	ClientSecret string

	// AuthorizationEndpoint is the discovered (or configured) authorization
	// endpoint for this upstream.
	AuthorizationEndpoint string

	// TokenEndpoint is the discovered (or configured) token endpoint for this
	// upstream.
	TokenEndpoint string

	// RegistrationAccessToken is the RFC 7592 "registration_access_token"
	// required for subsequent registration management operations (update,
	// read, delete).
	RegistrationAccessToken string

	// RegistrationClientURI is the RFC 7592 "registration_client_uri" for
	// registration management operations.
	RegistrationClientURI string

	// TokenEndpointAuthMethod is the authentication method negotiated at the
	// token endpoint for this client.
	TokenEndpointAuthMethod string

	// RedirectURI is the redirect URI presented to the authorization server
	// during registration. When the caller's run-config did not specify one,
	// this holds the defaulted value derived from the issuer + /oauth/callback
	// (via resolveUpstreamRedirectURI). Persisting it on the resolution lets
	// ConsumeResolution write it back onto the run-config COPY so that
	// downstream consumers (buildPureOAuth2Config, upstream.OAuth2Config
	// validation) see a non-empty RedirectURI.
	RedirectURI string

	// ClientIDIssuedAt is the RFC 7591 §3.2.1 "client_id_issued_at" value
	// converted to a Go time.Time. Zero when the upstream omitted the field
	// (the field is OPTIONAL per RFC 7591). Informational; not used to
	// invalidate the cache.
	ClientIDIssuedAt time.Time

	// ClientSecretExpiresAt is the RFC 7591 §3.2.1 "client_secret_expires_at"
	// value converted to a Go time.Time. The wire convention is that 0 means
	// "the secret does not expire"; in this struct that is represented by
	// the zero time.Time so callers can use IsZero() rather than
	// special-casing 0.
	//
	// When non-zero, this field is the authoritative signal that
	// lookupCachedResolution uses to refetch credentials before the upstream
	// rejects them at the token endpoint. The 90-day dcrStaleAgeThreshold
	// is a heuristic for "consider rotating"; this is a hard expiry asserted
	// by the upstream itself.
	ClientSecretExpiresAt time.Time

	// CreatedAt is the wall-clock time at which the resolution was completed.
	// Used by Step 2g observability to compute staleness against
	// dcrStaleAgeThreshold.
	CreatedAt time.Time
}
// NeedsDCR reports whether rc requires runtime Dynamic Client Registration.
// A run-config needs DCR exactly when ClientID is empty and DCRConfig is
// non-nil (the mutually-exclusive constraint is enforced by
// OAuth2UpstreamRunConfig.Validate; this helper is a convenience check).
func NeedsDCR(rc *authserver.OAuth2UpstreamRunConfig) bool {
	if rc == nil {
		return false
	}
	return rc.ClientID == "" && rc.DCRConfig != nil
}
// ConsumeResolution returns a copy of rc with the resolved credentials and
// endpoints from res copied in and DCRConfig consumed (set to nil),
// transitioning the run-config from "DCR-pending" (ClientID == "" &&
// DCRConfig != nil) to "DCR-resolved" (ClientID populated && DCRConfig
// == nil). The "consume" name is deliberate: a second call on the
// returned value is a no-op only because the first cleared DCRConfig —
// this is a one-shot state transition, not an idempotent default-fill.
//
// rc is taken by value and the modified copy is returned. The caller's
// original is never observably mutated; the value-in / value-out shape
// makes the no-mutation contract compile-time enforced rather than a
// prose discipline the caller is required to remember. Pointer-typed
// fields (DCRConfig) share storage with the caller's copy via the struct
// shallow-copy, but the only mutation here is to assign nil to the
// copy's DCRConfig — nil-assignment to the local field does not reach
// back through the original pointer.
//
// Why DCRConfig is cleared: OAuth2UpstreamRunConfig.Validate enforces
// ClientID xor DCRConfig — a resolved copy that left DCRConfig set would
// fail the validator that runs downstream in buildPureOAuth2Config.
//
// ClientID, the endpoints, and RedirectURI are written only when rc leaves
// them empty — explicit caller configuration always wins. The conditional
// ClientID write is defence-in-depth against future call sites that bypass
// the resolver's validateResolveInputs precondition (which enforces
// ClientID == "" up front); an unconditional overwrite would silently
// clobber a pre-provisioned ClientID with no error.
//
// The defaulted RedirectURI write closes the loop on resolver-side defaulting:
// when the caller's run-config left RedirectURI empty, resolveUpstreamRedirectURI
// derived issuer + /oauth/callback and persisted it on the resolution; copying
// it back here means the downstream upstream.OAuth2Config has a non-empty
// RedirectURI, which authserver.Config validation requires.
//
// Note on ClientSecret: ConsumeResolution does NOT write the resolved
// secret because OAuth2UpstreamRunConfig models secrets as file-or-env
// references only. To propagate the DCR-resolved secret into the final
// upstream.OAuth2Config, callers must pair this call with
// ApplyResolutionToOAuth2Config once the config has been built. Keeping
// the two helpers side-by-side localises the DCR-specific application
// logic.
func ConsumeResolution(rc authserver.OAuth2UpstreamRunConfig, res *Resolution) authserver.OAuth2UpstreamRunConfig {
	if res == nil {
		return rc
	}
	if rc.ClientID == "" {
		rc.ClientID = res.ClientID
	}
	rc.DCRConfig = nil
	if rc.AuthorizationEndpoint == "" {
		rc.AuthorizationEndpoint = res.AuthorizationEndpoint
	}
	if rc.TokenEndpoint == "" {
		rc.TokenEndpoint = res.TokenEndpoint
	}
	if rc.RedirectURI == "" {
		rc.RedirectURI = res.RedirectURI
	}
	return rc
}
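
/*
Illustrative sketch (not part of this package): the value-in / value-out
contract described above, reduced to two hypothetical stand-in types
(runConfig and dcrConfig) and a hypothetical consume function. The discovery
URL and client ID are invented values. The point is only that assigning to the
copy's fields, including setting its pointer field to nil, is invisible to the
caller's original.

```go
package main

import "fmt"

// dcrConfig and runConfig are simplified stand-ins for the run-config types.
type dcrConfig struct{ DiscoveryURL string }

type runConfig struct {
	ClientID  string
	DCRConfig *dcrConfig
}

// consume receives rc by value, so every assignment below changes only the
// local copy. The pointer field is replaced, never written through.
func consume(rc runConfig, clientID string) runConfig {
	if rc.ClientID == "" {
		rc.ClientID = clientID
	}
	rc.DCRConfig = nil
	return rc
}

func main() {
	orig := runConfig{DCRConfig: &dcrConfig{DiscoveryURL: "https://idp.example/.well-known/oauth-authorization-server"}}
	resolved := consume(orig, "abc123")

	fmt.Println(orig.ClientID == "", orig.DCRConfig != nil)   // prints: true true
	fmt.Println(resolved.ClientID, resolved.DCRConfig == nil) // prints: abc123 true
}
```

Writing through the shared pointer (for example rc.DCRConfig.DiscoveryURL = "")
would reach the caller's data; the shape is safe only because the function
replaces the pointer instead of dereferencing it.
*/
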
// ApplyResolutionToOAuth2Config returns a copy of cfg with the DCR-
// resolved ClientSecret overlaid onto it. This is the companion to
// ConsumeResolution: where that function writes fields representable in
// the file-or-env run-config model, this one writes the inline-only
// ClientSecret directly on the runtime config.
//
// cfg is taken by value and the modified copy is returned, mirroring
// ConsumeResolution. The no-mutation contract is compile-time enforced
// by the signature rather than a prose discipline.
//
// The split between these two helpers exists because buildPureOAuth2Config
// intentionally retains a narrow file-or-env contract (no DCR awareness)
// and because OAuth2's ClientSecret on the run-config is modelled as a
// reference rather than an inline string. Any future output path from
// OAuth2UpstreamRunConfig to upstream.OAuth2Config must call BOTH
// ConsumeResolution (run-config side) AND ApplyResolutionToOAuth2Config
// (built-config side) to get a fully-resolved DCR client. Forgetting the
// second call leaves ClientSecret empty and produces silent auth failures
// at request time — the type system does not enforce the pair, so the
// invariant lives here.
func ApplyResolutionToOAuth2Config(cfg upstream.OAuth2Config, res *Resolution) upstream.OAuth2Config {
	if res == nil {
		return cfg
	}
	cfg.ClientSecret = res.ClientSecret
	return cfg
}
// Step identifiers for structured error logs emitted by the caller of
// ResolveCredentials. These values flow through the "step" attribute so
// operators can narrow failures to a specific phase without parsing error
// messages. They are reported only at the boundary log — see
// dcrStepError — so a single failure produces a single slog.Error record.
const (
	dcrStepValidate         = "validate"
	dcrStepResolveRedirect  = "resolve_redirect_uri"
	dcrStepCacheRead        = "cache_read"
	dcrStepMetadata         = "metadata_discovery"
	dcrStepSelectAuthMethod = "select_auth_method"
	dcrStepRegister         = "dcr_call"
	dcrStepCacheWrite       = "cache_write"
)
// dcrStepError annotates a resolver error with the phase it was produced
// in. The boundary caller (buildUpstreamConfigs) emits the single
// slog.Error record for the failure; individual error branches inside
// ResolveCredentials do not log so that each failure surfaces exactly
// once in the combined log stream.
//
// RedirectURI is included when known so that operators can correlate the
// failure with a specific upstream registration without parsing the
// wrapped error string. Stack carries a captured stack trace for the
// dcrStepRegister panic-recovery branch so LogStepError can include
// it in the single boundary record without the in-defer site emitting
// its own duplicate slog.Error. A zero-value dcrStepError is invalid;
// construct via newDCRStepError or the resolver's internal helpers.
type dcrStepError struct {
	Step        string
	Issuer      string
	RedirectURI string
	Stack       string
	Err         error
}
// Error implements error. The "step" tag mirrors the structured-log
// attribute so command-line log scrapers see the same phase identifier.
func (e *dcrStepError) Error() string {
	return fmt.Sprintf("dcr: %s: %s", e.Step, e.Err.Error())
}
// Unwrap lets errors.Is / errors.As reach the wrapped cause.
func (e *dcrStepError) Unwrap() error { return e.Err }
// newDCRStepError builds a dcrStepError. It never returns nil for a
// non-nil cause.
func newDCRStepError(step, issuer, redirectURI string, err error) *dcrStepError {
	return &dcrStepError{
		Step:        step,
		Issuer:      issuer,
		RedirectURI: redirectURI,
		Err:         err,
	}
}
// ResolveCredentials performs Dynamic Client Registration for rc against
// the upstream authorization server identified by rc.DCRConfig, caching the
// resulting credentials in cache. On cache hit the resolver returns
// immediately without any network I/O.
//
// rc must have ClientID == "" and DCRConfig != nil — the caller is expected
// to have validated this via OAuth2UpstreamRunConfig.Validate.
//
// localIssuer is *this* auth server's issuer identifier, NOT the upstream's.
// It is used to key the cache and to default the redirect URI to
// {localIssuer}/oauth/callback when rc.RedirectURI is empty. The upstream's
// issuer is recovered separately from rc.DCRConfig.DiscoveryURL inside the
// resolver and is used solely for RFC 8414 §3.3 metadata verification.
// Passing the upstream's issuer here would produce a wrong-origin default
// redirect and a cache key that does not identify the auth-server context.
//
// The caller is responsible for applying the returned resolution onto a COPY
// of rc via ConsumeResolution (per the copy-before-mutate rule). This function
// never mutates rc, and on failure it leaves the cache untouched.
//
// Observability: this function never calls slog.Error directly — all
// failures are annotated with a *dcrStepError and returned to the caller,
// which is expected to emit the boundary Error record. This avoids the
// double-logging pattern where both the resolver and the outer frame
// report the same failure. Cache-hit Debug / stale-Warn logs and the
// successful-registration Debug log are emitted here because they have no
// outer-frame equivalent. No secret values (client_secret,
// registration_access_token, initial_access_token) are ever logged — only
// public metadata such as client_id and redirect_uri.
func ResolveCredentials(
	ctx context.Context,
	rc *authserver.OAuth2UpstreamRunConfig,
	localIssuer string,
	cache CredentialStore,
) (*Resolution, error) {
	if err := validateResolveInputs(rc, localIssuer, cache); err != nil {
		return nil, newDCRStepError(dcrStepValidate, localIssuer, "", err)
	}

	redirectURI, err := resolveUpstreamRedirectURI(rc.RedirectURI, localIssuer)
	if err != nil {
		return nil, newDCRStepError(dcrStepResolveRedirect, localIssuer, "",
			fmt.Errorf("resolve redirect uri: %w", err))
	}

	scopes := slices.Clone(rc.Scopes)
	key := Key{
		Issuer:      localIssuer,
		RedirectURI: redirectURI,
		ScopesHash:  storage.ScopesHash(scopes),
	}

	// Cache lookup short-circuits before any network I/O.
	if cached, hit, err := lookupCachedResolution(ctx, cache, key, localIssuer, redirectURI); err != nil {
		return nil, newDCRStepError(dcrStepCacheRead, localIssuer, redirectURI, err)
	} else if hit {
		return cached, nil
	}

	// Coalesce concurrent registrations for the same Key (see the dcrFlight
	// doc comment). The leader runs the closure below, which calls
	// registerAndCache; followers receive the leader's *Resolution result.
	// The flight key embeds the Key fields with a separator that cannot
	// appear in any of them (newline is not valid in OAuth scope tokens,
	// URLs, or hex digests).
	//
	// A defer/recover inside the closure converts a panic in registerAndCache
	// (or anything it calls) into a normal error. Without this, singleflight
	// re-panics the leader's panic in every follower, so N concurrent callers
	// for the same Key would all crash with the same value. The panic is
	// still surfaced: the captured stack trace is attached to the wrapped
	// dcrStepError and surfaces in the single boundary log emitted by
	// LogStepError, so the failure produces exactly one Error record (no
	// in-defer log here) and callers can react to it as a normal failure.
	flightKey := flightKeyOf(key)
	resolutionAny, err, _ := dcrFlight.Do(flightKey, func() (res any, err error) {
		defer func() {
			if r := recover(); r != nil {
				stepErr := newDCRStepError(dcrStepRegister, localIssuer, redirectURI,
					fmt.Errorf("registration panicked: %v", r))
				stepErr.Stack = string(debug.Stack())
				err = stepErr
				res = nil
			}
		}()
		return registerAndCache(ctx, rc, localIssuer, redirectURI, scopes, key, cache)
	})
	if err != nil {
		return nil, err
	}
	return resolutionAny.(*Resolution), nil
}
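
/*
Illustrative sketch (not part of this package): the defer/recover pattern used
inside the singleflight closure, shown in isolation. safeCall is a hypothetical
helper; it relies on named results so the deferred function can replace what
the caller receives after a panic. The singleflight group itself is omitted to
keep the sketch limited to the standard library.

```go
package main

import "fmt"

// safeCall runs fn and converts a panic raised by fn into an ordinary error.
// The named results res and err are what the caller sees, so the deferred
// function can overwrite them once the panic has been recovered.
func safeCall(fn func() (string, error)) (res string, err error) {
	defer func() {
		if r := recover(); r != nil {
			res = ""
			err = fmt.Errorf("registration panicked: %v", r)
		}
	}()
	return fn()
}

func main() {
	_, err := safeCall(func() (string, error) { panic("boom") })
	fmt.Println(err) // prints: registration panicked: boom

	res, err := safeCall(func() (string, error) { return "client-1", nil })
	fmt.Println(res, err) // prints: client-1 <nil>
}
```

With unnamed results the deferred function would have no way to change the
return values, and the caller would receive zero values with a nil error.
*/
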
// registerAndCache is the leader-only body of ResolveCredentials, run inside
// the singleflight. It rechecks the cache before any network I/O so that a
// caller whose flight begins just after an earlier leader's Put has returned
// reuses the fresh entry instead of registering a second client. Endpoint
// resolution, registration, and the durable Put live here.
func registerAndCache(
	ctx context.Context,
	rc *authserver.OAuth2UpstreamRunConfig,
	localIssuer, redirectURI string,
	scopes []string,
	key Key,
	cache CredentialStore,
) (*Resolution, error) {
	// Recheck cache: another flight that just finished may have populated
	// it between our initial lookup and our singleflight entry.
	if cached, hit, err := lookupCachedResolution(ctx, cache, key, localIssuer, redirectURI); err != nil {
		return nil, newDCRStepError(dcrStepCacheRead, localIssuer, redirectURI, err)
	} else if hit {
		return cached, nil
	}

	// Endpoint resolution: discover metadata when configured, otherwise use
	// the caller-supplied RegistrationEndpoint directly. The upstream's
	// expected issuer is recovered from cfg.DiscoveryURL inside the helper.
	// localIssuer here is *this* auth server's issuer: correct for cache
	// keying and redirect URI defaulting, but it must not be used for
	// RFC 8414 §3.3 metadata verification (which is the upstream's concern).
	endpoints, err := resolveDCREndpoints(ctx, rc.DCRConfig)
	if err != nil {
		return nil, newDCRStepError(dcrStepMetadata, localIssuer, redirectURI, err)
	}
	applyExplicitEndpointOverrides(endpoints, rc)

	// Token-endpoint auth method: intersect server support with our
	// preference order; default to client_secret_basic if the server does
	// not advertise the field at all.
	authMethod, err := selectTokenEndpointAuthMethod(
		endpoints.tokenEndpointAuthMethodsSupported,
		endpoints.codeChallengeMethodsSupported,
	)
	if err != nil {
		return nil, newDCRStepError(dcrStepSelectAuthMethod, localIssuer, redirectURI, err)
	}

	registrationScopes := chooseRegistrationScopes(scopes, endpoints.scopesSupported, localIssuer)
	response, err := performRegistration(ctx, rc.DCRConfig, endpoints.registrationEndpoint,
		redirectURI, authMethod, registrationScopes)
	if err != nil {
		return nil, newDCRStepError(dcrStepRegister, localIssuer, redirectURI, err)
	}

	resolution := buildResolution(response, endpoints, authMethod, redirectURI)

	// Write to durable storage before returning the resolution so a Put
	// failure leaves no in-memory state diverging from the cache: the
	// next call simply re-resolves rather than reading a value the cache
	// never saw.
	if err := cache.Put(ctx, key, resolution); err != nil {
		return nil, newDCRStepError(dcrStepCacheWrite, localIssuer, redirectURI,
			fmt.Errorf("cache put: %w", err))
	}

	//nolint:gosec // G706: client_id is public metadata per RFC 7591.
	slog.Debug("dcr: registered new client",
		"local_issuer", localIssuer,
		"redirect_uri", redirectURI,
		"client_id", resolution.ClientID,
	)
	return resolution, nil
}
// LogStepError emits the single boundary slog.Error record for a DCR
// resolver failure, carrying the step / issuer / redirect_uri attributes
// extracted from err. If err is not a *dcrStepError, it is logged with a
// generic "unknown" step — ResolveCredentials always wraps its errors,
// so this branch indicates a programming error in a future caller rather
// than a runtime condition. err == nil is a no-op so this function is
// safe to call without an outer guard.
//
// Every wrapped error is passed through SanitizeErrorForLog to strip URL
// query parameters that could plausibly contain sensitive tokens (defense
// in depth — the current DCR flow sends the initial access token as an
// Authorization header, not a query parameter, but nothing in the type
// system prevents a future refactor from doing otherwise).
func LogStepError(upstreamName string, err error) {
	if err == nil {
		return
	}
	var stepErr *dcrStepError
	if !errors.As(err, &stepErr) {
		slog.Error("dcr: resolve failed",
			"upstream", upstreamName,
			"step", "unknown",
			"error", SanitizeErrorForLog(err),
		)
		return
	}
	attrs := []any{
		"upstream", upstreamName,
		"step", stepErr.Step,
		"issuer", stepErr.Issuer,
		"error", SanitizeErrorForLog(stepErr.Err),
	}
	if stepErr.RedirectURI != "" {
		attrs = append(attrs, "redirect_uri", stepErr.RedirectURI)
	}
	if stepErr.Stack != "" {
		attrs = append(attrs, "stack", stepErr.Stack)
	}
	slog.Error("dcr: resolve failed", attrs...)
}
// SanitizeErrorForLog strips secret-bearing components from any URLs
// embedded in err's message. The Go HTTP client, url.Error, and other
// net/* wrappers embed the full request URL — including userinfo,
// query, and fragment — in their error strings. Any of those can carry
// credentials or tokens (e.g. https://user:pass@host, ?token=…,
// implicit-flow callbacks #access_token=…); the current DCR flow does
// not embed any of them today, but stripping them here is defense in
// depth that protects the log regardless of future changes.
//
// Scheme, host, and path are preserved so operators retain enough
// context to correlate with upstream server logs. Trailing sentence
// punctuation adjacent to the URL (e.g. the comma in "reaching
// URL?q=1, retrying") is preserved — see trimURLTrailingPunctuation
// for the list of characters considered terminators.
//
// Scope: the regex matches http:// and https:// schemes
// (case-insensitively per RFC 3986 §3.1). Other schemes (file://, raw
// host:port) are not sanitised; the current DCR flow never embeds
// those in errors, and broadening the match risks false positives on
// unrelated text.
//
// IMPORTANT — caller responsibility: this function strips credentials
// only from http(s) URLs. Callers that may receive errors containing
// non-http(s) URLs with credential-bearing components (e.g.
// redis://user:pass@host, postgres://…, smtp://…) MUST verify those
// URLs are not credential-bearing before logging, or sanitise them
// separately. The function name reads generic but the implementation is
// scheme-specific by design — broadening the regex would risk false
// positives on prose. A future shared sanitiser covering more schemes
// is appropriate as a follow-up once a second non-http(s) call site
// appears.
func SanitizeErrorForLog(err error) string {
	if err == nil {
		return ""
	}
	msg := err.Error()
	return queryStrippingPattern.ReplaceAllStringFunc(msg, func(match string) string {
		// Split any trailing sentence punctuation off the match before
		// handing it to url.Parse. Without this, a period / comma /
		// closing bracket at the end of the sentence is absorbed into
		// the URL's raw query and dropped along with the rest of the
		// query component, mangling the error text. The trimmed
		// punctuation is re-appended to the replacement so the
		// surrounding prose is preserved verbatim.
		core, tail := trimURLTrailingPunctuation(match)
		u, parseErr := url.Parse(core)
		if parseErr != nil {
			return match
		}
		if u.User == nil && u.RawQuery == "" && u.Fragment == "" {
			return match
		}
		u.User = nil
		u.RawQuery = ""
		u.Fragment = ""
		return u.String() + tail
	})
}
// trimURLTrailingPunctuation returns (core, tail) where tail is the run of
// trailing ASCII punctuation that commonly terminates a URL inside prose
// but is never a meaningful part of the URL itself. The characters chosen
// here mirror those used by general-purpose URL extractors (e.g.,
// Chromium's autolinker): sentence-ending punctuation, closing brackets,
// and a few separators that appear between URLs in freeform text.
//
// Note that ')' and ']' are stripped unconditionally. RFC 3986 permits a
// literal ')' in a path or query (it is a sub-delim), so a URL that really
// ends in ')' has that character moved into tail rather than kept in core.
// Because SanitizeErrorForLog re-appends tail verbatim, the surrounding
// text is still reproduced; the only visible effect is that a ')' which
// closed a stripped query or fragment ends up directly after the path.
// That trade-off is acceptable for a log sanitiser.
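//
// For example, given the (hypothetical) input "https://example.com/a?b=1)."
// the loop below walks back over "." and ")" and stops at "1", so the
// function returns core "https://example.com/a?b=1" and tail ").".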
func trimURLTrailingPunctuation(s string) (core, tail string) {
	// terminators is intentionally ASCII-only; Unicode terminators (e.g.
	// '」') are out of scope for this log sanitiser, so byte indexing is
	// safe and avoids the rune-decoding overhead of strings.ContainsRune.
	const terminators = ".,;:!?)]}>"
	i := len(s)
	for i > 0 && strings.IndexByte(terminators, s[i-1]) >= 0 {
		i--
	}
	return s[:i], s[i:]
}

// queryStrippingPattern matches URL-shaped substrings inside an error
// message — sufficient to reach the url.Parse path in SanitizeErrorForLog
// and let it decide whether a secret-bearing component exists to strip.
// The regexp is intentionally narrow (http/https schemes only) to avoid
// false positives, but matches schemes case-insensitively per RFC 3986
// §3.1 since upstream metadata or user input can carry mixed-case
// schemes. Trailing sentence punctuation that the character class
// happens to include (e.g. '.', ',', ')') is stripped by
// trimURLTrailingPunctuation before the match is parsed.
var queryStrippingPattern = regexp.MustCompile(`(?i)https?://[^\s"']+`)
// -----------------------------------------------------------------------------
// Private helpers
// -----------------------------------------------------------------------------
// validateResolveInputs performs the defensive re-check of resolver
// preconditions. Validate() enforces most of these at config-load time, but
// ResolveCredentials is an entry point that programmatic callers can
// reach with partially-constructed run-configs.
func validateResolveInputs(
	rc *authserver.OAuth2UpstreamRunConfig,
	localIssuer string,
	cache CredentialStore,
) error {
	if rc == nil {
		return fmt.Errorf("oauth2 upstream run-config is required")
	}
	if rc.ClientID != "" {
		return fmt.Errorf("dcr: oauth2 upstream has a pre-provisioned client_id")
	}
	if rc.DCRConfig == nil {
		return fmt.Errorf("dcr: oauth2 upstream has no dcr_config")
	}
	if localIssuer == "" {
		return fmt.Errorf("dcr: issuer is required")
	}
	if cache == nil {
		return fmt.Errorf("dcr: credential store is required")
	}
	return nil
}

// lookupCachedResolution checks the cache and logs the hit. On hit it
// returns (resolution, true, nil). On miss it returns (nil, false, nil). An
// error is returned only on backend failure.
//
// Two distinct staleness signals shape the hit/miss decision and the log:
//
//   - Hard expiry (RFC 7591 §3.2.1 client_secret_expires_at): when the
//     cached resolution's ClientSecretExpiresAt is non-zero and in the
//     past, the entry is treated as a miss so the singleflight body
//     (registerAndCache) re-runs the registration and overwrites the stale
//     entry via cache.Put. Without this check the cache would serve an
//     expired secret indefinitely; the upstream's token endpoint would 401
//     on every use and the resolver would have no signal to refetch. The
//     check is skipped when the field is zero, per the RFC 7591 convention
//     "0 means the secret does not expire". This is the authoritative
//     signal: the upstream said when its credential expires.
//   - Soft staleness (now - CreatedAt vs dcrStaleAgeThreshold): the age in
//     days is logged on every hit, and if it exceeds the threshold an
//     additional slog.Warn is emitted with a remediation hint so operators
//     can act on long-lived registrations that may need rotation or
//     re-registration. This is observability only, not a cache-invalidation
//     trigger.
func lookupCachedResolution(
	ctx context.Context,
	cache CredentialStore,
	key Key,
	localIssuer, redirectURI string,
) (*Resolution, bool, error) {
	cached, ok, err := cache.Get(ctx, key)
	if err != nil {
		return nil, false, fmt.Errorf("dcr: cache lookup: %w", err)
	}
	if !ok {
		return nil, false, nil
	}
	if !cached.ClientSecretExpiresAt.IsZero() && time.Now().After(cached.ClientSecretExpiresAt) {
		//nolint:gosec // G706: client_id is public metadata per RFC 7591.
		slog.Debug("dcr: cache hit ignored; cached secret expired per upstream client_secret_expires_at",
			"local_issuer", localIssuer,
			"redirect_uri", redirectURI,
			"client_id", cached.ClientID,
			"client_secret_expires_at", cached.ClientSecretExpiresAt.UTC().Format(time.RFC3339),
		)
		return nil, false, nil
	}
	age := time.Since(cached.CreatedAt)
	ageDays := int(age / (24 * time.Hour))
	//nolint:gosec // G706: client_id is public metadata per RFC 7591.
	slog.Debug("dcr: cache hit",
		"local_issuer", localIssuer,
		"redirect_uri", redirectURI,
		"client_id", cached.ClientID,
		"dcr_age_days", ageDays,
	)
	if age > dcrStaleAgeThreshold {
		//nolint:gosec // G706: client_id is public metadata per RFC 7591.
		slog.Warn(
			"dcr: cached registration exceeds staleness threshold; "+
				"consider rotating the registration via RFC 7592 deregistration "+
				"and re-registering at next startup",
			"local_issuer", localIssuer,
			"redirect_uri", redirectURI,
			"client_id", cached.ClientID,
			"dcr_age_days", ageDays,
			"stale_threshold_days", int(dcrStaleAgeThreshold/(24*time.Hour)),
		)
	}
	return cached, true, nil
}

// applyExplicitEndpointOverrides overwrites the discovered
// authorizationEndpoint / tokenEndpoint in endpoints with explicit values
// from rc when rc specifies them. Explicit caller configuration always wins
// over discovery.
func applyExplicitEndpointOverrides(endpoints *dcrEndpoints, rc *authserver.OAuth2UpstreamRunConfig) {
	if rc.AuthorizationEndpoint != "" {
		endpoints.authorizationEndpoint = rc.AuthorizationEndpoint
	}
	if rc.TokenEndpoint != "" {
		endpoints.tokenEndpoint = rc.TokenEndpoint
	}
}

// chooseRegistrationScopes selects the scopes to send in the registration
// request: explicit caller scopes > discovered scopes_supported > empty.
// Logs a warning when neither source produces any scopes.
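//
// A sketch of the precedence, with hypothetical scope names and iss standing
// in for any local issuer string:
//
//	chooseRegistrationScopes([]string{"read"}, []string{"openid"}, iss) // ["read"]
//	chooseRegistrationScopes(nil, []string{"openid"}, iss)              // ["openid"]
//	chooseRegistrationScopes(nil, nil, iss)                             // nil, plus a warning log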
func chooseRegistrationScopes(explicit, discovered []string, localIssuer string) []string {
	if len(explicit) > 0 {
		return explicit
	}
	if len(discovered) > 0 {
		return discovered
	}
	slog.Warn("dcr: no scopes configured or discovered; registering with empty scope",
		"local_issuer", localIssuer,
	)
	return nil
}

// performRegistration executes the HTTP registration request exactly once.
// The initial access token (if configured) is injected as a
// bearer-token Authorization header via a wrapping http.Client.
func performRegistration(
	ctx context.Context,
	dcrCfg *authserver.DCRUpstreamConfig,
	registrationEndpoint, redirectURI, authMethod string,
	scopes []string,
) (*oauthproto.DynamicClientRegistrationResponse, error) {
	// Initial access token is optional; resolveSecret returns ("", nil)
	// when neither file nor env var is configured.
	initialAccessToken, err := resolveSecret(dcrCfg.InitialAccessTokenFile, dcrCfg.InitialAccessTokenEnvVar)
	if err != nil {
		return nil, fmt.Errorf("dcr: resolve initial access token: %w", err)
	}
	httpClient := newDCRHTTPClient(initialAccessToken)
	request := &oauthproto.DynamicClientRegistrationRequest{
		RedirectURIs:            []string{redirectURI},
		ClientName:              oauthproto.ToolHiveMCPClientName,
		TokenEndpointAuthMethod: authMethod,
		GrantTypes:              []string{oauthproto.GrantTypeAuthorizationCode, oauthproto.GrantTypeRefreshToken},
		ResponseTypes:           []string{oauthproto.ResponseTypeCode},
		Scopes:                  scopes,
	}
	// Call exactly once; no retry loop. Step 2g will add retry/backoff at a
	// higher layer if needed.
	response, err := oauthproto.RegisterClientDynamically(ctx, registrationEndpoint, request, httpClient)
	if err != nil {
		return nil, fmt.Errorf("dcr: register client: %w", err)
	}
	return response, nil
}

// buildResolution assembles the Resolution from the RFC 7591 response and
// the resolved endpoints. If the server did not echo a
// token_endpoint_auth_method in the response, the method actually sent is
// recorded so downstream consumers see a definite value. redirectURI is the
// value passed to the registration endpoint (caller-supplied or defaulted
// via resolveUpstreamRedirectURI); it is persisted on the resolution so
// ConsumeResolution can propagate a defaulted value back to the run-config.
//
// RFC 7591 §3.2.1 client_id_issued_at and client_secret_expires_at are
// converted from int64 epoch seconds to time.Time. The wire value 0 means
// "field absent" or "secret does not expire"; both map to the zero time.Time
// so callers can use IsZero() uniformly.
func buildResolution(
	response *oauthproto.DynamicClientRegistrationResponse,
	endpoints *dcrEndpoints,
	sentAuthMethod string,
	redirectURI string,
) *Resolution {
	authMethod := response.TokenEndpointAuthMethod
	if authMethod == "" {
		authMethod = sentAuthMethod
	}
	return &Resolution{
		ClientID:                response.ClientID,
		ClientSecret:            response.ClientSecret,
		AuthorizationEndpoint:   endpoints.authorizationEndpoint,
		TokenEndpoint:           endpoints.tokenEndpoint,
		RegistrationAccessToken: response.RegistrationAccessToken,
		RegistrationClientURI:   response.RegistrationClientURI,
		TokenEndpointAuthMethod: authMethod,
		RedirectURI:             redirectURI,
		ClientIDIssuedAt:        epochSecondsToTime(response.ClientIDIssuedAt),
		ClientSecretExpiresAt:   epochSecondsToTime(response.ClientSecretExpiresAt),
		CreatedAt:               time.Now(),
	}
}

// epochSecondsToTime converts the int64 epoch-seconds form used by RFC 7591
// into a time.Time. Zero passes through to the zero time.Time so callers can
// rely on IsZero() to mean "field absent" / "does not expire".
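//
// For example, epochSecondsToTime(0).IsZero() reports true, while
// epochSecondsToTime(1700000000) yields 2023-11-14T22:13:20Z (the epoch
// value is an arbitrary illustration, not taken from any real response).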
func epochSecondsToTime(epoch int64) time.Time {
	if epoch == 0 {
		return time.Time{}
	}
	return time.Unix(epoch, 0).UTC()
}

// dcrEndpoints is the internal bundle of endpoints produced by endpoint
// resolution. The separation from Resolution lets the resolver reason
// about discovered vs. overridden values before committing to a resolution.
type dcrEndpoints struct {
	authorizationEndpoint             string
	tokenEndpoint                     string
	registrationEndpoint              string
	tokenEndpointAuthMethodsSupported []string
	scopesSupported                   []string
	// codeChallengeMethodsSupported is consumed by
	// selectTokenEndpointAuthMethod to gate the public-client (none) auth
	// method on S256 PKCE being advertised. RFC 7636 / OAuth 2.1 require
	// PKCE-with-S256 for public clients; registering as none against an
	// upstream that advertises only plain (or omits the field) would be a
	// compliance gap.
	codeChallengeMethodsSupported []string
}

// resolveDCREndpoints produces the endpoint bundle from the DCRUpstreamConfig.
//
// Three branches, in priority order:
//
//  1. cfg.RegistrationEndpoint set: use it directly and skip discovery
//     entirely. Server-capability fields (token_endpoint_auth_methods_supported,
//     scopes_supported) are unavailable on this branch; the caller is
//     expected to also supply AuthorizationEndpoint, TokenEndpoint, and an
//     explicit Scopes list. Auth method falls back to the
//     selectTokenEndpointAuthMethod default.
//  2. cfg.DiscoveryURL set: fetch the exact document the operator
//     configured (bypassing the well-known path fallback). RFC 8414 §3.3
//     requires the metadata's "issuer" field to match the authorization
//     server's issuer identifier; that identifier is the upstream's, not
//     this auth server's, so it is recovered from the discovery URL via
//     deriveExpectedIssuerFromDiscoveryURL rather than reusing the
//     caller-supplied issuer (which names this auth server and is used
//     elsewhere in ResolveCredentials for redirect URI defaulting and
//     cache keying).
//  3. Neither set: defensive; Validate() rejects this configuration, but
//     as a programmatic entry point the resolver returns an error rather
//     than falling back to an unexpected strategy.
//
// When metadata is returned but omits registration_endpoint, the resolver
// synthesises {origin}/register — a convention used by nanobot and Hydra
// for providers that ship DCR without advertising it in discovery. Origin
// is taken from the upstream issuer, not this auth server's issuer, so the
// synthesised endpoint lands at the upstream.
func resolveDCREndpoints(
	ctx context.Context,
	cfg *authserver.DCRUpstreamConfig,
) (*dcrEndpoints, error) {
	if cfg.RegistrationEndpoint != "" {
		// Validate locally so a non-HTTPS or malformed URL fails before
		// performRegistration constructs a bearer-token transport for it.
		if err := validateUpstreamEndpointURL(cfg.RegistrationEndpoint, "registration_endpoint"); err != nil {
			return nil, fmt.Errorf("dcr: %w", err)
		}
		return &dcrEndpoints{
			registrationEndpoint: cfg.RegistrationEndpoint,
		}, nil
	}
	if cfg.DiscoveryURL == "" {
		return nil, fmt.Errorf(
			"dcr: dcr_config must set either discovery_url or registration_endpoint")
	}
	upstreamIssuer, err := deriveExpectedIssuerFromDiscoveryURL(cfg.DiscoveryURL)
	if err != nil {
		return nil, err
	}
	metadata, err := oauthproto.FetchAuthorizationServerMetadataFromURL(ctx, cfg.DiscoveryURL, upstreamIssuer, nil)
	return endpointsFromMetadata(metadata, err, upstreamIssuer)
}

// deriveExpectedIssuerFromDiscoveryURL recovers the issuer identifier the
// upstream is expected to claim in its RFC 8414 / OIDC Discovery document,
// given an operator-configured DiscoveryURL.
//
// Two recognised conventions:
//
//  1. Well-known suffix: the URL ends with /.well-known/oauth-authorization-server
//     or /.well-known/openid-configuration. The suffix is stripped to recover
//     the issuer; this covers single-tenant providers (e.g.
//     https://mcp.atlassian.com/.well-known/oauth-authorization-server →
//     https://mcp.atlassian.com) and the issuer-suffix multi-tenant style
//     (e.g. https://idp.example.com/tenants/acme/.well-known/openid-configuration
//     → https://idp.example.com/tenants/acme).
//  2. Non-well-known path: the URL points at a custom metadata endpoint that
//     does not end in either suffix. Origin (scheme://host) is used as a
//     best-effort fallback; this matches the common shape where the upstream
//     issuer is the host root.
//
// RFC 8414 §3.1's path-aware form (well-known path inserted between host and
// tenant path, e.g. https://example.com/.well-known/oauth-authorization-server/tenant)
// is not auto-detected here — operators on that pattern can switch to
// dcr_config.registration_endpoint to bypass discovery.
func deriveExpectedIssuerFromDiscoveryURL(discoveryURL string) (string, error) {
	const (
		oauthSuffix = "/.well-known/oauth-authorization-server"
		oidcSuffix  = "/.well-known/openid-configuration"
	)
	u, err := url.Parse(discoveryURL)
	if err != nil {
		return "", fmt.Errorf("parse discovery url %q: %w", discoveryURL, err)
	}