feat(redis): instrument raw redis clients via octoredis chokepoint constructor #495
an9xyz wants to merge 2 commits into
…nstructor

octo-lib#104 exported redis.Instrument so raw *rd.Client instances can opt into the dependency="redis" latency hook. Wire it in at a single chokepoint rather than scattering Instrument calls across ~18 sites:

- pkg/redis: add NewInstrumentedClient(cfg, overrides...) and InstrumentedClientFromOptions(opts), which build a raw client and Instrument it before returning, so its commands feed dependency="redis". Instrumentation happens at construction, before the client is shared (octo-lib's call-before-share contract). NewWithOptions (octoredis.Conn) routes through it too.
- Migrate all ~18 raw rd.NewClient(octoredis.MustBuildOptions(...)) sites (rate limiters, OIDC state/bind/logout/locks, bot registry/provision, user/group/space auth, integration/incomingwebhook/usersecret/opanalytics, health probe) plus the readinessRedisOptions path. These control-plane clients were previously blind to dependency="redis" (only pool stats covered them).
- main.go: rlRedis is now instrumented; refresh the coverage comment.
- go.mod: bump octo-lib to the merged #104 commit.

Addresses the primary raw-client follow-up in octo-lib#96.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01HHTDHbMPAVkNvYdQkh2ts3
Dependency Changes Detected
This PR modifies dependency files. Please review whether these changes are intentional.
Changed files:
Maintainer checklist:
Summary: The PR is in scope for octo-server and correctly centralizes raw Redis client instrumentation without changing the migrated clients' Redis options or call behavior.
💬 Non-blocking
- 🟡 Warning: pkg/redis/options.go uses octo-lib's exported `Instrument` for every client created by `InstrumentedClientFromOptions`. That is fine for the current process-lifetime raw clients, but `Instrument` retains clients in its idempotence guard, so this helper should not be used for short-lived/request-scoped clients. Consider documenting that constraint more explicitly, especially because pkg/redis/redis.go now routes the general `Conn` constructor through it.

✅ Highlights
- The production raw Redis client sites now consistently go through `octoredis.NewInstrumentedClient` or `InstrumentedClientFromOptions`.
- The hook is installed before the clients are shared, matching the octo-lib contract.
- The new tests cover both constructor paths and pass without a live Redis.
- I verified `go build ./...`, `go test ./pkg/redis/...`, and `go test -race ./pkg/redis/...`. A broader `go test ./modules/common/...` failed on local migration database state, not on this PR's Redis instrumentation path.
mochashanyao
left a comment
[Octo-Q · automated review]
Verdict: Approve. No blocking findings; notes below (data-flow traced).
octo-server PR#495 Review Report
Reviewer: Octo-Q (automated review)
PR: #495
Head SHA: af60fa5
Title: feat(redis): instrument raw redis clients via octoredis chokepoint constructor
1. Verification Summary
| Item | Status | Evidence |
|---|---|---|
| Chokepoint constructors correct | ✅ | pkg/redis/options.go:67-79: NewInstrumentedClient delegates to InstrumentedClientFromOptions; both call rd.NewClient then liboredis.Instrument before returning |
| All 18 sites migrated uniformly | ✅ | Every rd.NewClient(octoredis.MustBuildOptions(...)) replaced with octoredis.NewInstrumentedClient(cfg, fn) or InstrumentedClientFromOptions(opts); identical override semantics preserved |
| Zero remaining production bypass | ✅ | grep 'rd\.NewClient\|redis\.NewClient' finds only _test.go files + options.go:76 (the chokepoint itself) |
| Instrument-before-share contract honored | ✅ | InstrumentedClientFromOptions calls Instrument(c) before return c, so the client is not yet shared at instrumentation time |
| Idempotency guard functional | ✅ | octo-lib Instrument checks instrumented map under mutex (instrument.go:126-135); repeated calls are no-ops |
| Conn wrapper correctly routed | ✅ | pkg/redis/redis.go:34-36: NewWithOptions now uses InstrumentedClientFromOptions(opts) instead of raw rd.NewClient(opts) |
| sync.Once patterns preserved | ✅ | incomingwebhook/api.go, integration/api.go: sync.Once wrapping intact; instrumented client created once |
| Observer error isolation | ✅ | octo-lib reportRedis wraps observer call in defer recover() (instrument.go:47), so an observer panic cannot reach the Redis caller |
| No sensitive data exposure | ✅ | Observer receives only cmd.Name() (low-cardinality command name); no keys/args/scripts |
| Tests validate hook installation | ✅ | instrument_test.go: both tests drive a command to a dead address; WrapProcess fires regardless; observer receives "get" |
| go.mod bump correct | ✅ | octo-lib bumped to 0c34e6f108c4 which includes the exported Instrument from octo-lib#104 |
2. Findings
No P0/P1 findings.
P2: InstrumentedClientFromOptions does not defensively copy *rd.Options
Diff-scope: pre-existing (octo-server's NewWithOptions had the same behavior before this PR; PR does not change it).
octo-lib's redis.NewWithOptions takes a shallow copy (local = *opts) before mutating MaxRetries. octo-server's InstrumentedClientFromOptions passes opts directly to rd.NewClient(opts) without copying. If a caller reused the same *rd.Options across multiple constructors, go-redis internal mutations could leak across.
Why P2 not P1: All current callers either use fresh options from MustBuildOptions (which allocates a new struct each call) or readinessRedisOptions (which also returns a fresh struct). No reuse pattern exists in practice. Purely a latent hardening concern.
Suggestion: Consider local := *opts at the top of InstrumentedClientFromOptions to match octo-lib's defensive pattern. Low priority.
P2: Test nil-check asymmetry
Diff-scope: new (introduced by this PR's test file).
TestNewInstrumentedClientInstruments includes if c == nil { t.Fatal(...) } (line 37), but TestInstrumentedClientFromOptionsInstruments does not. Both constructors ultimately call rd.NewClient which never returns nil, so this is cosmetic only. Minor hygiene nit.
3. Data-Flow Traceback
Path A: Raw control-plane clients (~18 sites)
Call site (e.g. main.go:202)
→ octoredis.NewInstrumentedClient(cfg, overrides...)
→ octoredis.MustBuildOptions(cfg, overrides...) // fresh *rd.Options with TLS
→ InstrumentedClientFromOptions(opts)
→ rd.NewClient(opts) // creates *rd.Client
→ liboredis.Instrument(c) // installs WrapProcess hook
→ return c // client instrumented before share
At runtime, every command on these clients triggers WrapProcess → reportRedis(cmd.Name(), dur, err) → process-level RedisObserver (set by libredis.SetRedisObserver(metrics.ObserveRedisCmd) in main.go:248). Data flows correctly to the dependency="redis" metric. ✅
Path B: Conn wrapper (pkg/redis/redis.go)
db.NewRedisFromConfig(cfg) or redis.New(addr, pass)
→ octo-server's redis.NewWithOptions(opts)
→ InstrumentedClientFromOptions(opts)
→ rd.NewClient(opts) + Instrument(c)
→ &Conn{client: instrumented_client}
Conn methods (Get/Set/Del/etc.) delegate to rc.client which is instrumented. Data flows correctly. ✅
Path C: ctx.GetRedisConn() (octo-lib shared Conn)
config.Context.NewRedisCache()
→ octo-lib's redis.NewWithOptions(opts) // octo-lib internal
→ rd.NewClient(&local) + wrapClient(client) // octo-lib's internal wrapping
This path is not touched by this PR; it was already instrumented by octo-lib's internal wrapClient. No change in behavior. ✅
Path D: Health probe
newDependencyReadinessChecker(ctx, db)
→ readinessRedisOptions(cfg) // fresh *rd.Options via MustBuildOptions
→ octoredis.InstrumentedClientFromOptions(opts)
→ rd.NewClient(opts) + Instrument(c)
→ stored in dependencyReadinessChecker.redisClient
Health check's pingRedis calls c.redisClient.WithContext(ctx).Ping(). Since Instrument(c) runs before the WithContext clone, the cloned client inherits the process hook. Data flows correctly. ✅
4. Blindspot Checklist (R5: security_sensitive PR)
C1: Dual-path parity
N/A. This PR adds observability instrumentation; no create/remove, subscribe/unsubscribe, or authorization symmetry paths are involved. The change is purely additive (adding a timing hook) and does not modify any control-flow or permission boundaries.
C2: Control-flow ordering / nested reuse + safety control non-canonical probing
Clear. The only control-flow ordering concern is WrapProcess hook stacking. Verified:
- `liboredis.Instrument` is idempotent via the `instrumented` map under mutex; a double call is a no-op.
- `InstrumentedClientFromOptions` calls `Instrument` exactly once on a fresh client from `rd.NewClient`, so no double-wrap occurs.
- `WithContext` clones inherit the installed hook (clone copies the `process` field at clone time, after instrumentation).
- No nested/compound wrapping patterns exist; each client is instrumented exactly once.
C3: Authorization boundary vs. capability boundary
N/A. No authorization, permission, or capability changes. Pure observability.
C4: Authorization lifecycle / container-member state cascade
N/A. No auth changes.
C5: Build/note pass vs. runtime path correctness
Clear. No build artifacts, browser extensions, CLI tools, or relative URLs involved. This is a Go server-side change. The go build, go vet, go test -race, and golangci-lint results reported in the PR description are consistent with the code changes. Runtime path: WrapProcess installs at construction time, before any commands are issued, so the hook is present when traffic starts.
C6: Governance / policy / security document consistency
N/A. No documentation, policy, or governance changes.
5. Additional Observations
- Memory consideration: `liboredis.Instrument` registers each client in a process-level `map[*rd.Client]struct{}`. All ~18 control-plane clients are long-lived singletons (created at startup via constructors or `sync.Once`), so this is not a memory leak. The godoc explicitly warns against using `Instrument` for short-lived clients.
- octo-lib Conn already covered: `ctx.GetRedisConn()` goes through octo-lib's own `redis.NewWithOptions` which internally calls `wrapClient`. The shared data-plane Conn was already instrumented before this PR.
- main.go comment refresh: The updated coverage comment (lines 241-246) accurately reflects the new state: control-plane clients are now instrumented via the chokepoint constructors.
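The memory note above follows directly from how a mutex-guarded idempotence map behaves. The sketch below is a simplified model, not octo-lib's code: `Client` is a stand-in type, `hooks` counts hook installations, and `retained` is a helper added only to make the retention visible.

```go
package main

import "sync"

// Client is a stand-in for *rd.Client; hooks counts how many times a hook was installed.
type Client struct{ hooks int }

var (
	instrumentedMu sync.Mutex
	instrumented   = map[*Client]struct{}{}
)

// Instrument installs the hook at most once per client and is nil-safe.
// The map keeps a strong reference to every client it has seen, so an
// instrumented client is never garbage collected for the life of the process.
func Instrument(c *Client) {
	if c == nil {
		return
	}
	instrumentedMu.Lock()
	defer instrumentedMu.Unlock()
	if _, done := instrumented[c]; done {
		return // already instrumented: repeated calls are no-ops
	}
	instrumented[c] = struct{}{}
	c.hooks++
}

// retained reports how many clients the guard is currently pinning.
func retained() int {
	instrumentedMu.Lock()
	defer instrumentedMu.Unlock()
	return len(instrumented)
}
```

With startup singletons the map stays at a small fixed size. A per-request constructor would add one entry per request and never release it, which is the leak the reviewers warn about.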
6. Verdict
No P0 or P1 findings. Two P2 nits (options defensive copy, test nil-check asymmetry); both are minor hygiene concerns with no production impact.
[Octo-Q] verdict: APPROVE. Clean, mechanical observability improvement. All production redis clients now feed dependency="redis" through a single chokepoint. Instrument-before-share contract honored. Idempotency guard functional.
OctoBoooot
left a comment
Review: feat(redis): instrument raw redis clients via octoredis chokepoint constructor (#495)
Verdict: Approve with comments. Clean, uniform instrumentation refactor; the chokepoint claim and the pool-stats interaction both byte-verify. No blockers, no majors.
CI is red on check-sprint only (admin Sprint-field gate); Build ✅ and Vet ✅ pass, Test was still pending at review time. Posting COMMENT per the CI precondition; clean approve once Test goes green and the Sprint field is set.
Risk tier: high (pkg/wkhttp/ratelimit_helper.go: rate limiter). needs-human-review + size/M + dependencies-changed on the PR. Established author (100 merged, 0 closed-unmerged).
Verified: the load-bearing claims
- ✅ "No raw `rd.NewClient` redis site remains; grep confirms" (the PR's central thesis): independently grepped the whole repo. Zero raw redis-client constructions outside `pkg/redis`. The remaining `NewClient(` hits are different libraries (apns2 / elastic / sts / oidc / smtp). The four files that import go-redis but aren't in the diff (`bot_api/events.go`, `robot/api.go`, `incomingwebhook/ratelimit.go`, `pkg/metrics/pool.go`) all use `ctx.GetRedisConn()` (already-instrumented lib path) or receive a client by parameter; none bypass the chokepoint.
- ✅ Pool-stats coverage is NOT traded away for command-timing (the subtle interaction): `rlRedis` switched to `NewInstrumentedClient` (returns `*rd.Client`), and `main.go:249-250` still passes that same instance into `RegisterPoolCollectors`. So the ratelimit client now feeds both `dependency="redis"` timing AND pool stats; the instrumentation didn't displace the existing collector registration.
- ✅ `go.mod` is replace-free (the #463 local-`replace` trap): clean bump of octo-lib to the merged #104 commit (`...cff4d7a48f55` → `...0c34e6f108c4`), no `replace`/`file://`/`../` directives. Tests pass against the bumped lib; `go build` of the touched packages is clean.
- ✅ Both constructors instrument before returning: `InstrumentedClientFromOptions` calls `liboredis.Instrument(c)` before handing the client back, satisfying octo-lib's call-before-share contract; `NewWithOptions` (the `Conn` path) now routes through it too, so wrapped and raw clients are consistent.
- ✅ Test pins real behavior without live Redis: drives a GET at `127.0.0.1:1` so `WrapProcess` fires and the observer sees `get`, proving the hook is installed (not just that the constructor returns non-nil). Both constructors covered.
Minor
- `pkg/redis/instrument_test.go`: see inline. The two constructors are tested, but the architectural invariant the PR rests on (no raw `rd.NewClient` outside the chokepoint) isn't pinned by a source-guard test. The migration is correct today; a guard keeps a future site from silently reopening the blind spot, the exact regression §2 of the COMPREHENSION names.
Praise
- The chokepoint is the right shape, and the two-constructor split is deliberate: `NewInstrumentedClient(cfg, ...)` for the common case and `InstrumentedClientFromOptions(opts)` for the few sites (health probe) that pre-build their own `Options`. Both funnel through one `Instrument` call, so there's exactly one place instrumentation can be forgotten, and the `Conn` path was wired through it too rather than left as a second uninstrumented door.
- Keeping `rlRedis` registered with the pool collector after the switch: easy to miss that swapping the constructor could have dropped the client out of `RegisterPoolCollectors` and silently lost pool stats. The same instance feeds both metrics; that's the careful version of this change.
- The hook's safety envelope is stated and matches the dependency: panic-isolated, in-memory `Observe` only, low-cardinality (`cmd.Name()` only, no keys/args/scripts), idempotent + nil-safe per octo-lib#104. The cardinality note in particular is the thing that turns a metrics change into an incident if gotten wrong, and it's explicitly bounded.
Out of scope (informational)
- CI `check-sprint`: same admin Sprint-field gate as the rest of this cycle.
- The follow-up chain (octo-lib#96 raw-client coverage) is referenced correctly; this PR is the server-side wiring of the merged #104 export, appropriately scoped.
yujiawei
left a comment
Code Review: PR #495 (octo-server)
Scope reviewed: head SHA af60fa5317194cfee86f82fa75477716d5ccc65c, merge-base f438ff42. 22 files, +148/-44. This is a security-sensitive change (touches rate-limiters, OIDC token/state/bind stores, auth paths), so it was reviewed with extra care, including an independent build, race-enabled unit run, and cross-checks of the upstream octo-lib instrumentation contract.
1. Specification compliance
Spec: ✅
The stated goal (route every raw *redis.Client through a single instrumented chokepoint constructor) is fully and exactly implemented.
- Nothing missing. Every prior `rd.NewClient(octoredis.MustBuildOptions(...))` production site is migrated to `octoredis.NewInstrumentedClient(...)`: `main.go`, `pkg/wkhttp/ratelimit_helper.go`, and modules `bot_api`, `bot_provision`, `common/health`, `group`, `incomingwebhook`, `integration`, `oidc` (api/bind_store/logout_idtoken/state_store/sync_lock), `opanalytics/etl_lock`, `space`, `user` (×2), `usersecret`. A whole-repo scan for the old `rd.NewClient(...)` pattern in non-test code returns zero remaining sites. `health.go` correctly uses `InstrumentedClientFromOptions` for its pre-built options, and `pkg/redis/redis.go`'s `Conn` constructor now also routes through the instrumented path.
- Nothing extra. No new env vars, flags, config fields, or behavioral switches. Only two helper funcs (`NewInstrumentedClient`, `InstrumentedClientFromOptions`), two tests, the call-site migrations, the `octo-lib` bump, and a corrected coverage comment in `main.go`.
- No deviation. Each migrated client preserves its exact options (rate-limiters `MaxRetries=1`/`PoolSize=10`; OIDC stores `MaxRetries=3` + 3s dial/read/write; bot registry timeouts; health probe pool/timeouts). The `octo-lib` bump `20260626…` → `20260628015025-0c34e6f108c4` is the version that exposes `Instrument`; `go.mod`/`go.sum` verify clean and no stale version is referenced in code.
2. Code quality
Quality: Approved
Verified the load-bearing correctness and security properties:
- No double-instrumentation. The most important risk for this kind of "wrap everything" change is a client being timed twice (2× metrics, no runtime signal). It cannot happen here. `octo-lib`'s public `Instrument()` is idempotent via a mutex-guarded global map and is the only path `octoredis` uses; the lib's internal `wrapClient` (used by the lib's own `New`/`NewWithOptions` for the `ctx.GetRedisConn()` data plane) operates on freshly-constructed clients that never pass back through `Instrument()`. The control-plane and data-plane instrumentation paths never converge on the same `*rd.Client`.
- TLS not weakened. `NewInstrumentedClient` builds options via `MustBuildOptions`, which applies `TLSConfig` from `cfg.DB` and only then runs caller overrides (overrides cannot drop TLS). Behavior is identical to the pre-PR `rd.NewClient(MustBuildOptions(...))`.
- No fail-open regression. The instrumentation hook (`WrapProcess`) only observes timing and forwards the original error unchanged; `redis.Nil` is normalized for metrics only, not for the caller. Rate-limiter and OIDC fail-open/closed posture is unaffected.
- No high-cardinality / sensitive leakage. The observer reports only the lowercased command name (e.g. `get`, `eval`), with no keys, args, uids, or tokens, consistent with the `dependency`/`op`/`status` label contract.
- Lifecycle is sound. `Instrument()` pins clients in a process-global map (never GC'd), so the lib mandates startup-singleton usage only. Every migrated call site qualifies: constructed once in `New()`/`Route()` (run once at module setup) or behind `sync.Once`/mutex guards (`SharedUIDRateLimiter`, the `sharedRateRedis` helpers). No per-request or hot-loop construction.
- Tests + build. `go build ./...` passes; `go vet ./pkg/redis/...` clean; the new `pkg/redis/instrument_test.go` passes under `-race`. The tests cleverly prove the hook fires even when the command fails (unreachable `127.0.0.1:1`), which validates the plumbing without a live Redis.
Non-blocking notes (P2 / nits)
- P2 (test hygiene): `modules/user/api_authcode_token_redis_test.go:29` still constructs its client via the old `rd.NewClient(octoredis.MustBuildOptions(...))` pattern rather than the new chokepoint. Test-only, no production impact, but migrating it keeps the "single chokepoint" invariant honest and prevents the old pattern from being copied forward.
- P2 (observer restore in tests): `instrument_test.go` cleanup sets `SetRedisObserver(nil)` rather than capturing/restoring the prior observer. Fine today (no package-level default observer), but capture-and-restore would be more robust if a suite-level observer is ever introduced.
- Nit: the updated `main.go` coverage comment is accurate; consider a brief note that `RegisterPoolCollectors` still only registers the `ratelimit` pool by name (pre-existing; pool metrics for the other singletons are not individually registered, though command-latency now covers them).
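The capture-and-restore idea from the second P2 can be sketched as follows. This is a stand-in model: `swapObserver`, `withObserver`, and `emit` are hypothetical helpers, and nothing in the thread says octo-lib exposes a swap or getter, so a real test would need whatever accessor the library actually provides.

```go
package main

import "sync/atomic"

// Observer is a stand-in for the process-wide redis observer callback.
type Observer func(cmd string)

var current atomic.Pointer[Observer]

// swapObserver installs o (nil clears) and returns whatever was installed before.
func swapObserver(o Observer) Observer {
	var next *Observer
	if o != nil {
		next = &o
	}
	prev := current.Swap(next)
	if prev == nil {
		return nil
	}
	return *prev
}

// withObserver runs fn with o installed, then restores the previous observer
// instead of resetting to nil, so a suite-level observer survives the test.
func withObserver(o Observer, fn func()) {
	prev := swapObserver(o)
	defer swapObserver(prev)
	fn()
}

// emit models a command reaching the hook.
func emit(cmd string) {
	if o := current.Load(); o != nil {
		(*o)(cmd)
	}
}
```

In a real test the restore would more naturally live in `t.Cleanup`; the difference from `SetRedisObserver(nil)` only shows up once some other code has installed an observer first.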
3. Overall verdict
APPROVE. Spec is fully met with no scope creep, and the implementation preserves TLS, error semantics, and lifecycle correctness with no double-counting. Only P2/nit items remain, none blocking merge.
4. Items a human may wish to spot-check (security-sensitive)
- Confirm in a staging environment that `dependency="redis"` metrics for an OIDC/rate-limit path show a single command count per request (sanity check against the double-counting class of bug, even though static analysis rules it out).
- The `Instrument` global map intentionally retains client references for the process lifetime. That is acceptable given all sites are startup singletons; worth keeping in mind if any future caller constructs instrumented clients dynamically.
Address #495 review feedback in one batch:

- Source guard (OctoBoooot): TestNoRawRedisClientOutsideChokepoint walks the repo and fails if any non-test production file outside pkg/redis constructs a redis client via raw rd.NewClient/redis.NewClient. Pins the PR's core invariant (a future site can't silently reopen the dependency="redis" blind spot) the same way Test*NoLegacyResponseError pins the i18n invariant.
- Defensive copy (Octo-Q): InstrumentedClientFromOptions now copies *rd.Options before handing it to go-redis (which mutates defaults in place), matching octo-lib redis.NewWithOptions; protects against caller reuse of the same opts.
- Migrate the last old-pattern site (yujiawei): a user test still used rd.NewClient(octoredis.MustBuildOptions(...)); route it through the chokepoint too so the invariant holds in tests as well.
- Test nil-check symmetry (Octo-Q nit).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01HHTDHbMPAVkNvYdQkh2ts3
8f7bce0
OctoBoooot
left a comment
Review: feat(redis): instrument raw redis clients via octoredis chokepoint constructor (#495), delta @ 8f7bce0
Verdict: Approve. The delta pins the chokepoint invariant my prior minor asked for and folds in both peer nits. No blockers, no majors. CI check-sprint now passes (the only thing that held my prior verdict to COMMENT); Build/Test/Vet/Lint pending. Upgrading COMMENT → APPROVE.
Delta from prior COMMENT @ af60fa53: 1 commit, "test(redis): pin the chokepoint invariant + review nits".
Resolved (byte-verified)
- ✅ Source-guard test added (my minor): `chokepoint_guard_test.go` walks the repo and fails if `rd.NewClient(`/`redis.NewClient(` appears in any non-test `.go` outside `pkg/redis`. I adversarially confirmed it's not a no-op: planted a `rd.NewClient` in a temp `modules/zzfaketest/fake.go` → the test FAILS with the violation path; removed it → passes. Correctly skips the chokepoint dir + `_test.go`, and uses `\b` so `octoredis.NewClient(` (if ever added) won't false-match. This is the invariant the PR's whole thesis rests on, now regression-proof, modeled on the existing i18n `NoLegacyResponseError` guard.
- ✅ Defensive copy in `InstrumentedClientFromOptions` (mochashanyao 🟡): `local := *opts; rd.NewClient(&local)`. Correct: go-redis writes its defaults into the value fields, so the shallow copy prevents mutating a caller-reused `*rd.Options`; the shared `*tls.Config` is only read, not mutated, so shallow suffices. The comment cites parity with the lib's `NewWithOptions` handling.
- ✅ nil-check consistency (mochashanyao 🟡): both `Instrument` tests now fail fast on `c == nil` before use.
- ✅ One more test site migrated: `api_authcode_token_redis_test.go` switched its raw `rd.NewClient` → `NewInstrumentedClient`, which the guard doesn't even require (it excludes `_test.go`). Consistency beyond what's enforced.
Full pkg/redis suite + build of touched packages pass.
Praise
- The guard test is the right way to make "single chokepoint" durable. A prose claim that no raw client remains is true at one SHA; a source-walking test makes it true at every future SHA, and it fails loudly with the exact file path and the remediation in the message. Verified it actually catches a violation rather than just passing on the clean tree, which is the difference between a guard and a decoration.
- The defensive copy matches the lib's own contract instead of inventing a new one. Pointing the comment at `octo-lib redis.NewWithOptions`'s identical handling means the two constructors now have the same options-aliasing semantics, with no surprise where one mutates the caller's struct and the other doesn't.
Out of scope (informational)
- Build/Test/Vet/Lint pending at review time; the delta is a self-contained guard test + a one-line defensive copy + test nits. `pkg/redis` is green locally.
- Minor scope note (not a finding): the guard excludes `_test.go`, so a test could still build an uninstrumented raw client. That is acceptable, since tests don't emit production metrics, and the author migrated the one such site anyway.
The PR is in scope and correctly routes Redis control-plane clients through an instrumentation chokepoint.
🔴 Blocking: None.
💬 Non-blocking:
- 🟡 Warning: pkg/redis/options.go uses `liboredis.Instrument`, whose idempotency guard retains client references globally. Current migrated call sites are process-lived, so this is acceptable, but future short-lived uses of this public constructor could leak clients. Consider documenting "startup/singleton clients only" more explicitly or exposing a no-retain constructor from the library later.
- 🔵 Suggestion: pkg/redis/redis.go still mutates caller-owned `opts.MaxRetries` before the new copying constructor runs. This was existing behavior, but it now differs from the updated library wrapper's copy-before-default pattern. Copying first would make the wrapper semantics cleaner.
- 🔵 Suggestion: pkg/redis/chokepoint_guard_test.go only catches `rd.NewClient` and `redis.NewClient` aliases. That matches current code, but an AST/import-path based guard would be harder to bypass accidentally.

✅ Highlights:
- The relevance gate passes: Redis observability is part of this backend's runtime behavior.
- The central constructors preserve Redis config/TLS flow and install hooks before returning clients.
- The hook-install tests cover both new construction paths without requiring a live Redis.
- `go test ./pkg/redis/...` passed. `go test ./pkg/redis/... ./modules/common/...` failed in `modules/common` due to an existing local test database migration-state issue, not this patch path.
OctoBoooot
left a comment
Concurring with @Jerry-Xin's 🟡#2: byte-verified, and it corrects an overstatement in my APPROVE. The defensive copy is real but one frame too late on the Conn path:
- `InstrumentedClientFromOptions` now does `local := *opts` before `rd.NewClient` ✅.
- But `NewWithOptions` (redis.go:31-33) mutates the caller's struct in place (`opts.MaxRetries = 3`) before calling `InstrumentedClientFromOptions`. So a caller passing a shared `*rd.Options` to `NewWithOptions` still gets `MaxRetries` written under them; the copy guards the go-redis-internal writes but not this pre-copy mutation.

I praised the copy as giving "the same options-aliasing semantics" as the lib path; that's not accurate for NewWithOptions, which still mutates the input. My mistake.
Reachability today: both callers pass a freshly-built Options (&rd.Options{...} literal; MustBuildOptions(cfg) allocates fresh), and MaxRetries=3 is benign, so no live bug, hence 🟡 not a defect. Fix is one line, copy first, then default: local := *opts; if local.MaxRetries == 0 { local.MaxRetries = 3 }; return &Conn{client: InstrumentedClientFromOptions(&local)} (and InstrumentedClientFromOptions can keep its own copy or take the pre-copied value).
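The ordering point can be made concrete with stand-in types. In this sketch `Options`, `Conn`, and `initDefaults` are simplified placeholders (the last one models a client constructor that fills defaults into the struct it is given); only the copy-then-default order is the thing being illustrated.

```go
package main

// Options is a stand-in for rd.Options with just the fields this sketch needs.
type Options struct {
	Addr       string
	MaxRetries int
}

// Conn is a stand-in wrapper that remembers the options it was built with.
type Conn struct{ opts Options }

// initDefaults models a constructor that writes defaults into the struct it receives.
func initDefaults(o *Options) {
	if o.Addr == "" {
		o.Addr = "localhost:6379"
	}
}

// NewWithOptions copies first and only then applies its own default, so neither
// the wrapper's default nor the constructor's defaults reach the caller's struct.
func NewWithOptions(opts *Options) *Conn {
	local := *opts // shallow copy: value fields are now private to this call
	if local.MaxRetries == 0 {
		local.MaxRetries = 3
	}
	initDefaults(&local)
	return &Conn{opts: local}
}
```

If the `MaxRetries` default were applied to `opts` before the copy, a caller reusing one `*Options` for two constructors would see the first call's default leak into the second.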
🟡#1 (document Instrument call-before-share / client lifecycle) and 🟡#3 (harden the guard with import-path/AST matching instead of the \b(rd|redis)\.NewClient regex, which a differently-aliased import could slip past) are both fair non-blocking improvements. Verdict unchanged: APPROVE stands.
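A rough sketch of what import-path matching could look like using only the standard `go/parser` and `go/ast` packages. The function name and the two path prefixes in `isRedisPath` are assumptions for illustration; which go-redis import path the repository actually uses is not stated in this thread. The check is syntactic, so a local variable that shadows the import alias would produce a false positive.

```go
package main

import (
	"go/ast"
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

// isRedisPath reports whether an import path looks like go-redis (assumed prefixes).
func isRedisPath(p string) bool {
	return strings.HasPrefix(p, "github.com/go-redis/redis") ||
		strings.HasPrefix(p, "github.com/redis/go-redis")
}

// callsRedisNewClient reports whether src calls NewClient on a go-redis import,
// whatever alias the file gives it (including a dot import).
func callsRedisNewClient(filename, src string) (bool, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, filename, src, 0)
	if err != nil {
		return false, err
	}
	aliases := map[string]bool{}
	dotImport := false
	for _, imp := range file.Imports {
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil || !isRedisPath(path) {
			continue
		}
		switch {
		case imp.Name == nil:
			aliases["redis"] = true // go-redis declares package redis
		case imp.Name.Name == ".":
			dotImport = true
		case imp.Name.Name != "_":
			aliases[imp.Name.Name] = true
		}
	}
	found := false
	ast.Inspect(file, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		switch fn := call.Fun.(type) {
		case *ast.SelectorExpr:
			if id, ok := fn.X.(*ast.Ident); ok && aliases[id.Name] && fn.Sel.Name == "NewClient" {
				found = true
			}
		case *ast.Ident:
			if dotImport && fn.Name == "NewClient" {
				found = true
			}
		}
		return !found
	})
	return found, nil
}
```

Compared with the regex, this keys on what the file imports rather than on what the alias happens to be called, at the cost of parsing every file.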
yujiawei
left a comment
Code Review: PR #495 (octo-server)
Reviewed at head SHA 8f7bce0af322de3723e716f1f88e5f41fb227b75 against merge-base f438ff42. Build (go build ./...), go vet ./pkg/redis/..., and the new pkg/redis tests all pass locally. This is the follow-up that closes the dependency="redis" coverage gap left by the earlier dependency-metrics work: the ~15 bare rd.NewClient(...) control-plane clients (rate limiters, OIDC stores/locks, bot registry, auth, health probe, ETL lock) were previously invisible to the redis latency metric.
1. Spec compliance
Spec: ✅
- Missing work: none. Two chokepoint constructors are added (`NewInstrumentedClient`, `InstrumentedClientFromOptions`), every previously-bare `rd.NewClient(octoredis.MustBuildOptions(...))` site is routed through them, a source guard test pins the invariant, two tests prove the hook actually fires, and the `octo-lib` dependency is bumped to the version exposing `Instrument()`.
- Over-build: none. No new flags, fields, or endpoints beyond the stated goal.
- Deviation: none. TLS is still funnelled through `BuildOptions`; instrumentation happens at construction time, before the client is shared, satisfying `octo-lib`'s "instrument before share/clone" contract. The one `WithContext(ctx)` clone (health readiness probe, `modules/common/health.go:175`) is correct because the original client is instrumented at construction and go-redis v6 `clone()` copies the wrapped `process` field.
The instrument tests are the key strength here: they assert a real GET reaches the observer rather than just asserting the constructor returns non-nil, so this PR demonstrably does what its commit message claims.
2. Code quality
Quality: Approved
No P0/P1 issues. The defensive shallow-copy in InstrumentedClientFromOptions (pkg/redis/options.go:78) is correct: rd.NewClient calls opt.init() which mutates the options in place, so copying before construction is the right call and matches octo-lib's own pattern.
The following are non-blocking P2 suggestions (consider for a follow-up; none should hold up merge):
- P2: exported `Conn` path can pin clients in a process-global map. `pkg/redis/redis.go:36` now routes `NewWithOptions` through `InstrumentedClientFromOptions`, which calls the public `liboredis.Instrument()`. That registers the client in `octo-lib`'s never-evicted `instrumented` map. octo-lib deliberately avoids this for its own `Conn` by using the internal `wrapClient`. This is harmless today: `octoredis.NewWithOptions`/`octoredis.New`/`pkg/db.NewRedis*` have zero production callers (verified across the repo), and all 19 bare-client sites are startup singletons (the `SharedUIDRateLimiter` path is additionally guarded by a once-flag, so its 22 call sites build a single client). It only becomes a leak if a future caller uses the exported `Conn` constructor for short-lived clients. Worth a doc-comment on `NewWithOptions` warning it's for process-lifetime clients only.
- P2: `NewWithOptions` doesn't fully mirror octo-lib's copy/nil semantics. It mutates the caller's `*rd.Options` (`MaxRetries`) before the copy and panics on a nil `opts`, whereas octo-lib copies first and tolerates nil. The new `InstrumentedClientFromOptions` itself also dereferences `opts` without a nil check (`pkg/redis/options.go:78`). An `if opts == nil { return nil }` guard (or explicit panic message) would make the exported helper friendlier. Low impact since current callers always pass non-nil.
- P2: source guard has evasion blind spots. `pkg/redis/chokepoint_guard_test.go:23` matches only `rd.NewClient(`/`redis.NewClient(`. A differently-aliased import (`rdb.NewClient`, dot import) or a different constructor (`NewFailoverClient`, `NewClusterClient`, `NewRing`, `NewUniversalClient`) would silently bypass both the guard and the instrumentation. None of those constructors exist in the repo today, so this is forward-looking hardening only: broaden the regex (e.g. `\.New(Client|FailoverClient|ClusterClient|Ring|UniversalClient)\(`) if Sentinel/Cluster support is ever added.
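One possible shape for the broadened pattern, offered as a sketch. It keeps an alias prefix in front of the constructor list because, as an earlier review in this thread notes, the repo also contains `NewClient(` calls from unrelated libraries (apns2, elastic, and others), which an unanchored `\.New(Client|...)\(` would presumably match as well. The `rdb` alias is an assumed example, not one known to be used in the repo.

```go
package main

import "regexp"

// guardRe flags the go-redis constructors named in the review when called
// through one of the listed aliases. The \b prevents octoredis.NewClient( and
// similar longer identifiers from matching.
var guardRe = regexp.MustCompile(
	`\b(rd|rdb|redis)\.New(Client|FailoverClient|ClusterClient|Ring|UniversalClient)\(`)

// flagged reports whether a source line contains a raw constructor call.
func flagged(line string) bool {
	return guardRe.MatchString(line)
}
```

An alias list still has to be kept in sync by hand, which is the argument for the import-path approach other reviewers raised.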
3. Security notes (security-sensitive PR)
The instrumented clients sit on auth-adjacent paths (login/OIDC rate limiters, OIDC state/bind/logout stores, token setter). The change is observe-only: WrapProcess times the command and forwards to the observer without altering the returned error, and redis.Nil is normalised to non-error so cache misses don't pollute error rates. No behavioural change to rate-limiting or auth logic, no secrets touched, TLS config preserved through BuildOptions. Nothing here requires manual security sign-off beyond confirming the metric wiring, which the tests cover.
4. Verdict
Spec and Quality approved. Verdict: APPROVE. The three P2s are polish, not blockers.
mochashanyao
left a comment
[Octo-Q Β· automated review]
Verdict: Approve, no blocking findings; notes below (data-flow traced).
octo-server PR#495 Review Report
Reviewer: Octo-Q (automated review)
PR: #495
Head SHA: 8f7bce0af322de3723e716f1f88e5f41fb227b75
Base: f438ff42 (main)
Scope: 24 files, +235 / -46
1. Summary
This PR introduces octoredis.NewInstrumentedClient and octoredis.InstrumentedClientFromOptions as a chokepoint constructor for raw *rd.Client instances, ensuring all ~15 previously uninstrumented call sites feed into dependency="redis" metrics. The change is a mechanical replacement of rd.NewClient(octoredis.MustBuildOptions(...)) → octoredis.NewInstrumentedClient(...). It also adds a source-code guard test (TestNoRawRedisClientOutsideChokepoint) and instrumentation smoke tests.
2. Verification Conclusions
| Item | Status | Evidence |
|---|---|---|
| liboredis.Instrument exists in pinned octo-lib version | ✓ | octo-lib@v0.0.0-20260628015025-0c34e6f108c4/pkg/redis/instrument.go: full implementation with mutex-guarded idempotence map |
| Instrument is idempotent (no double-wrap) | ✓ | instrumentedMu sync.Mutex + instrumented = map[*rd.Client]struct{}{}; checked before wrapClient; TestInstrumentIdempotent confirms |
| No double-instrumentation on ctx.GetRedisConn() path | ✓ | octo-lib's Conn creates its own *rd.Client via internal NewWithOptions → Instrument; octo-server's Conn creates independent *rd.Client via InstrumentedClientFromOptions: two separate client instances, never cross-wrapped |
| Shallow copy (local := *opts) is safe for go-redis v6 | ✓ | go-redis v6 NewClient only mutates value-type fields via opt.init(); no pointer/slice mutation. Consistent with octo-lib's identical pattern |
| SetRedisObserver is process-global atomic singleton | ✓ | atomic.Pointer[RedisObserver]: Store/Load, panic-recovery in reportRedis, tested |
| All 15+ call sites migrated consistently | ✓ | Each call site: rd.NewClient(octoredis.MustBuildOptions(cfg, func(o *rd.Options){...})) → octoredis.NewInstrumentedClient(cfg, func(o *rd.Options){...}). Same overrides preserved. |
| Only one rd.NewClient remains in production code (inside chokepoint) | ✓ | pkg/redis/options.go:79: correctly excluded by guard test |
| Guard test regex covers known aliases | ✓ | `\b(rd |
| InstrumentedClientFromOptions used correctly in health.go | ✓ | readinessRedisOptions returns a fresh *rd.Options from MustBuildOptions: no shared pointer risk |
| NewWithOptions in pkg/redis/redis.go routes through chokepoint | ✓ | Creates independent *rd.Client, single wrap. Production callers: dead code only (pkg/db/redis.go has 0 callers) |
3. Findings
P2: NewWithOptions mutates caller's *rd.Options before copy
File: pkg/redis/redis.go:36-38
Severity: P2 (maintainability / side-effect hygiene)
Diff-scope: pre-existing behavior, unchanged by this PR (was already mutating via direct rd.NewClient(opts))

    func NewWithOptions(opts *rd.Options) *Conn {
        if opts.MaxRetries == 0 {
            opts.MaxRetries = 3 // mutates caller's struct
        }
        return &Conn{client: InstrumentedClientFromOptions(opts)} // copies here
    }

InstrumentedClientFromOptions does a defensive copy (local := *opts), but opts.MaxRetries = 3 runs on the original struct before the copy. octo-lib's equivalent copies first, then mutates. Currently not reachable in production (the only caller path, pkg/db/redis.go, has 0 callers), but if future code reuses the same *rd.Options, MaxRetries would be silently changed.
Suggestion: Move the mutation after the copy:

    func NewWithOptions(opts *rd.Options) *Conn {
        local := *opts
        if local.MaxRetries == 0 {
            local.MaxRetries = 3
        }
        return &Conn{client: InstrumentedClientFromOptions(&local)}
    }

P2: Guard test regex does not cover struct-literal rd.Client{} construction
File: pkg/redis/chokepoint_guard_test.go:32
Severity: P2 (test coverage gap, low risk)
Diff-scope: new code (the guard itself is introduced by this PR)
The regex \b(rd|redis)\.NewClient\( catches the standard constructor path but would miss &rd.Client{} or rd.Options{...} literal construction. In practice go-redis v6 doesn't expose a useful zero-value Client, so the bypass risk is theoretical. But if someone wraps &rd.Client{} for testing or edge cases, the guard won't catch it.
Suggestion: Consider adding &\s*(rd|redis)\.Client\{ to the regex as defense-in-depth.
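As a rough check of the suggested addition, the sketch below combines the constructor pattern with the struct-literal pattern. `bypassesChokepoint` and the sample lines are hypothetical and only illustrate how the two expressions behave; they are not taken from the guard test.

```go
package main

import (
	"fmt"
	"regexp"
)

// constructorCall is the constructor pattern quoted in the finding.
var constructorCall = regexp.MustCompile(`\b(rd|redis)\.NewClient\(`)

// structLiteral is the suggested addition for address-of struct literals.
var structLiteral = regexp.MustCompile(`&\s*(rd|redis)\.Client\{`)

// bypassesChokepoint reports whether a source line builds a client directly.
func bypassesChokepoint(line string) bool {
	return constructorCall.MatchString(line) || structLiteral.MatchString(line)
}

func main() {
	for _, line := range []string{
		`c := rd.NewClient(opts)`,
		`c := &rd.Client{}`,
		`var c *rd.Client`,
		`c := octoredis.NewInstrumentedClient(cfg)`,
	} {
		fmt.Printf("%-45s flagged=%v\n", line, bypassesChokepoint(line))
	}
}
```

Plain type references such as `var c *rd.Client` are left alone, so the extra pattern should not produce false positives on signatures and field declarations.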
No P0 or P1 findings.
4. Data-Flow Trace
| Consumed Data | Upstream Source | Runtime Flow Verified |
|---|---|---|
| *rd.Client returned by NewInstrumentedClient | rd.NewClient(&local) inside InstrumentedClientFromOptions → liboredis.Instrument(c) wraps WrapProcess | ✓ WrapProcess installed before client is returned to caller; all subsequent commands flow through hook → reportRedis → SetRedisObserver callback |
| *rd.Client returned by InstrumentedClientFromOptions (health probe) | readinessRedisOptions → MustBuildOptions(cfg, overrides) returns fresh *rd.Options → defensive copy → rd.NewClient → Instrument | ✓ single wrap, no shared pointer |
| Observer callback (metrics.ObserveRedisCmd) | Set once at main.go:248 via libredis.SetRedisObserver → stored in atomic.Pointer | ✓ every instrumented client's WrapProcess calls reportRedis which Loads the pointer; panic-recovery in reportRedis prevents observer crash from taking down Redis calls |
| TLS config in instrumented clients | MustBuildOptions → BuildOptions → liboredis.BuildTLSConfig when cfg.DB.RedisTLS == true → stored in opts.TLSConfig | ✓ defensive copy shares TLS pointer (read-only by go-redis v6 init()), TLS config correctly applied |
| Conn wrapping in pkg/redis/redis.go | NewWithOptions → InstrumentedClientFromOptions → fresh *rd.Client + single Instrument | ✓ independent from ctx.GetRedisConn()'s client; no double-wrap |
5. Blind-Point Checklist (R5)
C1: Dual-path parity: N/A. No symmetric add/remove or subscribe/unsubscribe paths touched. All changes are parallel mechanical substitutions of the same pattern.
C2: Control-flow ordering / nested reuse: Clear. Instrument is idempotent (mutex + map guard), so even if accidentally called twice on the same client, no double-wrap occurs. WrapProcess hook ordering: installed once at construction, before any commands flow. No nested or re-entrant call sites.
C3: Authorization boundary vs. capability boundary: N/A. No permission/jail/tool exposure changes. This is a metrics instrumentation PR.
C4: Authorization lifecycle / container-member state cascade: N/A. No auth changes.
C5: Build vs. runtime path: Clear. The changes are pure Go source-level substitutions. No build artifacts, extensions, CLI tools, relative paths, or environment-dependent behavior introduced. The go.mod bump to octo-lib is a standard dependency update; the new Instrument / SetRedisObserver functions exist in the pinned version.
C6: Governance / policy / security document self-consistency: N/A. No documentation or policy changes.
6. Cross-Round Blocker Recheck (R6)
N/A β first review of this PR.
[Octo-Q] verdict: APPROVE. This PR cleanly closes the dependency="redis" observability gap by routing all raw *rd.Client construction through an instrumented chokepoint. The implementation is mechanically correct: Instrument is idempotent, no double-instrumentation exists (verified by tracing both ctx.GetRedisConn() and bare-client paths), the shallow copy is safe for go-redis v6, and the guard test effectively prevents regression. Two P2 suggestions (pre-copy mutation in NewWithOptions, guard regex coverage for struct literals) are non-blocking improvements.
Summary
Closes the control-plane coverage gap in the dependency="redis" latency metric. octo-lib#104 (merged) exported redis.Instrument(*rd.Client); this PR wires it in at a single chokepoint so the ~18 raw rd.NewClient(...) instances (rate limiters, OIDC locks, auth, health, etc.) feed dependency="redis" instead of being blind to it (only pool stats covered them before).

Related Issue
Addresses the primary raw-client follow-up in octo-lib#96. Builds on octo-lib#104 (merged) and octo-server #474.

Linked Spec
octo-lib#104 (redis.Instrument export) + #442 DependencyMetrics. Reviewers on #474 (yujiawei / OctoBoooot / Octo-Q) flagged the control-plane redis clients as an alerting blind spot; this is the fix.

Changes
- pkg/redis/options.go: add NewInstrumentedClient(cfg, overrides...) and InstrumentedClientFromOptions(opts): build a raw *rd.Client and call liboredis.Instrument before returning, so commands feed dependency="redis". Instrumentation at construction satisfies octo-lib's call-before-share contract.
- pkg/redis/redis.go: octoredis's own Conn (NewWithOptions) routes through InstrumentedClientFromOptions too.
- Migrated sites: main.go (rlRedis), pkg/wkhttp/ratelimit_helper.go, modules/{oidc×5, user×2, group, space, integration, incomingwebhook, usersecret, opanalytics, bot_api, bot_provision}, and modules/common/health.go (the readinessRedisOptions path).
- main.go: refresh the coverage comment (control-plane clients are now instrumented).
- go.mod: bump octo-lib to the merged #104 commit.
- pkg/redis/instrument_test.go proves both constructors install the hook (drives a command at a dead address so the hook fires without a live Redis).

Mechanical, uniform change: every site swaps rd.NewClient(octoredis.MustBuildOptions(cfg, fn)) → octoredis.NewInstrumentedClient(cfg, fn). No behavior change beyond the added timing hook; pool params, TLS, and lifecycle are unchanged.

Testing
COMPREHENSION
What it does to the load-bearing path: routes every raw control-plane redis client through a constructor that installs octo-lib's per-command timing hook at build time. At runtime each command on those clients now also invokes the observer: one Observe(dur) into the existing DependencyMetrics. No change to command results, errors, pooling, or TLS.

What could break: the hook runs on every command of the affected clients. It's panic-isolated in octo-lib (defer recover()), does only an in-memory Observe, and Instrument is idempotent + nil-safe (octo-lib#104), so double-counting/leaks are guarded. The one contract to honor, instrument before the client is shared, is satisfied because the constructor instruments before returning the client. Labels stay low-cardinality (cmd.Name() only; no keys/args/scripts reach the observer). Risk if a future site bypasses the chokepoint and calls rd.NewClient directly: it'd be uninstrumented again (mitigated: no such sites remain; grep confirms).

How I know it works: TestNewInstrumentedClientInstruments / TestInstrumentedClientFromOptionsInstruments register an observer and confirm a command on a constructor-built client reaches it (using a dead-address command so the hook fires without a live Redis). go build / vet / -race test / golangci-lint all green; grep confirms zero remaining rd.NewClient(octoredis.MustBuildOptions(...)) sites.

Checklist
🤖 Generated with Claude Code