feat(redis_interface): instrument Redis commands with external service call event emission #13091
Draft
spritianeja03 wants to merge 13 commits into
…ission in dev configs
…ia track_redis_call
Type of Change
Description
This PR instruments every Redis command in both the `fred` and `redis_rs` backends with per-call `ExternalServiceCall` event emission, making Redis latency and errors visible through the same observability pipeline used for key-manager calls.

Key Changes

`crates/redis_interface/src/observability.rs` (new)
- `observe()` async helper wraps each Redis command future, records wall-clock latency, and emits an `ExternalServiceCall` event (`service_name: "redis"`, `endpoint: <command name>`)
- `is_enabled()` fast-path guard: when the emitter is a no-op (emission disabled), the request-ID lookup and event construction are skipped entirely. Redis is the hottest call path, and the guard keeps overhead near zero
- No event is emitted when `REQUEST_ID` is absent (background workers, drainer, scheduler), since there is no API request to correlate to

`crates/router_env/src/request_context.rs` (new)
- `REQUEST_ID` tokio task-local propagates the current request ID from the actix `RequestIdentifier` middleware into deeply nested async code without threading it through every call site
- `try_get()` returns `None` outside a request scope (background work), which the `observe()` helper treats as "no emission needed"

Both command modules updated (`crates/redis_interface/src/module/fred/commands.rs`, `crates/redis_interface/src/module/redis_rs/commands.rs`)
- `observed!()` macro (sugar for `observe(self, "CMD", { body }).await`)

Additional Changes
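For reviewers, the behaviour described for `observe()` can be approximated by the minimal sketch below. It is not the code in this PR: `TestEmitter`, the `request_id`, `latency`, and `success` field names, the explicit `request_id` parameter (the PR reads it from the `REQUEST_ID` task-local instead), and the tiny `block_on` executor are all assumptions made so the example runs with the standard library alone.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::time::{Duration, Instant};

/// Event shape assumed for this sketch; only `service_name` and `endpoint`
/// are named in the PR description.
#[derive(Debug, Clone)]
pub struct ExternalServiceCall {
    pub service_name: &'static str,
    pub endpoint: &'static str,
    pub request_id: String,
    pub latency: Duration,
    pub success: bool,
}

/// Hypothetical emitter that records events in memory.
pub struct TestEmitter {
    pub enabled: bool,
    pub events: RefCell<Vec<ExternalServiceCall>>,
}

impl TestEmitter {
    pub fn new(enabled: bool) -> Self {
        Self { enabled, events: RefCell::new(Vec::new()) }
    }
    pub fn is_enabled(&self) -> bool {
        self.enabled
    }
    pub fn emit(&self, event: ExternalServiceCall) {
        self.events.borrow_mut().push(event);
    }
}

/// Wraps a command future, timing it and emitting one event per call.
pub async fn observe<T, E, F>(
    emitter: &TestEmitter,
    request_id: Option<&str>,
    command: &'static str,
    fut: F,
) -> Result<T, E>
where
    F: Future<Output = Result<T, E>>,
{
    // Fast path: with emission disabled, run the command and do nothing else.
    if !emitter.is_enabled() {
        return fut.await;
    }
    let started = Instant::now();
    let result = fut.await;
    // Outside a request scope there is nothing to correlate to, so skip emission.
    if let Some(id) = request_id {
        emitter.emit(ExternalServiceCall {
            service_name: "redis",
            endpoint: command,
            request_id: id.to_owned(),
            latency: started.elapsed(),
            success: result.is_ok(),
        });
    }
    result
}

/// Waker that does nothing; enough for futures that never return `Pending`.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal polling executor so the sketch runs without an async runtime.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        std::thread::yield_now();
    }
}
```

A call such as `block_on(observe(&emitter, Some("req-1"), "GET", async { Ok::<u32, String>(42) }))` returns the command's own result unchanged. The event is a side effect only, so a disabled emitter or a missing request ID never alters what the caller sees.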
Motivation and Context
Redis is called on virtually every payment and storage operation, yet its latency contribution is invisible: it appears as undifferentiated time in request spans. This PR makes each Redis command emit a discrete event carrying the command name, latency, success status, and request ID. Events flow through the same `ExternalServiceCall` Kafka topic introduced for key-manager instrumentation and land in ClickHouse, where per-command latency histograms, error rates, and per-merchant Redis usage patterns can be queried.

How did you test it?
- `cargo test -p redis_interface`: all existing tests pass
- `is_enabled()` fast-path guard prevents any allocation when emission is disabled (the no-op emitter returns `false` without acquiring the task-local)

Checklist
- `cargo +nightly fmt --all`
- `cargo clippy`

Closes #13090
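Reviewer note: the fast-path claim under "How did you test it?" (no request-ID lookup when emission is disabled) could be checked along the lines of the sketch below. It is a simplification under stated assumptions, not the PR's code: a `thread_local!` stands in for the tokio task-local (a thread-local does not follow a task that migrates between threads), and `scope`, `lookups`, and `request_id_for_event` are hypothetical helpers introduced only for illustration.

```rust
use std::cell::{Cell, RefCell};

thread_local! {
    // Stand-in for the `REQUEST_ID` task-local (assumption: thread-local here).
    static REQUEST_ID: RefCell<Option<String>> = RefCell::new(None);
    // Counts lookups so the fast path can be observed from a test.
    static LOOKUPS: Cell<u32> = Cell::new(0);
}

/// Runs `f` with the request ID set, restoring the previous value afterwards.
pub fn scope<R>(id: &str, f: impl FnOnce() -> R) -> R {
    struct Restore(Option<String>);
    impl Drop for Restore {
        fn drop(&mut self) {
            let previous = self.0.take();
            REQUEST_ID.with(|slot| *slot.borrow_mut() = previous);
        }
    }
    let previous = REQUEST_ID.with(|slot| slot.borrow_mut().replace(id.to_owned()));
    let _restore = Restore(previous);
    f()
}

/// Returns the current request ID, or `None` outside a request scope.
pub fn try_get() -> Option<String> {
    LOOKUPS.with(|n| n.set(n.get() + 1));
    REQUEST_ID.with(|slot| slot.borrow().clone())
}

/// Number of `try_get` calls made on this thread so far.
pub fn lookups() -> u32 {
    LOOKUPS.with(|n| n.get())
}

/// Request ID an event would carry, or `None` when nothing should be emitted.
pub fn request_id_for_event(emitter_enabled: bool) -> Option<String> {
    // Fast path: a disabled emitter returns before touching the request ID.
    if !emitter_enabled {
        return None;
    }
    try_get()
}
```

Asserting that `lookups()` is unchanged after `request_id_for_event(false)` gives a direct check that the guard runs before the lookup, rather than relying on the existing tests continuing to pass.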