
feat(redis_interface): instrument Redis commands with external service call event emission#13091

Draft
spritianeja03 wants to merge 13 commits into main from capture-redis-calls

@spritianeja03 spritianeja03 commented Jun 30, 2026

Contributor

Type of Change

  • Bugfix
  • New feature
  • Enhancement
  • Refactoring
  • Dependency updates
  • Documentation
  • CI/CD

Description

This PR instruments every Redis command in both the fred and redis_rs backends with per-call ExternalServiceCall event emission, making Redis latency and errors visible through the same observability pipeline used for key-manager calls.

Key Changes

  1. crates/redis_interface/src/observability.rs (new)

    • observe() async helper wraps each Redis command future, records wall-clock latency, and emits an ExternalServiceCall event (service_name: "redis", endpoint: <command name>)
    • is_enabled() fast-path guard: when the emitter is a no-op (emission disabled), the request-ID lookup and event construction are skipped entirely. Redis is the hottest call path, so the guard keeps overhead near zero when emission is off
    • No event is emitted when REQUEST_ID is absent (background workers, drainer, scheduler) since there is no API request to correlate to
  2. crates/router_env/src/request_context.rs (new)

    • REQUEST_ID tokio task-local propagates the current request ID from the actix RequestIdentifier middleware into deeply nested async code without threading it through every call site
    • try_get() returns None outside a request scope (background work), which the observe() helper treats as "no emission needed"
  3. Both command modules updated (crates/redis_interface/src/module/fred/commands.rs, crates/redis_interface/src/module/redis_rs/commands.rs)

    • Every public async command method now wraps its body with the observed!() macro (sugar for observe(self, "CMD", { body }).await)
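
The three pieces above could fit together roughly as in the following sketch. It is not the PR's code. The `ExternalServiceCall` fields, the `EventEmitter` trait, `MemoryEmitter`, and the argument list of `observed!` are assumptions made for illustration. The `request_id` closure stands in for the `REQUEST_ID` task-local's `try_get()`, and `block_on` is a toy executor included only so the example runs without a tokio runtime.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::time::{Duration, Instant};

/// Assumed event shape; the real `ExternalServiceCall` type is not shown in this PR text.
#[derive(Debug, Clone)]
pub struct ExternalServiceCall {
    pub service_name: &'static str,
    pub endpoint: &'static str,
    pub request_id: String,
    pub latency: Duration,
    pub success: bool,
}

/// Hypothetical emitter interface exposing the fast-path guard.
pub trait EventEmitter {
    fn is_enabled(&self) -> bool;
    fn emit(&self, event: ExternalServiceCall);
}

/// In-memory emitter used only for this demonstration.
pub struct MemoryEmitter {
    pub enabled: bool,
    pub events: Mutex<Vec<ExternalServiceCall>>,
}

impl EventEmitter for MemoryEmitter {
    fn is_enabled(&self) -> bool {
        self.enabled
    }
    fn emit(&self, event: ExternalServiceCall) {
        self.events.lock().unwrap().push(event);
    }
}

/// Wraps a command future, times it, and emits one event per call.
/// `request_id` is a stand-in for the task-local lookup (assumption).
pub async fn observe<M, R, F, T, E>(
    emitter: &M,
    request_id: R,
    command: &'static str,
    fut: F,
) -> Result<T, E>
where
    M: EventEmitter,
    R: FnOnce() -> Option<String>,
    F: Future<Output = Result<T, E>>,
{
    // Fast path: with emission disabled, skip timing, lookup, and event construction.
    if !emitter.is_enabled() {
        return fut.await;
    }
    let started = Instant::now();
    let result = fut.await;
    let latency = started.elapsed();
    // No request ID (background work) means there is nothing to correlate to.
    if let Some(request_id) = request_id() {
        emitter.emit(ExternalServiceCall {
            service_name: "redis",
            endpoint: command,
            request_id,
            latency,
            success: result.is_ok(),
        });
    }
    result
}

/// Hypothetical form of the `observed!` sugar; must be invoked inside an async context.
macro_rules! observed {
    ($emitter:expr, $request_id:expr, $cmd:literal, { $($body:tt)* }) => {
        observe($emitter, $request_id, $cmd, async { $($body)* }).await
    };
}

/// Toy executor for the demo: polls one future in a loop with a no-op waker.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

In this sketch a command body would be wrapped as `observed!(&emitter, lookup, "GET", { /* body */ })`. Failed commands still emit an event, with `success` set to `false`, and calls made with no request ID in scope return their result without emitting anything.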

Additional Changes

  • This PR modifies the API contract
  • This PR modifies the database schema
  • This PR modifies application configuration/environment variables

Motivation and Context

Redis is called on virtually every payment and storage operation, yet its latency contribution is invisible: it appears as undifferentiated time in request spans. This PR makes each Redis command emit a discrete event carrying the command name, latency, success status, and request ID. Events flow through the same ExternalServiceCall Kafka topic introduced for key-manager instrumentation, and land in ClickHouse where per-command latency histograms, error rates, and per-merchant Redis usage patterns can be queried.

How did you test it?

  • Ran cargo test -p redis_interface — all existing tests pass
  • Verified the is_enabled() fast-path guard prevents any allocation when emission is disabled (no-op emitter returns false without acquiring the task-local)
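
One way to pin the guard's ordering down in a unit test is to count how often the request-ID lookup runs. The sketch below is a synchronous reduction of that logic, not the PR's async helper. `EventEmitter`, `NoopEmitter`, `LiveEmitter`, `request_id_for_event`, and `lookups_performed` are hypothetical names, and the closure stands in for the task-local access. It checks that the lookup is skipped, but it does not measure allocations directly.

```rust
use std::cell::Cell;

/// Hypothetical emitter interface; only the guard matters here.
trait EventEmitter {
    fn is_enabled(&self) -> bool;
}

/// Emitter with emission disabled.
struct NoopEmitter;
impl EventEmitter for NoopEmitter {
    fn is_enabled(&self) -> bool {
        false
    }
}

/// Emitter with emission enabled.
struct LiveEmitter;
impl EventEmitter for LiveEmitter {
    fn is_enabled(&self) -> bool {
        true
    }
}

/// Returns the request ID to attach to an event, or `None` when nothing
/// should be emitted. The guard is checked before `lookup` is called.
fn request_id_for_event<M: EventEmitter>(
    emitter: &M,
    lookup: impl FnOnce() -> Option<String>,
) -> Option<String> {
    if !emitter.is_enabled() {
        return None;
    }
    lookup()
}

/// Counts how many times the lookup closure runs for a given emitter.
fn lookups_performed<M: EventEmitter>(emitter: &M) -> u32 {
    let calls = Cell::new(0);
    let _ = request_id_for_event(emitter, || {
        calls.set(calls.get() + 1);
        Some("req-1".to_string())
    });
    calls.get()
}
```

With these definitions, `lookups_performed(&NoopEmitter)` is 0 and `lookups_performed(&LiveEmitter)` is 1, which is the property the fast-path guard is meant to provide.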

Checklist

  • I formatted the code cargo +nightly fmt --all
  • I addressed lints thrown by cargo clippy
  • I reviewed the submitted code
  • I added unit tests for my changes where possible

Closes #13090

@spritianeja03 spritianeja03 requested review from a team as code owners June 30, 2026 12:40
@semanticdiff-com

semanticdiff-com Bot commented Jun 30, 2026


@spritianeja03 spritianeja03 self-assigned this Jun 30, 2026
@spritianeja03 spritianeja03 marked this pull request as draft June 30, 2026 12:42

Development

Successfully merging this pull request may close these issues.

feat(observability): capture per-command Redis call latency and errors as observable events
