Improve user-service availability-check latency under replay load #70

Merged
kenahrens merged 1 commit into master from fix/user-service-check-availability-performance
Apr 7, 2026
@kenahrens
Member

Summary

This PR addresses the user-service latency hotspot around availability-check endpoints (/api/users/check-username, /api/users/check-email) observed during replay analysis.

Changes

  • add a short-lived in-memory cache (5s TTL) for usernameExists / emailExists checks to reduce repeated DB hits under bursty validation traffic
  • invalidate relevant cache entries after successful user registration
  • reduce log overhead on high-frequency availability endpoints by changing controller logs from info to debug
  • make service/security log levels environment-configurable with safer defaults:
    • com.banking.userservice: ${APP_LOG_LEVEL:INFO}
    • org.springframework.security: ${SECURITY_LOG_LEVEL:WARN}
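The caching and invalidation changes above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the class name `AvailabilityCache`, the `username:` / `email:` key prefixes, and the injectable clock are all assumptions made for the example, and the database lookup is passed in as a plain predicate.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;
import java.util.function.Predicate;

// Hypothetical sketch of a short-lived existence cache (names are illustrative,
// not taken from the user-service code base).
final class AvailabilityCache {

    // Cached answer plus the nanosecond timestamp after which it is stale.
    private static final class Entry {
        final boolean exists;
        final long expiresAtNanos;

        Entry(boolean exists, long expiresAtNanos) {
            this.exists = exists;
            this.expiresAtNanos = expiresAtNanos;
        }
    }

    private final ConcurrentHashMap<String, Entry> entries = new ConcurrentHashMap<>();
    private final long ttlNanos;
    private final LongSupplier nanoClock; // injectable so tests can control time

    AvailabilityCache(long ttlMillis, LongSupplier nanoClock) {
        this.ttlNanos = ttlMillis * 1_000_000L;
        this.nanoClock = nanoClock;
    }

    // Returns the cached answer while it is fresh; otherwise runs the lookup
    // (standing in for the DB query) and caches the result for one TTL.
    boolean exists(String key, Predicate<String> dbLookup) {
        long now = nanoClock.getAsLong();
        Entry cached = entries.get(key);
        if (cached != null && now - cached.expiresAtNanos < 0) {
            return cached.exists;
        }
        boolean fresh = dbLookup.test(key);
        entries.put(key, new Entry(fresh, now + ttlNanos));
        return fresh;
    }

    // Called after a successful registration so a cached "does not exist"
    // answer cannot outlive the insert.
    void invalidate(String key) {
        entries.remove(key);
    }
}
```

In a service, this might be constructed as `new AvailabilityCache(5_000, System::nanoTime)` and queried with keys such as `"username:" + username`; after registration, both the username and email keys would be invalidated. Two concurrent misses on the same key can each hit the database once, which seems an acceptable trade-off for a 5-second TTL.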

Findings (Issue #69)

Replay source

  • backend/user-service/proxymock/recorded-complete

Environment comparison (before fix)

  • local replay showed /api/users/check-username at about 42 ms
  • staging replay (do-nyc1-staging-decoy) showed /api/users/check-username timing out at ~10.01 s, with a 100% failure rate for that endpoint in the sampled run

Local load comparison (before vs after this PR)

Replay command used for both runs:

  • proxymock replay --test-against localhost:<port> --in proxymock/recorded-complete --times 30 --vus 5

| Metric | Before (master) | After (this PR) |
|---|---|---|
| /api/users/check-username avg | 4.33 ms | 3.69 ms |
| /api/users/check-username p95 | 6.99 ms | 6.00 ms |
| total requests | 750 | 750 |
| replay failures | 0% | 0% |
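
As a rough reading of the local numbers above, the relative latency reduction can be worked out directly from the before/after values. This is plain arithmetic on the reported figures from a single local run each, so it says nothing about run-to-run variance or about the staging timeout:

```math
\text{avg reduction} = \frac{4.33 - 3.69}{4.33} \approx 14.8\%
\qquad
\text{p95 reduction} = \frac{6.99 - 6.00}{6.99} \approx 14.2\%
```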

Validation

  • make test (user-service)
  • local replay load benchmark before/after as listed above

Issue

@kenahrens kenahrens merged commit 0380cee into master Apr 7, 2026
6 checks passed

Development

Successfully merging this pull request may close these issues.

Research: user-service high cpu and high latency