
TT-6075: Update rate limit implementation#7941

Open
vladzabolotnyi wants to merge 76 commits into master from feat/TT-6075/update-rate-limit-header-logic

Conversation

@vladzabolotnyi
Contributor

@vladzabolotnyi vladzabolotnyi commented Mar 31, 2026

Description

Updates the handling of rate limit header data. Previously the header values were always taken from the quota; this change adds configuration to select the source of the header values.

Motivation and Context

  • Updated the config with new fields to support context variables and the rate limit headers source
  • Added new context variables for rate limits and quotas
  • Added a HeaderSender interface with quota and rate-limit implementations, supporting polymorphism based on the header-source config

How This Has Been Tested

  1. Unit tests

Demos

Given an API with a 3-per-15s rate limit and a quota of 5, with "ratelimit_headers_source": "quotas" and "enable_redis_rolling_limiter": true:

Screen.Recording.2026-04-02.at.16.49.37.mov

When the client sends 3 requests, headers are present and reflect the quota state (3/5). After 3 requests the rate limit is hit and a 429 code is returned. Wait 15 seconds, then send 2 more requests to get a 403; the headers are omitted. We don't include headers on error responses, to preserve backward compatibility. The status codes differ because the rate limit is hit first (429) and recovers quickly, while the quota limit (403) takes much longer to reset.

Screen.Recording.2026-04-02.at.17.14.18.mov

Given an API with a 3-per-5s rate limit and a quota of 5, with "ratelimit_headers_source": "rate_limits" and "enable_redis_rolling_limiter": true: send 3 requests to hit a 429; headers must be present, reflecting the current rate limit state. Wait until the rate limit refreshes, then send 2 more requests to hit the quota limit and a 403 status code. Headers are present, reflecting the rate limit state.

Screen.Recording.2026-04-02.at.17.32.33.mov

Context variables attached to response headers

Screen.Recording.2026-04-03.at.13.22.29.mov

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Refactoring or add test (improvements in base code or adds test coverage to functionality)

Checklist

  • I ensured that the documentation is up to date
  • I explained why this PR updates go.mod in detail with reasoning why it's required
  • I would like a code coverage CI quality gate exception and have explained why

@probelabs
Contributor

probelabs bot commented Mar 31, 2026

This pull request introduces a significant refactoring of the rate limit header implementation. It decouples the source of rate limit headers from the quota system, allowing for more accurate, real-time feedback based directly on the rate-limiting mechanism. A new configuration option, rate_limit_response_headers, is added to switch between the legacy quota-based headers and the new rate-limit-based headers. Additionally, the change enriches the request context with detailed rate limit and quota data, making it available to other middleware.

Files Changed Analysis

The changes span 30 files, with 1332 additions and 188 deletions, reflecting a substantial overhaul of the rate-limiting and response header logic.

  • Configuration (config/rate_limit.go, cli/linter/schema.json): A new global configuration, rate_limit_response_headers, is introduced, allowing the source for rate limit headers to be set to "quotas" (legacy behavior) or "rate_limits".
  • Core Logic (gateway/session_manager.go): This file sees the most significant changes. The ForwardMessage function is refactored to use a new rate.Checker interface for evaluating limits and a rate.HeaderSender interface to manage response headers, improving the separation of concerns.
  • New Abstractions (internal/rate/): A new internal/rate package is created to house the primary abstractions. This includes the HeaderSender interface with quotaSender and rateLimitSender implementations, forming a strategy pattern for header generation, and a Checker interface to standardize the processing of rate limit results.
  • Middleware & Response Handling (gateway/mw_*.go, gateway/reverse_proxy.go): Multiple middleware files and the reverse proxy are updated to use a new factory, gw.limitHeaderFactory, replacing the old sendRateLimitHeaders method for consistent header handling.
  • Context Variables (gateway/model.go): New constants (e.g., ctxDataKeyRateLimitLimit) and a helper function (ctxGetOrCreateData) are added to inject detailed rate limit and quota data into the request context when enable_context_vars is active.
  • Testing (*_test.go): Comprehensive unit and integration tests have been added to validate the new functionality, covering both header source modes across various rate limiter backends (Redis, Sentinel, DRL, etc.).
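Note that the PR description names the option "ratelimit_headers_source" while this summary calls it rate_limit_response_headers; reviewers should confirm the exact key in cli/linter/schema.json. Using the spelling from the demos, the relevant gateway config fragment would look something like:

```json
{
  "ratelimit_headers_source": "rate_limits",
  "enable_redis_rolling_limiter": true
}
```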

Architecture & Impact Assessment

  • What this PR accomplishes: It decouples rate limit header generation from the quota system, providing more accurate and standards-compliant feedback to API clients. It also enhances system extensibility by exposing internal rate limit and quota states as request context variables.
  • Key technical changes introduced:
    1. Configurable Header Source: A new global setting, rate_limit_response_headers, allows operators to choose between legacy quota-based headers and new, more accurate rate-limit-based headers.
    2. Strategy Pattern for Headers: A HeaderSenderFactory (rate.NewSenderFactory) is introduced to create the appropriate HeaderSender (quotaSender or rateLimitSender) based on the global configuration, cleanly separating the two strategies.
    3. Decoupled Logic & Early Header Injection: The SessionLimiter now injects rate limit headers early in the request lifecycle. This ensures that X-RateLimit-* headers are present even on blocked (429 Too Many Requests) responses, a critical improvement for API clients.
    4. Context Enrichment: If an API definition has enable_context_vars: true, the request context is populated with rate_limit_* and quota_* variables for use in other middleware.
  • Affected system components: The change primarily impacts the Gateway's core rate limiting and quota enforcement engine. It touches all middleware and response handlers that previously added rate limit headers, including the API-level rate limiter, per-key rate limiter, mock response handler, and caching middleware.
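The strategy selection in point 2 can be sketched as a small factory. The function below is a hypothetical stand-in for rate.NewSenderFactory (whose real signature may differ); the important property is that an unrecognized or empty value falls back to the legacy quota-based behaviour:

```go
package main

import "fmt"

// Minimal stand-ins for the two header strategies; illustrative only.
type HeaderSender interface{ Name() string }

type quotaSender struct{}

func (quotaSender) Name() string { return "quotas" }

type rateLimitSender struct{}

func (rateLimitSender) Name() string { return "rate_limits" }

// NewSender picks the header strategy from the configured source,
// defaulting to the legacy quota-based behaviour for any other value.
func NewSender(source string) HeaderSender {
	if source == "rate_limits" {
		return rateLimitSender{}
	}
	return quotaSender{}
}

func main() {
	fmt.Println(NewSender("rate_limits").Name()) // prints "rate_limits"
}
```

Defaulting to "quotas" keeps existing deployments unchanged unless an operator opts in.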

Flow Diagram

sequenceDiagram
    participant Client
    participant Gateway as Tyk Gateway
    participant SessionLimiter
    participant HeaderSender
    participant Upstream

    Client->>Gateway: Request
    Gateway->>SessionLimiter: ForwardMessage(req, session, headerSender)
    activate SessionLimiter

    SessionLimiter->>SessionLimiter: newRateLimitChecker().Check()
    note right of SessionLimiter: Determines if request is over limit and gets stats

    SessionLimiter->>HeaderSender: SendRateLimits(stats)
    activate HeaderSender
    note right of HeaderSender: In 'rate_limits' mode, sets X-RateLimit-* headers
    HeaderSender-->>Gateway: Headers are now on the response writer
    deactivate HeaderSender

    alt Rate Limit Exceeded
        SessionLimiter-->>Gateway: return sessionFailRateLimit
        Gateway-->>Client: 429 Too Many Requests (with headers)
    else Rate Limit OK
        SessionLimiter-->>Gateway: return sessionFailNone
        deactivate SessionLimiter
        Gateway->>Upstream: Forward Request
        Upstream-->>Gateway: Upstream Response
        Gateway->>HeaderSender: SendQuotas(session)
        note right of HeaderSender: In 'quotas' mode, sets headers here (legacy behavior)
        Gateway-->>Client: 200 OK (with final headers)
    end

Scope Discovery & Context Expansion

  • Broader Impact: This change significantly improves the experience for API consumers who rely on X-RateLimit-* headers for client-side throttling and retry logic, as the rate_limits mode provides more accurate and timely data. The new context variables also empower developers to build more advanced custom middleware that can react to the current state of the rate limiter and quota manager.
  • Further Exploration for Reviewers:
    • Core Abstractions: Examine the new internal/rate package, especially headers.go and checker.go, to understand the new HeaderSender and Checker interfaces.
    • Central Logic: Review the refactoring in gateway/session_manager.go, as ForwardMessage is the central point of the new logic.
    • Behavioral Tests: Check the new tests in gateway/mw_rate_limiting_test.go and gateway/mw_api_rate_limit_test.go to see the expected behavior for both quotas and rate_limits modes across different limiter implementations.
Metadata
  • Review Effort: 4 / 5
  • Primary Label: feature

Powered by Visor from Probelabs

Last updated: 2026-04-20T09:10:44.022Z | Triggered by: pr_updated | Commit: 93571d1

💡 TIP: You can chat with Visor using /visor ask <your question>

@probelabs
Contributor

probelabs bot commented Mar 31, 2026

Security Issues (2)

Severity Location Issue
🟡 Warning gateway/session_manager.go:185
The `limitSentinel` function spawns a 'fire-and-forget' goroutine for every request to update the Redis rolling window counter. Under high traffic, this can lead to unbounded goroutine creation, causing excessive memory consumption, CPU scheduler contention, and potentially overwhelming the Redis server. This creates a denial-of-service vulnerability where the gateway can be crashed by a flood of requests.
💡 SuggestionReplace the unbounded 'fire-and-forget' goroutine with a bounded worker pool. Requests to update the rate limiter should be sent to a channel, which is processed by a fixed number of worker goroutines. This will throttle the updates to Redis and prevent the creation of an excessive number of goroutines, protecting the gateway from resource exhaustion.
🟡 Warning gateway/session_manager.go:369
The key for the in-memory rate-limiting bucket (`bucketKey`) is constructed using `session.LastUpdated`. Since `session.LastUpdated` can change on every request for certain configurations, a new in-memory bucket may be created for each request when using the default Distributed Rate Limiter (DRL) in single-node mode. This leads to unbounded memory growth as old buckets are never reused or garbage-collected, causing a memory leak that can result in a Denial of Service.
💡 SuggestionThe bucket key should not include transient data like `session.LastUpdated`. A stable key should be used, for example, by removing the `session.LastUpdated` component: `bucketKey := limiterKey`. If a unique bucket per session update cycle is truly required, a mechanism to expire and garbage-collect old, unused buckets must be implemented to prevent memory leaks.

Performance Issues (2)

Severity Location Issue
🔴 Critical gateway/session_manager.go:205-207
The `limitSentinel` function spawns a 'fire-and-forget' goroutine for every request to update the Redis rolling window counter. Under high traffic, this can lead to unbounded goroutine creation, causing excessive memory consumption, CPU scheduler contention, and potentially overwhelming the Redis server. This creates a denial-of-service vulnerability where the gateway can be crashed by a flood of requests.
💡 SuggestionReplace the "fire-and-forget" goroutine with a managed worker pool. A buffered channel can be used to dispatch update tasks to a fixed number of worker goroutines. This approach controls concurrency, prevents resource exhaustion, and allows for batching or other optimizations.
🔴 Critical gateway/session_manager.go:429-432
The key for the in-memory rate-limiting bucket (`bucketKey`) is constructed using `session.LastUpdated`. Since `session.LastUpdated` can change on every request for certain configurations, a new in-memory bucket may be created for each request when using the default Distributed Rate Limiter (DRL) in single-node mode. This leads to unbounded memory growth as old buckets are never reused or garbage-collected, causing a memory leak that can result in a Denial of Service.
💡 SuggestionThe bucket key must be deterministic and not based on a volatile, ever-changing value like `session.LastUpdated`. Use a stable identifier from the session, such as the session ID or the base `limiterKey` itself. If a time window is required, use a truncated timestamp (e.g., `time.Now().Unix() / int64(period)`), which remains constant for the duration of the rate limit window.
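The second suggestion's "truncated timestamp" idea could be sketched like this (function and parameter names are illustrative, not from the PR): dividing the Unix time by the rate-limit period yields a window index that stays constant for the whole window, so every request in the same window maps to the same bucket.

```go
package main

import (
	"fmt"
	"strconv"
)

// bucketKey derives a DRL bucket key from the stable limiter key plus the
// current rate-limit window, instead of volatile per-request data such as
// session.LastUpdated. Illustrative sketch only.
func bucketKey(limiterKey string, nowUnix, periodSeconds int64) string {
	window := nowUnix / periodSeconds // constant for the whole window
	return limiterKey + "." + strconv.FormatInt(window, 10)
}

func main() {
	// Two requests 3 seconds apart inside a 15-second window share a bucket.
	a := bucketKey("apikey-123", 1700000000, 15)
	b := bucketKey("apikey-123", 1700000003, 15)
	fmt.Println(a == b) // prints "true"
}
```

With a key like this the number of live buckets is bounded by the number of distinct limiter keys, and stale window buckets can be expired by ordinary TTL-style cleanup.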

Quality Issues (3)
Severity Location Issue
🟠 Error gateway/session_manager.go:479
The quota reset time, an `int64` value from `expiredAt.Unix()`, is cast to `int` before being passed to `extendContextWithQuota`. On 32-bit systems, where `int` is 32 bits, this will cause an integer overflow for dates beyond January 2038 (the Y2038 problem). This results in incorrect quota reset times being stored in context variables, leading to faulty behavior in middleware that relies on this data.
💡 SuggestionModify the `extendContextWithQuota` function to accept an `int64` for the reset time. Pass the value from `expiredAt.Unix()` without casting to prevent potential overflow on 32-bit architectures. The `valToStr` function, which consumes these context variables, already handles `int64`.
🟠 Error gateway/session_manager.go:606
The `resetTime` variable, which is an `int64` returned by `time.Now().Add(stats.Reset).Unix()`, is cast to an `int` before being stored in the request context data. On 32-bit systems, this cast can lead to an integer overflow for dates beyond the year 2038, causing incorrect rate limit reset times to be exposed as context variables.
💡 SuggestionStore the `resetTime` as an `int64` in the context map to ensure 64-bit precision and avoid the Y2038 problem on 32-bit platforms. The `valToStr` function used for processing context variables already supports `int64`, so this change should not require further modifications downstream.
🔧 Suggested Fix
data[ctxDataKeyRateLimitReset] = resetTime
🟡 Warning gateway/mw_api_rate_limit_test.go:19-99
The test logic in `TestAPIRateLimitResponseHeaders` is nearly identical to the logic in `TestRateLimitResponseHeaders` in `gateway/mw_rate_limiting_test.go`. Both functions iterate through the same set of rate limiters and perform similar request and response header checks. The primary difference is that one tests keyless (global) rate limits while the other tests per-key rate limits. This duplication makes the tests harder to maintain, as any change to the testing logic must be applied in two places.
💡 SuggestionRefactor the common test logic into a shared test helper function. This helper could accept parameters to configure the API for either keyless or per-key rate limiting and to determine whether an authorization header should be sent. This would eliminate significant code duplication and centralize the test logic, improving maintainability.


Last updated: 2026-04-20T09:10:17.575Z | Triggered by: pr_updated | Commit: 93571d1


@vladzabolotnyi
Contributor Author

All SonarQube issues relate to constant initialization in a test file (to avoid value duplication) and to cognitive complexity in legacy logic. These can be fixed, but not within the scope of the current ticket.

Contributor

@andyo-tyk andyo-tyk left a comment


One suggested tweak

Comment thread cli/linter/schema.json Outdated
Co-authored-by: andyo-tyk <99968932+andyo-tyk@users.noreply.github.com>
Contributor

@andyo-tyk andyo-tyk left a comment


Go docs LGTM. This does not cover developer/engineer code review.

@sonarqubecloud

Quality Gate passed

Issues
5 New issues
0 Accepted issues

Measures
0 Security Hotspots
85.7% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

@vladzabolotnyi vladzabolotnyi enabled auto-merge (squash) April 20, 2026 06:56
@github-actions
Contributor

🚨 Jira Linter Failed

Commit: 93571d1
Failed at: 2026-04-20 09:08:34 UTC

The Jira linter failed to validate your PR. Please check the error details below:

🔍 Click to view error details
failed to get Jira issue: failed to fetch Jira issue TT-6075: Issue does not exist or you do not have permission to see it.: request failed. Please analyze the request body for more details. Status code: 404

Next Steps

  • Ensure your branch name contains a valid Jira ticket ID (e.g., ABC-123)
  • Verify your PR title matches the branch's Jira ticket ID
  • Check that the Jira ticket exists and is accessible

This comment will be automatically deleted once the linter passes.
