test(rate-limiter): align integration + load tests with framework refactor by gandhipratik203 · Pull Request #4635 · IBM/mcp-context-forge

gandhipratik203 · 2026-05-07T06:37:18Z

Summary

The cpex framework refactor on main changed two contracts the existing rate-limiter tests didn't track. As a result the tests passed nothing meaningful through the gateway HTTP path: integration tests hit HTTP 400 on tool calls, and the Make-driven load tests either no-op'd or 401'd before generating traffic. This PR brings them back in line with the current contract.

What broke and why

Contract change on main	Test failure mode
`POST /servers/<id>/mcp` now requires `Mcp-Session-Id` on every non-initialize call	Tool-path tests returned `HTTP 400 Missing session ID`; load test's `tools/list` auto-detect failed silently → `_tool_names` stayed empty → `@task call_tool` early-returned every tick
`GET /admin/plugins` reports framework's internal mode label (`sequential` / `transform` / `disabled`) instead of the operator label set via `PUT` (`enforce` / `permissive` / `disabled`)	One dynamic-behaviour assertion compared against the operator label
`.env.example`'s `JWT_SECRET_KEY` lengthened to satisfy the 32-byte minimum	Two locustfiles still defaulted to the old 11-char `"my-test-key"` → admin requests 401'd

Changes

tests/integration/test_rate_limiter_multi_tenant.py — add _mcp_initialize_session helper; _invoke_tool_once now does the initialize + initialized handshake and sends Mcp-Session-Id on the tool call.
tests/integration/test_rate_limiter_dynamic_behavior.py — same handshake helper added to _send_tool_burst; admin-API mode assertion relaxed to "reported mode is non-disabled" with a comment explaining the operator-vs-internal label mapping.
tests/loadtest/locustfile_rate_limiter_backend_correctness.py — _auto_detect runs the MCP initialize handshake before tools/list so _tool_names actually populates.
tests/loadtest/locustfile_rate_limiter_redis_capacity.py + locustfile_rate_limiter_scale.py — JWT_SECRET_KEY default updated to match .env.example.

Verification

Run	Result
`pytest tests/integration/test_rate_limiter_multi_tenant.py --with-integration`	✅ 2/2
`pytest tests/integration/test_rate_limiter_dynamic_behavior.py --with-integration`	✅ 5/5
`make benchmark-rate-limiter`	✅ tools-path enforcement under sustained load — Redis backend verdict
`make benchmark-rate-limiter-redis-capacity`	✅ prompt-path sustains concurrent load with 0 errors

Run against a docker-compose stack with the rate limiter enabled (mode: enforce in plugins/config.yaml) and REDIS_CONTAINER_NAME set to match the running compose project name. The mode: enforce flip is operator-side test setup, not a chart change — plugins/config.yaml is unchanged in this PR.

Out of scope

tests/integration/test_rate_limiter.py — 50/59 substantive tests still pass. The 9 failures are scaffolding bit-rot (renamed /api/v1/... routes, removed _store introspection attribute the Rust core no longer exposes). The plugin engine itself is well-covered by the passing tests; happy to address the bit-rot in a follow-up if it's worth a separate cleanup.
The dynamic plugin-bindings path (POST /v1/tools/plugin_bindings) — there's a separate observation that for RateLimiterPlugin specifically, binding config overrides aren't honored at runtime even though the same path works for OutputLengthGuardPlugin and SecretsDetection. Diagnosing that is its own thread; this PR keeps to the static-config test surface.

@task

…actor The cpex framework refactor on main changed two contracts the existing rate-limiter tests didn't track: 1. POST /servers/<id>/mcp now requires an Mcp-Session-Id header on every non-initialize call. Tests that hit the tool path silently returned HTTP 400; load tests that auto-detected tools via tools/list left their tool list empty and the @task no-op'd. 2. GET /admin/plugins now returns the framework's internal mode label ("sequential" / "transform" / "disabled") rather than the operator label ("enforce" / "permissive" / "disabled") set via PUT. One assertion compared against the operator label and broke. Plus a smaller drift fix in two locustfiles: the JWT_SECRET_KEY default ("my-test-key", 11 chars) didn't match the gateway's .env.example value ("my-test-key-but-now-longer-than-32-bytes"), so requests 401'd before generating any traffic. Changes - tests/integration/test_rate_limiter_multi_tenant.py: add _mcp_initialize_session helper; _invoke_tool_once now does the initialize + initialized handshake and sends Mcp-Session-Id. - tests/integration/test_rate_limiter_dynamic_behavior.py: same handshake helper added to _send_tool_burst; admin-API mode assertion relaxed to "reported mode is non-disabled" with a comment explaining the operator-vs-internal label mapping. - tests/loadtest/locustfile_rate_limiter_backend_correctness.py: _auto_detect runs the MCP initialize handshake before tools/list so _tool_names actually populates. - tests/loadtest/locustfile_rate_limiter_{redis_capacity,scale}.py: JWT_SECRET_KEY default updated to match main's .env.example value. Verified pytest tests/integration/test_rate_limiter_multi_tenant.py -- 2/2 pytest tests/integration/test_rate_limiter_dynamic_behavior.py -- 5/5 make benchmark-rate-limiter -- 119/120 blocked (98.3%) make benchmark-rate-limiter-redis-capacity -- 290 reqs, 0 errors Out of scope - tests/integration/test_rate_limiter.py: 50/59 still pass; 9 failures are scaffolding bit-rot (renamed /api/v1/... routes, removed _store attribute the Rust core no longer exposes). Substantive plugin behaviour is covered. Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>

msureshkumar88

Thanks for the clear problem statement and the summary table — the three contract breaks are well-diagnosed and the fix direction is right. A few issues need to be addressed before this lands.

Blocking

R1 — `_mcp_initialize_session` copy-pasted verbatim into two files

tests/integration/test_rate_limiter_dynamic_behavior.py and tests/integration/test_rate_limiter_multi_tenant.py contain near-identical implementations of _mcp_initialize_session. An equivalent _initialize_session already exists in tests/live_gateway/mcp/test_mcp_plugin_parity.py:170. Please extract this to tests/integration/helpers/ (or tests/integration/conftest.py) and import from both files — the next integration test that needs the handshake will copy-paste again otherwise.

R2 — Wrong protocol version `"2025-06-18"` — inconsistent with codebase canonical

The canonical protocol version across this repo is "2025-11-25" (set as MCP_PROTOCOL_VERSION in tests/live_gateway/mcp/test_mcp_plugin_parity.py:29 and tests/loadtest/locustfile_echo_delay.py:108). Both new integration helpers use "2025-06-18". Worse, locustfile_rate_limiter_backend_correctness.py now has two different versions in the same file — "2025-06-18" in the new _auto_detect block and "2024-11-05" in the existing call_tool task. Please use a shared MCP_PROTOCOL_VERSION = "2025-11-25" constant, import it everywhere, and explain in a comment if "2025-06-18" is intentional (e.g. backward-compat testing).

R3 — Mode assertion relaxed too far

# Before (correct contract, wrong label):
assert plugins[PLUGIN_NAME]["mode"] == "enforce"

# After (passes for any non-disabled state — too permissive):
assert reported_mode != "disabled"

The right fix is to make the operator→internal label mapping explicit:

OPERATOR_TO_INTERNAL_MODE = {"enforce": "sequential", "permissive": "transform", "disabled": "disabled"}
assert reported_mode == OPERATOR_TO_INTERNAL_MODE["enforce"], (
    f"Expected internal mode {OPERATOR_TO_INTERNAL_MODE['enforce']!r}, got {reported_mode!r}"
)

This gives a precise assertion, documents the mapping where it's used, and fails correctly if the framework relabels modes again.

P1 — `_invoke_tool_once` creates a new MCP session per call — semantic error for rate-limit tests

_invoke_tool_once runs the full initialize + initialized handshake before every single tool invocation. If the rate limiter keys on session ID or resets counters per session, each call in test_tool_invocation_creates_rate_limit_keys_in_redis and test_rate_limit_keys_carry_tenant_prefix_when_tool_is_team_owned operates in a fresh window rather than accumulating against the same client. The session should be established once in the server_and_tool fixture and passed in, not re-created per call.

S1 — Hardcoded fallback JWT secret still in source

JWT_SECRET_KEY = _cfg("JWT_SECRET_KEY", "my-test-key-but-now-longer-than-32-bytes")

The old key was wrong for a different reason; the fix still bakes a known secret into source. For load tests run against real infra a missing env var silently uses this string, and any JWT signed with it is trivially forgeable. Consider:

_raw = _cfg("JWT_SECRET_KEY", "")
if not _raw:
    raise RuntimeError("JWT_SECRET_KEY env var is required for load tests")
JWT_SECRET_KEY = _raw

Or at minimum add a clear # nosec annotation and a comment explaining this is a CI-only fallback that must never reach production.

L1/L2 — Silent error swallowing in both helpers

Both _mcp_initialize_session implementations:

except requests.RequestException:
    pass  # ← no log
return sid  # returned as valid even if notification step failed

The notifications/initialized failure is swallowed silently. A failed notification with a returned sid means subsequent calls may fail with opaque errors. Add at minimum:

except requests.RequestException as exc:
    logger.warning("MCP initialized notification failed for server %s: %s", server_id, exc)

Also: all four return None paths in both helpers produce no log output. The caller then reports status=0 or errors=count with nothing in the output to explain why. Add a logger.warning before each return None.

Non-blocking suggestions

R4 — return 0 is a confusing error sentinel

_invoke_tool_once returns 0 when sid is None. The caller asserts status == 200, giving AssertionError: assert 0 == 200 with no explanation. Use pytest.fail("MCP session handshake failed") or raise a descriptive exception.

R5 — notifications/initialized missing in _auto_detect

The locustfile _auto_detect gets the sid from the initialize response but never sends the notifications/initialized POST. The integration tests do send it. Worth aligning for protocol correctness.

T1 — 9 failing tests in test_rate_limiter.py

Acknowledged as out of scope — worth opening a follow-up issue and linking it here so it doesn't get lost.

T2 — No negative-path coverage for session handshake failure

No test exercises _mcp_initialize_session returning None — what _send_tool_burst reports in that case, and whether the test suite distinguishes "rate limited" from "couldn't connect at all".

T3 — No cross-tenant session replay test

After adding session handshake, worth adding a test that a session ID from tenant A cannot be used on tenant B's endpoint (isolation regression).

D1 — tests/AGENTS.md not updated

The session handshake is now a required protocol step for any integration test on the MCP tool path. Worth documenting there.

gandhipratik203 requested review from crivetimihai, kevalmahajan and madhav165 as code owners May 7, 2026 06:37

gandhipratik203 requested review from ja8zyjits and msureshkumar88 May 7, 2026 06:40

msureshkumar88 requested changes May 7, 2026

View reviewed changes

msureshkumar88 self-assigned this May 7, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test(rate-limiter): align integration + load tests with framework refactor#4635

test(rate-limiter): align integration + load tests with framework refactor#4635
gandhipratik203 wants to merge 1 commit intomainfrom
test/rate-limiter-test-infra-refresh

gandhipratik203 commented May 7, 2026 •

edited

Loading

Uh oh!

msureshkumar88 left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

gandhipratik203 commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What broke and why

Changes

Verification

Out of scope

Uh oh!

msureshkumar88 left a comment

Choose a reason for hiding this comment

Blocking

R1 — _mcp_initialize_session copy-pasted verbatim into two files

R2 — Wrong protocol version "2025-06-18" — inconsistent with codebase canonical

R3 — Mode assertion relaxed too far

P1 — _invoke_tool_once creates a new MCP session per call — semantic error for rate-limit tests

S1 — Hardcoded fallback JWT secret still in source

L1/L2 — Silent error swallowing in both helpers

Non-blocking suggestions

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

gandhipratik203 commented May 7, 2026 •

edited

Loading

R1 — `_mcp_initialize_session` copy-pasted verbatim into two files

R2 — Wrong protocol version `"2025-06-18"` — inconsistent with codebase canonical

P1 — `_invoke_tool_once` creates a new MCP session per call — semantic error for rate-limit tests