Feat/llm cost user policy #1359

Merged

Krishanx92 merged 43 commits into wso2:main from Saadha123:Feat/llm-cost-user-policy on Mar 19, 2026

Conversation

Saadha123 (Contributor) commented Mar 16, 2026

Purpose

LLM API consumers and platform teams need visibility into the monetary cost of each LLM call at the gateway level, without requiring application-side
instrumentation. This is a prerequisite for enforcing per-user or per-API spending budgets (e.g. "block after $10/hour").

Goals

  • Introduce the llm-cost system policy that calculates the USD cost of every LLM response and injects it as an x-llm-cost response header.
  • Introduce the llm-cost-based-ratelimit user policy that reads x-llm-cost and enforces configurable monetary budget limits (e.g. $5/hour, $100/day) using the
    existing advanced-ratelimit engine.
  • Provide a bundled pricing database (model_prices.json) covering OpenAI, Anthropic, Gemini (Google AI Studio + Vertex AI), and Mistral with support for
    context-window tiering, service tiers (priority/flex/batch), prompt caching, reasoning tokens, audio/image modality tokens, and web search tool costs (an
    illustrative entry is sketched after this list).
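
For illustration, a minimal pricing entry might look like the following. The model ID and per-token field names mirror the shape discussed later in review (input_per_token / output_per_token); the shipped model_prices.json also carries tier, cache, and modality rates not shown here.

```json
{
  "my-enterprise-llm-v1": {
    "input_per_token": 0.00001,
    "output_per_token": 0.00002
  }
}
```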

Approach

llm-cost policy (response-phase, WASM):

  • Parses $.model / $.modelVersion from the upstream response body.
  • Looks up model pricing from a JSON file loaded once at route instantiation time, with progressive prefix-stripping fallback for versioned model IDs (see the sketch after this list).
  • Runs a provider-specific normaliser (OpenAI, Anthropic, Gemini, Mistral) to extract a unified Usage struct from the raw response.
  • Applies a provider-agnostic cost calculator with tier-aware rate resolution (resolveRates), then provider-specific adjustments (Anthropic geo/speed multipliers,
    Gemini grounding fees).
  • Sets two response headers: x-llm-cost (USD to 10 decimal places) and x-llm-cost-status (calculated / not_calculated) to distinguish zero cost from a failed
    calculation.
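
A minimal Go sketch of the lookup-and-price step, assuming a flat per-token rate map; the helper names, the separator set, and the rates below are illustrative, not the policy's actual code:

```go
package main

import (
	"fmt"
	"strings"
)

// Rates holds per-token USD prices. The real pricing entries also carry
// tier, cache, and modality rates that resolveRates would consider.
type Rates struct {
	InputPerToken  float64
	OutputPerToken float64
}

// lookupRates tries an exact match, then progressively strips trailing
// version-like segments (e.g. "gpt-4o-2024-08-06" -> ... -> "gpt-4o"),
// returning the first hit.
func lookupRates(prices map[string]Rates, model string) (Rates, bool) {
	for m := model; m != ""; {
		if r, ok := prices[m]; ok {
			return r, true
		}
		i := strings.LastIndexAny(m, "-.")
		if i < 0 {
			break
		}
		m = m[:i]
	}
	return Rates{}, false
}

func main() {
	prices := map[string]Rates{"gpt-4o": {InputPerToken: 2.5e-6, OutputPerToken: 1e-5}} // assumed rates
	r, ok := lookupRates(prices, "gpt-4o-2024-08-06")
	if !ok {
		fmt.Println("x-llm-cost-status: not_calculated") // lookup failed: cost not computed
		return
	}
	cost := 1000*r.InputPerToken + 500*r.OutputPerToken // 1000 input / 500 output tokens
	// x-llm-cost is formatted as USD to 10 decimal places
	fmt.Printf("x-llm-cost: %.10f\nx-llm-cost-status: calculated\n", cost)
}
```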

llm-cost-based-ratelimit policy (request-phase, WASM):

  • Reads x-llm-cost from the previous response and maintains per-user sliding-window budget counters via the advanced-ratelimit engine (GCRA or fixed-window, memory
    or Redis backend); a minimal fixed-window sketch follows this list.
  • Returns 429 Too Many Requests with x-ratelimit-cost-limit-dollars / x-ratelimit-cost-remaining-dollars headers when a budget is exhausted.
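
A fixed-window sketch of the budget check, assuming an in-memory backend; the real policy delegates to the advanced-ratelimit engine, and all names here are illustrative:

```go
package main

import (
	"fmt"
	"time"
)

// budgetWindow tracks dollars spent against a dollar limit per time window.
type budgetWindow struct {
	limit   float64       // dollars allowed per window
	spent   float64       // dollars consumed in the current window
	resetAt time.Time     // when the current window ends
	window  time.Duration // window length, e.g. time.Hour for "$5/hour"
}

// allow charges cost (parsed from the previous response's x-llm-cost header)
// against the budget and reports whether the request may proceed, plus the
// remaining budget for the x-ratelimit-cost-remaining-dollars header.
func (b *budgetWindow) allow(cost float64, now time.Time) (bool, float64) {
	if now.After(b.resetAt) { // window elapsed: reset the counter
		b.spent = 0
		b.resetAt = now.Add(b.window)
	}
	if b.spent+cost > b.limit {
		return false, b.limit - b.spent // caller responds 429 Too Many Requests
	}
	b.spent += cost
	return true, b.limit - b.spent
}

func main() {
	b := &budgetWindow{limit: 5.0, window: time.Hour, resetAt: time.Now().Add(time.Hour)}
	ok, remaining := b.allow(0.0016, time.Now())
	fmt.Printf("allowed=%v x-ratelimit-cost-remaining-dollars=%.6f\n", ok, remaining)
}
```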

Both policies are published as Go modules under github.com/wso2/gateway-controllers/policies/ and referenced from build.yaml / build-lock.yaml.

Automation tests

  • Unit tests — gateway-controllers/policies/llm-cost (synced into dev-policies during development):
    • Pricing lookup (exact, prefix fallback, provider-prefix strip/prepend)
    • Per-provider Normalize and Adjust for all four providers
    • Context-window tier selection (base / >128k / >200k / >272k)
    • Service tier overrides (priority, flex, batch)
    • Cache token cost (read, 5-min write, 1-hr write, tiered rates, audio-specific cache rate)
    • Reasoning, audio input/output, image output modality costs
    • Anthropic geo/speed multipliers and cache carve-out, including geo value absent from provider_specific_entry
    • Gemini grounding cost (Google AI Studio per-query vs Vertex AI flat fee)
    • Web search cost (context-size tiers, SearchContextCostPerQuery precedence over WebSearchCostPerRequest, unmapped context size silently ignored)
    • x-llm-cost-status header on all success and failure paths
    • GetPolicy and loadPricingFromFile error handling
  • Integration tests — gateway/it/features/llm-cost.feature (18 scenarios) and gateway/it/features/llm-cost-combination.feature:
    • OpenAI, Anthropic, Gemini (native + caching + thinking), Mistral — cost calculation correctness
    • Anthropic geo+speed multiplier, 1-hr TTL cache writes, web search tool
    • OpenAI prompt caching, flex/priority/batch service tiers, reasoning tokens, web search tool
    • Unknown model → x-llm-cost-status: not_calculated, cost zero, request not blocked
    • Missing model/modelVersion field → x-llm-cost-status: not_calculated
    • RestApi kind exclusion (policy must not fire on non-LLM APIs)
    • Combined llm-cost + llm-cost-based-ratelimit: budget exhausted after real cost accumulates (scenario shape sketched below)
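
The combined scenario shape looks roughly like this; step wording is approximated from the suite's assertions elsewhere in this PR, and the URL and expectations are illustrative:

```gherkin
Scenario: Budget exhausted after real cost accumulates
  # First call succeeds; its x-llm-cost is charged against the budget
  When I send a POST request to "http://localhost:8080/combo-openai/openai/v1/chat/completions"
  Then the response status code should be 200
  And the response header "x-llm-cost" should exist
  # A later call pushes spend past the configured dollar budget
  When I send a POST request to "http://localhost:8080/combo-openai/openai/v1/chat/completions"
  Then the response status code should be 429
```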

Security checks

  • Followed secure coding standards: yes
  • Ran FindSecurityBugs plugin: N/A (Go WASM policy, no Java)
  • No keys, passwords, tokens, or secrets committed: yes

Samples

N/A

Related PRs

  • Corresponding policy implementation PR in gateway-controllers (llm-cost + llm-cost-based-ratelimit source)

Summary by CodeRabbit

  • New Features

    • Per-call LLM cost reporting via x-llm-cost header and a cost-based rate limiting policy (supports OpenAI, Anthropic, Gemini, Mistral, custom models); new default policies added.
  • Configuration

    • New LLM pricing configuration (pricing_file) with packaged pricing assets available at runtime.
  • Tests

    • Large BDD/integration suites added for cost calculations, budget enforcement, multi-window behavior, resets, and policy snapshot sync.
  • Chores

    • Runtime/build updated to include pricing assets; expanded mock LLM endpoints for testing.


coderabbitai Bot commented Mar 16, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds llm-cost and llm-cost-based-ratelimit policies, pricing configuration and assets, runtime/build changes to include llm-pricing, many mock LLM OpenAPI paths, extensive integration tests for cost and budget enforcement, and small test-step and formatting tweaks.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Build & Ignore**<br>.gitignore, gateway/build.yaml, gateway/build-lock.yaml | Added scripts/verify-llm-cost.py to .gitignore; registered llm-cost and llm-cost-based-ratelimit policies in build configs and lockfile. |
| **Policy Definitions**<br>gateway/gateway-controller/default-policies/llm-cost.yaml, gateway/gateway-controller/default-policies/llm-cost-based-ratelimit.yaml | Added llm-cost policy (computes USD cost, injects x-llm-cost) and llm-cost-based-ratelimit (monetary budget windows, memory/redis options, enforcement via advanced-ratelimit). |
| **Config Templates & Runtime Config**<br>gateway/configs/config-template.toml, gateway/configs/config.toml, gateway/it/test-config.toml | Added [policy_configurations.llm_cost_v0] with pricing_file = "/home/wso2/conf/llm-pricing/model_prices.json". |
| **Runtime image & Build targets**<br>gateway/gateway-runtime/Dockerfile, gateway/gateway-runtime/Makefile | Include llm-pricing assets in Docker stages and copy target/configs/llm-pricing in Makefile build targets; ensure directory creation and ownership/permissions for /home/wso2. |
| **Mock API surface & paths**<br>gateway/it/mock-api/openapi.yaml, gateway/it/mock-api/paths/... | Added numerous mock endpoints and detailed path YAMLs for OpenAI, Anthropic, Gemini, Mistral, custom and unknown LLMs (request/response examples, usage metadata, caching and service_tier variants). |
| **Integration Tests (features)**<br>gateway/it/features/llm-cost.feature, gateway/it/features/llm-cost-combination.feature, gateway/it/features/llm-cost-based-ratelimit.feature | Added extensive BDD scenarios validating cost computation, caching, tiers, unknown models, budget exhaustion, multi-window budgets, resets, header assertions, and cleanup flows. |
| **IT test suite & steps**<br>gateway/it/steps_health.go, gateway/it/suite_test.go | Registered new feature files in the IT suite and added a test step "I wait for policy snapshot sync". |
| **IT mock path details**<br>gateway/it/mock-api/paths/anthropic/*, .../gemini/*, .../openai/*, .../mistral/*, .../custom-llm/*, .../unknown-llm/* | Added many specific path definitions with examples and usage fields supporting the new tests (caching, geo, tooling, thinking tokens, etc.). |
| **Minor code/formatting**<br>gateway/gateway-controller/pkg/config/config.go, gateway/gateway-controller/pkg/policy/builder.go | Whitespace/formatting tweaks and added log/slog import; no functional changes. |

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Gateway as Gateway
    participant Provider as LLM Provider
    participant PolicyEngine as Policy Engine
    participant PricingDB as Pricing DB
    participant RateLimiter as Rate Limiter

    Client->>Gateway: Send LLM request
    Gateway->>Provider: Forward request
    Provider-->>Gateway: Response (model, usage)
    Gateway->>PolicyEngine: Run llm-cost policy (response)
    PolicyEngine->>PricingDB: Lookup model pricing
    PricingDB-->>PolicyEngine: Return rates
    PolicyEngine->>PolicyEngine: Compute USD cost
    PolicyEngine-->>Gateway: Inject `x-llm-cost` header
    Gateway->>PolicyEngine: Run llm-cost-based-ratelimit (pass cost)
    PolicyEngine->>RateLimiter: Check/deduct budget
    alt Budget available
      RateLimiter-->>PolicyEngine: Allow
      PolicyEngine-->>Gateway: Proceed
      Gateway-->>Client: 200 + cost headers
    else Budget exhausted
      RateLimiter-->>PolicyEngine: Reject
      PolicyEngine-->>Gateway: Deny (429)
      Gateway-->>Client: 429 Too Many Requests
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐇 I hopped through tokens, soft and light,
Prices twinkled in the moonlit night.
Budgets guarded by a careful paw,
Headers gleam with every law.
A rabbit counted costs with joyous delight.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'Feat/llm cost user policy' directly references the main feature being added (llm cost policies) and aligns with the PR's primary objectives of introducing cost calculation and enforcement policies. |
| Description check | ✅ Passed | The PR description covers Purpose, Goals, Approach, Automation tests, and Security checks from the required template, providing comprehensive details about the LLM cost policy implementation. However, Documentation, Samples, Related PRs, and Test environment sections are either missing or incomplete. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


coderabbitai Bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (3)
gateway/gateway-runtime/Makefile (1)

39-39: Deduplicate repeated target-context setup to reduce drift risk.

The same prep command is repeated in 4 targets. Consider extracting it into a reusable make target/macro so future path changes stay consistent.

♻️ Suggested refactor
+.PHONY: prepare-target-context
+
+prepare-target-context:
+	@mkdir -p target/configs
+	@cp ../build.yaml target/
+	@cp -r ../configs/llm-pricing target/configs/
+
 build: ## Build Gateway Runtime Docker image using buildx
 	@echo "Building Gateway Runtime Docker image ($(IMAGE_NAME):$(VERSION))..."
-	@mkdir -p target/configs && cp ../build.yaml target/ && cp -r ../configs/llm-pricing target/configs/
+	@$(MAKE) prepare-target-context
 	docker buildx build -f Dockerfile \
 ...
 build-debug: ## Build debug Gateway Runtime image for remote debugging with dlv (VS Code attach on port 2346)
 	@echo "Building Gateway Runtime debug image ($(IMAGE_NAME)-debug:$(VERSION))..."
-	@mkdir -p target/configs && cp ../build.yaml target/ && cp -r ../configs/llm-pricing target/configs/
+	@$(MAKE) prepare-target-context
 ...
 build-coverage-image: test ## Build Gateway Runtime Docker image with coverage instrumentation
 	@echo "Building Gateway Runtime coverage image ($(IMAGE_NAME)-coverage:$(VERSION))..."
-	@mkdir -p target/configs && cp ../build.yaml target/ && cp -r ../configs/llm-pricing target/configs/
+	@$(MAKE) prepare-target-context
 ...
 build-and-push-multiarch: ## Build and push Gateway Runtime Docker image for multiple architectures (amd64, arm64)
 	@echo "Building and pushing multi-arch Gateway Runtime Docker image ($(VERSION))..."
-	@mkdir -p target/configs && cp ../build.yaml target/ && cp -r ../configs/llm-pricing target/configs/
+	@$(MAKE) prepare-target-context

Also applies to: 62-62, 85-85, 109-109

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/Makefile` at line 39, The repeated prep command
(mkdir -p target/configs && cp ../build.yaml target/ && cp -r
../configs/llm-pricing target/configs/) should be extracted into a single
reusable Makefile target or macro (e.g., a .PHONY target named prepare-configs
or a make variable) and then referenced as a dependency or invoked from the four
targets that currently duplicate the line; update each target to depend on or
call prepare-configs instead of repeating the shell sequence, ensure the new
target performs the same mkdir/cp steps and preserves exit behavior so build
ordering and error handling remain unchanged.
gateway/it/mock-api/paths/unknown-llm/chat.yaml (1)

8-16: Make model and messages required in this endpoint to avoid scenario overlap.

Right now this payload can also represent “missing model” cases, which dilutes the specific unknown-model path this mock is intended to cover.

♻️ Proposed schema tightening
         schema:
           type: object
+          required:
+            - model
+            - messages
           properties:
             model:
               type: string
             messages:
               type: array
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/it/mock-api/paths/unknown-llm/chat.yaml` around lines 8 - 16, The
schema currently allows payloads missing fields; update the OpenAPI schema under
the object schema to require both properties by adding a required array that
lists "model" and "messages" so the endpoint always expects those fields (ensure
the existing properties for model and messages remain unchanged and that
"messages" stays typed as an array of objects).
gateway/it/features/llm-cost-combination.feature (1)

104-109: Assert the budget headers on rejected requests.

Both scenarios only check 429, so regressions in x-ratelimit-cost-limit-dollars and x-ratelimit-cost-remaining-dollars would still pass even though they are part of the rejection contract described for this policy pair.

Suggested coverage tweak
@@
     Then the response status code should be 429
+    And the response header "x-ratelimit-cost-limit-dollars" should exist
+    And the response header "x-ratelimit-cost-remaining-dollars" should exist
@@
     Then the response status code should be 429
+    And the response header "x-ratelimit-cost-limit-dollars" should exist
+    And the response header "x-ratelimit-cost-remaining-dollars" should exist

Also applies to: 186-191

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/it/features/llm-cost-combination.feature` around lines 104 - 109,
Update the feature assertions for the POST to
"http://localhost:8080/combo-openai/openai/v1/chat/completions" (and the other
scenario at lines 186-191) to also validate the rejection headers: assert the
response contains headers "x-ratelimit-cost-limit-dollars" and
"x-ratelimit-cost-remaining-dollars", verify the limit header is present and
numeric, and verify the remaining header is present and represents a value <= 0
(or equals "0" if your policy uses exact zero), so the 429 test covers the full
rejection contract.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@gateway/gateway-controller/default-policies/llm-cost-based-ratelimit.yaml`:
- Around line 35-40: The duration pattern currently allows a leading + or -
which permits signed durations like "-1h"; update the regex for the duration
field in the llm-cost-based-ratelimit policy to disallow any sign prefix so only
non-negative Go duration strings are accepted (i.e. remove the leading "[-+]?"
from the pattern used by the duration property). Ensure the pattern still
matches valid Go duration units (ns, us, µs, ms, s, m, h) and the same numeric
formats used previously.
- Around line 11-16: The schema currently allows an empty params object because
budgetLimits is optional; update the parameters schema to make budgetLimits
mandatory by adding a required array that includes "budgetLimits" (i.e., add
required: ["budgetLimits"] alongside the existing
type/additionalProperties/properties) so a missing budgetLimits fails validation
rather than becoming a silent no-op; reference the parameters -> properties ->
budgetLimits entries when making this change.
- Around line 157-177: The Redis default-value bindings use camelCase keys
(keyPrefix, failureMode, connectionTimeout, readTimeout, writeTimeout) which
drift from the snake_case names used elsewhere (key_prefix, failure_mode,
connection_timeout, read_timeout, write_timeout) causing non-default config to
be ignored; update the property names in this policy so the "wso2/defaultValue"
mappings reference the snake_case identifiers (key_prefix, failure_mode,
connection_timeout, read_timeout, write_timeout) and keep the existing defaults
and enums (e.g., failureMode enum "open"/"closed") intact to restore correct
Redis config binding for distributed budget enforcement.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 836c2554-009c-4196-b5ef-0b715dc89885

📥 Commits

Reviewing files that changed from the base of the PR and between 3435468 and 97fa17c.

⛔ Files ignored due to path filters (1)
  • go.work.sum is excluded by !**/*.sum
📒 Files selected for processing (38)
  • .gitignore
  • gateway/build-lock.yaml
  • gateway/build.yaml
  • gateway/configs/config-template.toml
  • gateway/configs/config.toml
  • gateway/configs/llm-pricing/model_prices.json
  • gateway/gateway-controller/default-policies/llm-cost-based-ratelimit.yaml
  • gateway/gateway-controller/default-policies/llm-cost.yaml
  • gateway/gateway-controller/pkg/config/config.go
  • gateway/gateway-controller/pkg/policy/builder.go
  • gateway/gateway-runtime/Dockerfile
  • gateway/gateway-runtime/Makefile
  • gateway/it/features/llm-cost-combination.feature
  • gateway/it/features/llm-cost.feature
  • gateway/it/mock-api/openapi.yaml
  • gateway/it/mock-api/paths/anthropic/messages.yaml
  • gateway/it/mock-api/paths/anthropic/messages_cache_1hr.yaml
  • gateway/it/mock-api/paths/anthropic/messages_cache_read.yaml
  • gateway/it/mock-api/paths/anthropic/messages_geo_speed.yaml
  • gateway/it/mock-api/paths/anthropic/messages_web_search.yaml
  • gateway/it/mock-api/paths/custom-llm/chat.yaml
  • gateway/it/mock-api/paths/custom-llm/gpt35.yaml
  • gateway/it/mock-api/paths/gemini/generate_content.yaml
  • gateway/it/mock-api/paths/gemini/generate_content_cached.yaml
  • gateway/it/mock-api/paths/gemini/generate_content_thinking.yaml
  • gateway/it/mock-api/paths/mistral/chat.yaml
  • gateway/it/mock-api/paths/openai/chat.yaml
  • gateway/it/mock-api/paths/openai/chat_batch.yaml
  • gateway/it/mock-api/paths/openai/chat_cached.yaml
  • gateway/it/mock-api/paths/openai/chat_flex.yaml
  • gateway/it/mock-api/paths/openai/chat_priority.yaml
  • gateway/it/mock-api/paths/openai/chat_reasoning.yaml
  • gateway/it/mock-api/paths/openai/chat_web_search.yaml
  • gateway/it/mock-api/paths/unknown-llm/chat.yaml
  • gateway/it/mock-api/paths/unknown-llm/no_model_field.yaml
  • gateway/it/steps_health.go
  • gateway/it/suite_test.go
  • gateway/it/test-config.toml
💤 Files with no reviewable changes (1)
  • gateway/it/mock-api/paths/openai/chat.yaml

coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@gateway/gateway-controller/default-policies/llm-cost-based-ratelimit.yaml`:
- Around line 138-145: The memory config bindings in this policy use camelCase
keys maxEntries and cleanupInterval which drift from the `advanced-ratelimit`
config keys max_entries and cleanup_interval causing overrides to be ignored;
update the wso2/defaultValue bindings for the fields named maxEntries and
cleanupInterval to reference
`${config.policy_configurations.ratelimit_v0.memory.max_entries}` and
`${config.policy_configurations.ratelimit_v0.memory.cleanup_interval}`
respectively (leave the schema property names as-is if needed but ensure the
bound config keys match `max_entries` and `cleanup_interval`).
- Line 64: The YAML references
config.policy_configurations.llm_cost_ratelimit.cost_scale_factor but that
section/field doesn't exist; add a cost_scale_factor entry under
policy_configurations.llm_cost_v0 in both config.toml and config-template.toml
(provide a sensible default, e.g., cost_scale_factor = 1.0), and update the YAML
binding to use the correct path
config.policy_configurations.llm_cost_v0.cost_scale_factor (the
wso2/defaultValue key should reference
${config.policy_configurations.llm_cost_v0.cost_scale_factor}); ensure the key
name cost_scale_factor matches exactly between the TOML and the YAML.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: dd7ec1d8-b533-466b-90c5-4849b4ca4abb

📥 Commits

Reviewing files that changed from the base of the PR and between 97fa17c and 52c1880.

📒 Files selected for processing (1)
  • gateway/gateway-controller/default-policies/llm-cost-based-ratelimit.yaml

Saadha123 force-pushed the Feat/llm-cost-user-policy branch from 52c1880 to 4897c21 on March 16, 2026 at 09:58
coderabbitai Bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (2)
gateway/gateway-controller/default-policies/llm-cost-based-ratelimit.yaml (2)

138-145: ⚠️ Potential issue | 🟠 Major

Memory config bindings drift from advanced-ratelimit.

The bindings use maxentries and cleanupinterval but the reference advanced-ratelimit.yaml uses max_entries and cleanup_interval. Non-default memory config will be ignored.

Suggested fix
       maxEntries:
         type: integer
         default: 10000
-        "wso2/defaultValue": "${config.policy_configurations.ratelimit_v0.memory.maxentries}"
+        "wso2/defaultValue": "${config.policy_configurations.ratelimit_v0.memory.max_entries}"
       cleanupInterval:
         type: string
         default: "5m"
-        "wso2/defaultValue": "${config.policy_configurations.ratelimit_v0.memory.cleanupinterval}"
+        "wso2/defaultValue": "${config.policy_configurations.ratelimit_v0.memory.cleanup_interval}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-controller/default-policies/llm-cost-based-ratelimit.yaml`
around lines 138 - 145, The policy's memory config binds use the wrong
keys—change the wso2/defaultValue references for maxEntries and cleanupInterval
to match the advanced-ratelimit names: replace
"${config.policy_configurations.ratelimit_v0.memory.maxentries}" with
"${config.policy_configurations.ratelimit_v0.memory.max_entries}" and replace
"${config.policy_configurations.ratelimit_v0.memory.cleanupinterval}" with
"${config.policy_configurations.ratelimit_v0.memory.cleanup_interval}" so the
defaults for maxEntries and cleanupInterval in the schema (maxEntries,
cleanupInterval) correctly pick up non-default values from the config.

62-64: ⚠️ Potential issue | 🔴 Critical

Config section reference and field are both missing.

Line 64 references ${config.policy_configurations.llm_cost_ratelimit.cost_scale_factor}, but the config template only has [policy_configurations.llm_cost_v0] which contains only pricing_file. Neither the llm_cost_ratelimit section nor the cost_scale_factor field exists.

Suggested fix

Add cost_scale_factor to [policy_configurations.llm_cost_v0] in config-template.toml:

 [policy_configurations.llm_cost_v0]
 pricing_file = "/home/wso2/conf/llm-pricing/model_prices.json"
+cost_scale_factor = 1000000000

Then update the binding in this YAML:

-      "wso2/defaultValue": "${config.policy_configurations.llm_cost_ratelimit.cost_scale_factor}"
+      "wso2/defaultValue": "${config.policy_configurations.llm_cost_v0.cost_scale_factor}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-controller/default-policies/llm-cost-based-ratelimit.yaml`
around lines 62 - 64, The policy YAML references
${config.policy_configurations.llm_cost_ratelimit.cost_scale_factor} but your
config template only defines [policy_configurations.llm_cost_v0] with
pricing_file; add the missing section/field: add cost_scale_factor under the
appropriate policy_configurations block in config-template.toml (or rename the
YAML binding to match the existing llm_cost_v0 keys). Specifically, either add a
new llm_cost_ratelimit table with a cost_scale_factor entry in
config-template.toml, or change the YAML's "wso2/defaultValue" to reference
policy_configurations.llm_cost_v0.pricing_file (or the correct existing key) so
the binding matches an existing config value.
🧹 Nitpick comments (5)
gateway/it/mock-api/paths/mistral/chat.yaml (1)

45-49: Unusual field prompt_audio_seconds for Mistral API response.

The prompt_audio_seconds: null field in the usage object is atypical for Mistral's API response format. If this is intentional for testing edge cases where the cost policy handles unexpected fields, consider adding a comment to clarify. Otherwise, it may be removed for accuracy.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/it/mock-api/paths/mistral/chat.yaml` around lines 45 - 49, The usage
object in the mock Mistral chat response contains an atypical field
prompt_audio_seconds: null; either remove this field to match Mistral's expected
response shape or explicitly document its presence as an intentional edge-case
for cost-policy tests (add an inline comment in paths/mistral/chat.yaml near the
usage object), and if it’s retained update any test or validation code that
consumes the usage object (e.g., mocks or parsers) to handle
prompt_audio_seconds being present and null.
gateway/it/mock-api/paths/openai/chat_flex.yaml (1)

8-19: Consider adding required constraints for consistency.

The chat_cached.yaml in the same directory specifies required: [model, messages] while this schema omits it. For consistency across OpenAI mock endpoints, consider adding the constraint.

Suggested fix
         schema:
           type: object
+          required:
+            - model
+            - messages
           properties:
             model:
               type: string
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/it/mock-api/paths/openai/chat_flex.yaml` around lines 8 - 19, Add a
required constraint to the top-level schema by adding required: - model -
messages so the schema object enforces presence of properties.model and
properties.messages; update the schema node (referencing schema,
properties.model, properties.messages) to match the other mock
(chat_cached.yaml) for consistency across OpenAI mock endpoints.
gateway/gateway-runtime/Dockerfile (1)

157-159: Debug stage lacks directory creation and ownership setup for llm-pricing.

The production stage (lines 203-205) explicitly creates /home/wso2/conf/llm-pricing and sets ownership. The debug stage copies to the same path but skips these steps. While the debug stage runs as root, consider mirroring the production steps for consistency and maintainability:

Suggested fix for debug stage
 COPY --from=target configs/llm-pricing/ /home/wso2/conf/llm-pricing/
 
-RUN chmod +x /app/policy-engine /usr/local/bin/dlv /usr/local/bin/docker-entrypoint.sh /usr/local/bin/health-check.sh
+RUN mkdir -p /home/wso2/conf/llm-pricing && \
+    chmod +x /app/policy-engine /usr/local/bin/dlv /usr/local/bin/docker-entrypoint.sh /usr/local/bin/health-check.sh
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/Dockerfile` around lines 157 - 159, The debug stage
copies configs/llm-pricing into /home/wso2/conf/llm-pricing but doesn’t create
the directory or set ownership like the production stage; update the debug-stage
Dockerfile sequence around the COPY --from=target configs/llm-pricing/
/home/wso2/conf/llm-pricing/ and subsequent RUN chmod lines to also create
/home/wso2/conf/llm-pricing (mkdir -p) and chown it to the same user/group used
in production (e.g., chown -R wso2:wso2 /home/wso2/conf/llm-pricing) so
permissions and layout match production.
gateway/it/mock-api/paths/custom-llm/chat.yaml (1)

8-16: Consider adding required constraints for request body fields.

Unlike the Mistral mock (which specifies required: [model, messages]), this schema lacks required field constraints. For consistency across mock API definitions, consider adding them.

Suggested fix
         schema:
           type: object
+          required:
+            - model
+            - messages
           properties:
             model:
               type: string
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/it/mock-api/paths/custom-llm/chat.yaml` around lines 8 - 16, The
request body schema is missing required constraints for the model and messages
fields; update the schema for the chat operation by adding a required array that
lists "model" and "messages" (i.e., add required: [model, messages] under the
same schema object that contains properties: model and messages) so the mock
matches the Mistral definition and enforces both fields are present.
gateway/it/features/llm-cost-combination.feature (1)

193-201: Consider increasing the wait buffer for window reset test.

The 11-second wait after a 10-second window may be tight in slow CI environments, potentially causing flaky tests. Consider increasing to 12-13 seconds for more reliable test execution.

Suggested adjustment
-    When I wait for 11 seconds
+    When I wait for 12 seconds
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/it/features/llm-cost-combination.feature` around lines 193 - 201,
Increase the wait buffer in the window-reset scenario by changing the step that
currently says "When I wait for 11 seconds" to wait 12–13 seconds instead, so
the feature's scenario (the step before sending POST to
"http://localhost:8080/combo-reset/openai/v1/chat/completions") has a more
reliable delay for slow CI environments.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bfb38e64-99fd-488d-9192-5a90fcccd832

📥 Commits

Reviewing files that changed from the base of the PR and between 52c1880 and 4897c21.

⛔ Files ignored due to path filters (1)
  • go.work.sum is excluded by !**/*.sum
📒 Files selected for processing (38)
  • .gitignore
  • gateway/build-lock.yaml
  • gateway/build.yaml
  • gateway/configs/config-template.toml
  • gateway/configs/config.toml
  • gateway/configs/llm-pricing/model_prices.json
  • gateway/gateway-controller/default-policies/llm-cost-based-ratelimit.yaml
  • gateway/gateway-controller/default-policies/llm-cost.yaml
  • gateway/gateway-controller/pkg/config/config.go
  • gateway/gateway-controller/pkg/policy/builder.go
  • gateway/gateway-runtime/Dockerfile
  • gateway/gateway-runtime/Makefile
  • gateway/it/features/llm-cost-combination.feature
  • gateway/it/features/llm-cost.feature
  • gateway/it/mock-api/openapi.yaml
  • gateway/it/mock-api/paths/anthropic/messages.yaml
  • gateway/it/mock-api/paths/anthropic/messages_cache_1hr.yaml
  • gateway/it/mock-api/paths/anthropic/messages_cache_read.yaml
  • gateway/it/mock-api/paths/anthropic/messages_geo_speed.yaml
  • gateway/it/mock-api/paths/anthropic/messages_web_search.yaml
  • gateway/it/mock-api/paths/custom-llm/chat.yaml
  • gateway/it/mock-api/paths/custom-llm/gpt35.yaml
  • gateway/it/mock-api/paths/gemini/generate_content.yaml
  • gateway/it/mock-api/paths/gemini/generate_content_cached.yaml
  • gateway/it/mock-api/paths/gemini/generate_content_thinking.yaml
  • gateway/it/mock-api/paths/mistral/chat.yaml
  • gateway/it/mock-api/paths/openai/chat.yaml
  • gateway/it/mock-api/paths/openai/chat_batch.yaml
  • gateway/it/mock-api/paths/openai/chat_cached.yaml
  • gateway/it/mock-api/paths/openai/chat_flex.yaml
  • gateway/it/mock-api/paths/openai/chat_priority.yaml
  • gateway/it/mock-api/paths/openai/chat_reasoning.yaml
  • gateway/it/mock-api/paths/openai/chat_web_search.yaml
  • gateway/it/mock-api/paths/unknown-llm/chat.yaml
  • gateway/it/mock-api/paths/unknown-llm/no_model_field.yaml
  • gateway/it/steps_health.go
  • gateway/it/suite_test.go
  • gateway/it/test-config.toml
💤 Files with no reviewable changes (1)
  • gateway/it/mock-api/paths/openai/chat.yaml
🚧 Files skipped from review as they are similar to previous changes (19)
  • gateway/gateway-runtime/Makefile
  • gateway/build.yaml
  • gateway/it/mock-api/paths/unknown-llm/chat.yaml
  • gateway/it/steps_health.go
  • gateway/it/mock-api/paths/custom-llm/gpt35.yaml
  • gateway/it/mock-api/paths/gemini/generate_content.yaml
  • gateway/gateway-controller/default-policies/llm-cost.yaml
  • gateway/it/mock-api/paths/openai/chat_priority.yaml
  • gateway/build-lock.yaml
  • gateway/it/mock-api/paths/anthropic/messages.yaml
  • gateway/gateway-controller/pkg/policy/builder.go
  • gateway/gateway-controller/pkg/config/config.go
  • gateway/it/mock-api/paths/anthropic/messages_cache_read.yaml
  • gateway/it/mock-api/paths/anthropic/messages_geo_speed.yaml
  • gateway/it/test-config.toml
  • gateway/configs/config.toml
  • gateway/it/suite_test.go
  • gateway/it/mock-api/paths/openai/chat_web_search.yaml
  • gateway/it/mock-api/paths/gemini/generate_content_cached.yaml

Comment on lines +25 to +39
example:
  id: resp-custom-001
  object: chat.completion
  created: 1741569952
  model: my-enterprise-llm-v1
  choices:
    - index: 0
      message:
        role: assistant
        content: "Hello from the custom enterprise LLM!"
      finish_reason: stop
  usage:
    prompt_tokens: 80
    completion_tokens: 40
    total_tokens: 120

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the custom model exists in the pricing database with expected rates
fd -t f 'model_prices.json' --exec grep -A 10 'my-enterprise-llm-v1' {} \;

Repository: wso2/api-platform

Length of output: 43


🏁 Script executed:

# Search for model_prices.json or similar pricing files
fd -t f 'model_prices'

Repository: wso2/api-platform

Length of output: 106


🏁 Script executed:

# Also search broadly for any pricing-related files
fd -t f 'price'

Repository: wso2/api-platform

Length of output: 106


🏁 Script executed:

# Check the structure of the gateway/it directory
find gateway/it -type f -name "*.json" | head -20

Repository: wso2/api-platform

Length of output: 43


🏁 Script executed:

cat gateway/configs/llm-pricing/model_prices.json

Repository: wso2/api-platform

Length of output: 50374


🏁 Script executed:

grep -A 5 'my-enterprise-llm-v1' gateway/configs/llm-pricing/model_prices.json

Repository: wso2/api-platform

Length of output: 43


Add my-enterprise-llm-v1 to the pricing database.

The mock response uses a model name that does not exist in gateway/configs/llm-pricing/model_prices.json. The llm-cost policy tests will fail when attempting to look up pricing for this model. Define the model with the expected pricing rates (input=1e-5/token, output=2e-5/token) to produce the expected x-llm-cost header value of 0.0016000000.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/it/mock-api/paths/custom-llm/chat.yaml` around lines 25 - 39, The
mock chat example uses model "my-enterprise-llm-v1" but that model is not
defined in the pricing DB, causing llm-cost policy tests to fail; add an entry
for "my-enterprise-llm-v1" in the pricing file model_prices.json with input rate
1e-5 and output rate 2e-5 so the policy can compute the expected x-llm-cost
(0.0016000000). Locate the model_prices.json and add the object keyed by
"my-enterprise-llm-v1" with fields matching existing entries (e.g.,
"input_per_token": 0.00001, "output_per_token": 0.00002) and ensure it follows
the same structure/format as other models so the lookup used by the llm-cost
policy succeeds.

coderabbitai Bot left a comment

🧹 Nitpick comments (2)
gateway/it/features/llm-cost-based-ratelimit.feature (2)

85-92: Consider asserting header values for more robust verification.

Lines 91-92 only check that rate-limit headers exist. Verifying the actual values (e.g., x-ratelimit-cost-limit-dollars should be "0.000236" and x-ratelimit-cost-remaining-dollars should be approximately "0.000118" after the first request) would catch subtle bugs in cost calculation or header formatting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/it/features/llm-cost-based-ratelimit.feature` around lines 85 - 92,
Update the scenario assertions to verify the actual rate-limit header values
rather than only existence: check that the "x-ratelimit-cost-limit-dollars"
header equals "0.000236" and that "x-ratelimit-cost-remaining-dollars" equals or
approximately equals "0.000118" after the first request (use exact string match
for the limit header and a numeric/regex or tolerance-based comparison for the
remaining header to allow minor floating-point differences); update the step(s)
that currently assert header existence to perform these value checks against the
headers "x-ratelimit-cost-limit-dollars" and
"x-ratelimit-cost-remaining-dollars" so the test catches
cost-calculation/formatting regressions.

675-676: Potential test flakiness due to tight timing buffer.

The 11-second wait for a 10-second window provides only a 1-second buffer. In CI environments with variable timing (container scheduling, resource contention), this could cause intermittent failures. Consider increasing the wait to 12-15 seconds, or alternatively using a shorter window (e.g., 5s) with proportionally longer wait to increase the relative buffer.

🔧 Suggested adjustment
                 params:
                   budgetLimits:
                     - amount: 0.000236
-                      duration: "10s"
+                      duration: "5s"
          - name: llm-cost

Then update the wait:

-    When I wait for 11 seconds
+    When I wait for 8 seconds

This gives a 60% buffer (3s on 5s) instead of 10% (1s on 10s).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/it/features/llm-cost-based-ratelimit.feature` around lines 675 - 676,
The test's timing is fragile: replace the step "When I wait for 11 seconds"
(which only gives a 1s buffer on a 10-second window) with a longer wait to avoid
CI flakiness—e.g., change it to "When I wait for 15 seconds" (or any 12–15s
value) so the wait provides a safer buffer, or alternatively shorten the
rate-limit window used by the scenario (e.g., switch the 10s window to 5s and
wait ~8s) and update the corresponding step accordingly to ensure a larger
relative buffer.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f2fbd480-d885-4e29-a0df-6777b79dfe64

📥 Commits

Reviewing files that changed from the base of the PR and between 4897c21 and 0f70bd2.

📒 Files selected for processing (2)
  • gateway/it/features/llm-cost-based-ratelimit.feature
  • gateway/it/suite_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • gateway/it/suite_test.go

coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
gateway/gateway-controller/default-policies/llm-cost-based-ratelimit.yaml (2)

133-145: Consider adding range constraints on maxEntries for consistency.

The memory config bindings are now correct. For parity with advanced-ratelimit.yaml, consider adding:

  • maxEntries: minimum: 100, maximum: 10000000

This is a minor consistency improvement.

♻️ Suggested constraint
       maxEntries:
         type: integer
         default: 10000
+        minimum: 100
+        maximum: 10000000
         "wso2/defaultValue": "${config.policy_configurations.ratelimit_v0.memory.max_entries}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-controller/default-policies/llm-cost-based-ratelimit.yaml`
around lines 133 - 145, Add numeric range constraints to the memory.maxEntries
schema by adding "minimum: 100" and "maximum: 10000000" under the maxEntries
property so it matches the parity in advanced-ratelimit.yaml; update the
maxEntries entry (within the memory properties block) to include these
constraints alongside the existing type/default and wso2/defaultValue.

88-131: Consider adding range constraints for consistency with advanced-ratelimit.

The Redis config section works correctly, and the binding names now match the reference policy. For schema consistency, advanced-ratelimit.yaml defines:

  • port: minimum: 1, maximum: 65535
  • db: minimum: 0, maximum: 15

Adding these constraints would catch invalid values at validation time rather than runtime.

♻️ Suggested constraints
       port:
         type: integer
         default: 6379
+        minimum: 1
+        maximum: 65535
         "wso2/defaultValue": "${config.policy_configurations.ratelimit_v0.redis.port}"
       ...
       db:
         type: integer
         default: 0
+        minimum: 0
+        maximum: 15
         "wso2/defaultValue": "${config.policy_configurations.ratelimit_v0.redis.db}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-controller/default-policies/llm-cost-based-ratelimit.yaml`
around lines 88 - 131, The Redis schema is missing range constraints on the port
and db properties; update the redis.port property to include minimum: 1 and
maximum: 65535 and update redis.db to include minimum: 0 and maximum: 15
(keeping their type: integer), mirroring the constraints used in
advanced-ratelimit to enforce validation at schema time and prevent invalid
values at runtime.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@gateway/gateway-controller/default-policies/llm-cost-based-ratelimit.yaml`:
- Around line 62-64: The policy references
${config.policy_configurations.llm_cost_ratelimit_v0.cost_scale_factor} but that
section doesn't exist, so either add a new config section named
llm_cost_ratelimit_v0 with a cost_scale_factor key to both config.toml and
config-template.toml, or change the policy binding in
llm-cost-based-ratelimit.yaml to use the existing
policy_configurations.llm_cost_v0.cost_scale_factor instead (update the
"wso2/defaultValue" entry); ensure the chosen symbol (llm_cost_ratelimit_v0 or
llm_cost_v0 and cost_scale_factor) exists in both config files so the binding
overrides the schema default.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 19b789cd-668b-4818-93f2-4edb8868b8d5

📥 Commits

Reviewing files that changed from the base of the PR and between 0f70bd2 and a62981f.

📒 Files selected for processing (1)
  • gateway/gateway-controller/default-policies/llm-cost-based-ratelimit.yaml

Saadha123 force-pushed the Feat/llm-cost-user-policy branch from 5e8e877 to 4ae9963 on March 19, 2026 at 05:24
Saadha123 force-pushed the Feat/llm-cost-user-policy branch from 4ae9963 to 1142f1e on March 19, 2026 at 07:58
Krishanx92 merged commit be98e0c into wso2:main on Mar 19, 2026. 5 checks passed.