fix: per-model(s) top-level values in usage dashboard by ahadas · Pull Request #772 · opendatahub-io/models-as-a-service

ahadas · 2026-04-19T20:06:46Z

Description

The top-levels values of Total Requests, Total Errors, Success Rate, and Active Users, are now calculated also based on the selected model(s).

How Has This Been Tested?

Tested manually

Merge criteria:

The commits are squashed in a cohesive manner and have meaningful messages.
Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
The developer has manually tested the changes and verified that the changes work

Summary by CodeRabbit

Chores
- Improved Usage dashboard to be model-aware and aggregated per user/subscription/namespace, enhancing accuracy for Active Users, Success Rate, Token Consumption, Total Errors, and Total Requests.
- Updated gating and fallback logic for authorization and rate-limit signals to produce more reliable time-series and per-user token consumption reporting.

coderabbitai · 2026-04-19T20:07:00Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

This change rewrites Prometheus queries in the Usage dashboard: Active Users now counts distinct users from authorized_hits{model=~"$model"}; Success Rate is recomputed using per-(user,subscription,limitador_namespace) gated sums with a boolean multiplier from max_over_time(authorized_hits{model=~"$model"}[$__range]) applied to both authorized and limited components; Token Consumption by user aggregates authorized_calls and limited_calls per key (with an or fallback to authorized-only); Total Errors and Total Requests are changed to per-namespace gated sums using the same authorized_hits-based gating and or vector(0) fallbacks.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Issues

High-cardinality resource risk (actionable): Grouping by (user, subscription, limitador_namespace) and gating with max_over_time(...{model=~"$model"}) increases series cardinality and can exhaust Prometheus/Thanos CPU/memory. Cite: CWE-400 (Uncontrolled Resource Consumption). Mitigation: measure cardinality, limit label cardinality, use topk/limits, or pre-aggregate at ingest/scrape time.
Input validation / injection risk for dashboard variables (actionable): Using regex variables like $model and $user directly in selectors enables users to craft expensive or high-cardinality queries. Cite: CWE-20 (Improper Input Validation) and CWE-400. Mitigation: restrict variable options to enumerated lists, sanitize/validate inputs, avoid free-text regex in shared dashboards.
Query join semantics correctness (actionable): Gating via max_over_time(...) > 0 with or vector(...) fallbacks and mixed authorized/limited aggregation can mask missing-series behavior or produce misleading defaults (e.g., or vector(1) affecting success rate). Mitigation: validate results across representative ranges, add tests/comparisons, document fallback semantics, and consider explicit zero/NaN handling to avoid accidental defaults.
Observable accuracy under sparse or dropped labels (actionable): Changing from global sums to per-key gated sums and adding limited_calls in aggregations can alter metric semantics when series are sparse or labels drop. Mitigation: run side-by-side reconciliation with legacy queries, add explanatory notes in the dashboard, and consider preserving legacy queries for accounting.

🚥 Pre-merge checks | ✅ 2

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: updating usage dashboard to calculate top-level values per-model(s), which matches the file changes and PR objectives.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`:
- Around line 308-329: The current join uses "* on(user, subscription,
limitador_namespace) group_left(model)" which replicates left-side aggregates
per model and inflates totals; update the expression so the right-side
(authorized_hits / limited_hits) is reduced to the same join key before
multiplying (e.g. replace the group_left(model) join with a 1:1 multiplication
where the right operand is aggregated with max by (user, subscription,
limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range]))
or the equivalent for limited_calls), and apply the identical change for
totalErrors (limited_calls), successRate occurrences and activeUsers spots
called out in the comment to ensure no per-model duplication.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro Plus

Run ID: 44fd8fcb-f184-4259-89ba-778f35a752be

📥 Commits

Reviewing files that changed from the base of the PR and between c01dc5b and ddd3414.

📒 Files selected for processing (1)

deployment/components/observability/observability/dashboards/usage-dashboard.yaml

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

deployment/components/observability/observability/dashboards/usage-dashboard.yaml (2)
228-229: ⚠️ Potential issue | 🟡 Minor

group_left(model) still present in tokenConsumptionByUser — intentional, but worth double-checking.

The PR drops group_left(model) from the four top-level panels, which correctly resolves the prior inflation bug. It is retained here (lines 228–229 and 246–247) because this table legitimately needs the model label fanned onto the call aggregates for per-model rows. That is the correct shape for this panel, but it means the same caveat as above applies: a user's call total is replicated to every model they touched in-range, so summing the Requests/Rate-Limited columns in this table across models will not equal the Total Requests/Total Errors stat panels. Confirm this divergence is acceptable UX, or add a note to the panel description.

Also applies to: 246-247
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`
around lines 228 - 229, The tokenConsumptionByUser table intentionally keeps
group_left(model) (see the expression using "group_left(model)" and the metric
authorized_hits) which fans the user's total onto each model row, causing
per-model rows to replicate a user's total so summing Requests/Rate-Limited
across models won't match the overall Total Requests/Total Errors panels; update
the tokenConsumptionByUser panel description (or add a short note in its panel
metadata) to explicitly state this behavior and warn users that per-model row
sums will not equal the top-level totals.
84-146: ⚠️ Potential issue | 🟠 Major

Per-model filtering attributes full call counts despite unlabeled metrics.

authorized_calls and limited_calls carry no model label (confirmed in deployment/components/observability/grafana/dashboard-platform-admin.yaml:627). The join at lines 90–91, 125–126, 136–137, 145–146 multiplies by (0 * max_over_time(authorized_hits{model=~"$model"}) + 1), which is a presence check: if the tuple had any hits for the selected model, the full count is included; otherwise zero.

Consequence: A user with 80 calls under model A and 20 under model B (total 100) will contribute 100 to totals when $model=A is selected, not 80. Sums across individual models exceed the "all models" total.

Either document this limitation in panel descriptions, or compute proportional attribution—multiply the tuple's call total by sum by(user,subscription,limitador_namespace)(authorized_hits{model=~"$model"}) / sum by(user,subscription,limitador_namespace)(authorized_hits) so reported totals reflect actual model distribution.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`
around lines 84 - 146, The panels (e.g., Success Rate and the user count query)
currently use the presence-check multiplier "(0 *
max_over_time(authorized_hits{model=~\"$model\"}[$__range]) + 1)" which causes
full call counts from authorized_calls/limited_calls to be included when any hit
exists for the model; replace each occurrence of that presence-check (the exact
token appears in the Prometheus queries) with a proportional attribution
expression that multiplies call totals by the model share, e.g. replace with
"sum
by(user,subscription,limitador_namespace)(authorized_hits{model=~\"$model\"}[$__range])
/ sum by(user,subscription,limitador_namespace)(authorized_hits[$__range])"
(apply the same pattern for limited_calls where the multiplier is used) and
ensure you guard against divide-by-zero by coalescing the denominator (e.g., add
a small zero-default or wrap with or vector(0)) so totals reflect actual
per-model distribution rather than an existence check.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`:
- Around line 275-283: The current inner-join multiplier uses 0 * max by (user,
subscription, limitador_namespace)
(max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1 which drops
(user, subscription, limitador_namespace) tuples that have only limited_calls
and no authorized_hits; update that expression so the tuple selector is derived
from the union of authorized_hits and limited_calls series (e.g. replace the
max_over_time(authorized_hits{model=~"$model"}[$__range]) call with an OR/union
of max_over_time(authorized_hits{model=~"$model"}[$__range]) or
max_over_time(limited_calls{user!="", user=~"$user",
subscription=~"$subscription"}[$__range]) or use
max(max_over_time(...authorized_hits...), max_over_time(...limited_calls...)) so
rate-limited-only users remain in the join and their errors/success-rate
denominator are preserved).

---

Outside diff comments:
In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`:
- Around line 228-229: The tokenConsumptionByUser table intentionally keeps
group_left(model) (see the expression using "group_left(model)" and the metric
authorized_hits) which fans the user's total onto each model row, causing
per-model rows to replicate a user's total so summing Requests/Rate-Limited
across models won't match the overall Total Requests/Total Errors panels; update
the tokenConsumptionByUser panel description (or add a short note in its panel
metadata) to explicitly state this behavior and warn users that per-model row
sums will not equal the top-level totals.
- Around line 84-146: The panels (e.g., Success Rate and the user count query)
currently use the presence-check multiplier "(0 *
max_over_time(authorized_hits{model=~\"$model\"}[$__range]) + 1)" which causes
full call counts from authorized_calls/limited_calls to be included when any hit
exists for the model; replace each occurrence of that presence-check (the exact
token appears in the Prometheus queries) with a proportional attribution
expression that multiplies call totals by the model share, e.g. replace with
"sum
by(user,subscription,limitador_namespace)(authorized_hits{model=~\"$model\"}[$__range])
/ sum by(user,subscription,limitador_namespace)(authorized_hits[$__range])"
(apply the same pattern for limited_calls where the multiplier is used) and
ensure you guard against divide-by-zero by coalescing the denominator (e.g., add
a small zero-default or wrap with or vector(0)) so totals reflect actual
per-model distribution rather than an existence check.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro Plus

Run ID: d7fee6df-f04b-44bb-9b93-43af4a68e8ec

📥 Commits

Reviewing files that changed from the base of the PR and between ddd3414 and 3ee4658.

📒 Files selected for processing (1)

deployment/components/observability/observability/dashboards/usage-dashboard.yaml

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

deployment/components/observability/observability/dashboards/usage-dashboard.yaml (1)

132-172: ⚠️ Potential issue | 🟠 Major

Prevent limited-only traffic from defaulting to 100% success.

When the authorized side is empty, the denominator addition becomes empty and the final or vector(1) reports success as 1. Wrap the authorized sums with or vector(0) before the division/addition.

Proposed fix

-                        sum(
-                          sum by (user, subscription, limitador_namespace) (
-                            increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
-                          )
-                          * on(user, subscription, limitador_namespace)
-                          (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
-                        )
+                        (
+                          sum(
+                            sum by (user, subscription, limitador_namespace) (
+                              increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+                            )
+                            * on(user, subscription, limitador_namespace)
+                            (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
+                          )
+                          or vector(0)
+                        )
@@
-                          sum(
-                            sum by (user, subscription, limitador_namespace) (
-                              increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
-                            )
-                            * on(user, subscription, limitador_namespace)
-                            (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
-                          )
+                          (
+                            sum(
+                              sum by (user, subscription, limitador_namespace) (
+                                increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+                              )
+                              * on(user, subscription, limitador_namespace)
+                              (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
+                            )
+                            or vector(0)
+                          )

Verify with a selected model/range where limited_calls > 0 and authorized_calls is absent; expected success rate is 0, not 1.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`
around lines 132 - 172, The current PromQL can return 1 when authorized_calls
are absent because the authorized sums evaluate to empty; to fix this, wrap the
authorized-side sum expressions (the sum by (...)
(increase(authorized_calls{...}[$__range])) blocks and any compound sum that
uses authorized_hits) with "or vector(0)" so they yield 0 instead of empty
before the division/addition; update the expressions referencing
authorized_calls and the authorized_hits-derived max_over_time aggregates (used
with on(user, subscription, limitador_namespace)) to apply "or vector(0)" and
ensure the denominator calculation includes those zero values so a case with
only limited_calls returns 0 success rather than defaulting to 1.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`:
- Around line 87-95: The expression that adds summed authorized_calls and
limited_calls loses labelsets because PromQL + does an inner-join; update the
two summed subexpressions (the sum by (user, subscription, limitador_namespace)
over increase(authorized_calls...) and the sum by (...) over
increase(limited_calls...)) to zero-fill missing series by appending "or
vector(0)" to each sum so unmatched users are preserved before the addition;
apply the same zero-fill pattern to the similar tokenConsumption aggregation
noted in the review and keep totalRequests as-is since it already handles or
vector(0).

---

Duplicate comments:
In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`:
- Around line 132-172: The current PromQL can return 1 when authorized_calls are
absent because the authorized sums evaluate to empty; to fix this, wrap the
authorized-side sum expressions (the sum by (...)
(increase(authorized_calls{...}[$__range])) blocks and any compound sum that
uses authorized_hits) with "or vector(0)" so they yield 0 instead of empty
before the division/addition; update the expressions referencing
authorized_calls and the authorized_hits-derived max_over_time aggregates (used
with on(user, subscription, limitador_namespace)) to apply "or vector(0)" and
ensure the denominator calculation includes those zero values so a case with
only limited_calls returns 0 success rather than defaulting to 1.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro Plus

Run ID: 3f955c0b-9d46-4fa0-a7ad-88da58115c44

📥 Commits

Reviewing files that changed from the base of the PR and between 3ee4658 and 91eac54.

📒 Files selected for processing (1)

deployment/components/observability/observability/dashboards/usage-dashboard.yaml

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (2)

deployment/components/observability/observability/dashboards/usage-dashboard.yaml (2)

141-182: ⚠️ Potential issue | 🟠 Major

Zero-fill the authorized side before the success-rate division.

For rate-limited-only traffic, sum(authorized_calls...) is empty; PromQL vector arithmetic then drops the limited-only denominator, and the trailing or vector(1) reports 100% instead of 0%.

Proposed fix

                   query: |-
                     (
                       (
-                        sum(
-                          sum by (user, subscription, limitador_namespace) (
-                            increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+                        (
+                          sum(
+                            sum by (user, subscription, limitador_namespace) (
+                              increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+                            )
+                            * on(user, subscription, limitador_namespace)
+                            (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
                           )
-                          * on(user, subscription, limitador_namespace)
-                          (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
+                          or vector(0)
                         )
                       )
                       /
                       (
                         (
-                          sum(
-                            sum by (user, subscription, limitador_namespace) (
-                              increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+                          (
+                            sum(
+                              sum by (user, subscription, limitador_namespace) (
+                                increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+                              )
+                              * on(user, subscription, limitador_namespace)
+                              (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
                             )
-                            * on(user, subscription, limitador_namespace)
-                            (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
+                            or vector(0)
                           )
                           +
                           (

Verify with a selected model/user that has limited_calls > 0 and authorized_calls == 0; expected success rate is 0, not 1.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`
around lines 141 - 182, The success-rate query returns 1 for limited-only
traffic because the left-hand sum of authorized_calls is absent and PromQL drops
that vector during division; update the query to zero-fill the authorized side
by adding an explicit fallback (e.g., apply "or vector(0)" to the aggregate of
sum by (user, subscription, limitador_namespace)
(increase(authorized_calls...[$__range])) or otherwise wrap that
authorized_calls sum with an "or vector(0)" before the multiplication and
division so users/models with limited_calls>0 but authorized_calls==0 yield a 0%
success rate instead of 100%; adjust the corresponding expressions that
reference authorized_calls (and their multiplied form) to use this zero-filled
result.

257-303: ⚠️ Potential issue | 🟠 Major

Fix the table joins before multi-model tuples break or misattribute rows.

The LHS is one series per (user, subscription, limitador_namespace), while the RHS can be multiple series per same key due to model. PromQL group_left requires the left vector to have higher cardinality, so multi-model RHS matches can fail; switching to group_right would still copy the same model-less call count to every model. Use a request/error metric that carries model, or aggregate these columns without model. See Prometheus vector matching docs: https://prometheus.io/docs/prometheus/latest/querying/operators/#many-to-one-and-one-to-many-vector-matches

Read-only verification:

#!/bin/bash
# Confirm whether the call counters actually carry a model label in instrumentation/usages.
# Expected safe result: authorized_calls and limited_calls are emitted with a model label,
# or this table should not present them as per-model values.
rg -n -C4 '\b(authorized_calls|limited_calls|authorized_hits)\b'

Runtime PromQL check:

count by (user, subscription, limitador_namespace) (
  max by (user, subscription, limitador_namespace, model) (
    max_over_time(authorized_hits{model=~"$model"}[$__range])
  )
) > 1

Any returned series means this table join is unsafe for that selected model set.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`
around lines 257 - 303, The joins are unsafe because
authorized_calls/limited_calls are grouped by (user, subscription,
limitador_namespace) while the RHS uses model, causing many-to-one mismatches
with group_left/group_right; fix by either (A) using a request/error metric that
includes the model label (replace usages of
authorized_calls/limited_calls/authorized_hits in the expressions with the
model-bearing metric) so vector matching aligns on model, or (B) remove model
from the RHS aggregation (aggregate to sum by (user, subscription,
limitador_namespace) or sum by (user, subscription, model) consistently) and
then use a safe one-to-one match (keep group_left only when left has higher
cardinality) so authorized_calls, limited_calls, authorized_hits and the
increase(...) expressions share the same label set for matching.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`:
- Around line 108-112: The queries are incorrectly filtering by model because
they rely on tuple existence instead of joining the model label from
authorized_hits; update the affected panels to join the model label using a
group_left(model) pattern so that authorized_calls/limited_calls are multiplied
only for the selected model. Specifically, for metrics authorized_calls and
limited_calls, replace the current tuple-existence expression with a join like:
sum by (user,subscription,limitador_namespace)(increase(authorized_calls{...}))
* on(user,subscription,limitador_namespace) group_left(model) (0 * max by
(user,subscription,limitador_namespace,model)(max_over_time(authorized_hits{model=~"$model"}[$__range]))
+ 1) (same for limited_calls using limited_calls and limited_calls metric
names), and apply this change to the panels referenced (the blocks around the
shown diff and the other panels noted) so the model filter is enforced via
group_left(model).

---

Duplicate comments:
In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`:
- Around line 141-182: The success-rate query returns 1 for limited-only traffic
because the left-hand sum of authorized_calls is absent and PromQL drops that
vector during division; update the query to zero-fill the authorized side by
adding an explicit fallback (e.g., apply "or vector(0)" to the aggregate of sum
by (user, subscription, limitador_namespace)
(increase(authorized_calls...[$__range])) or otherwise wrap that
authorized_calls sum with an "or vector(0)" before the multiplication and
division so users/models with limited_calls>0 but authorized_calls==0 yield a 0%
success rate instead of 100%; adjust the corresponding expressions that
reference authorized_calls (and their multiplied form) to use this zero-filled
result.
- Around line 257-303: The joins are unsafe because
authorized_calls/limited_calls are grouped by (user, subscription,
limitador_namespace) while the RHS uses model, causing many-to-one mismatches
with group_left/group_right; fix by either (A) using a request/error metric that
includes the model label (replace usages of
authorized_calls/limited_calls/authorized_hits in the expressions with the
model-bearing metric) so vector matching aligns on model, or (B) remove model
from the RHS aggregation (aggregate to sum by (user, subscription,
limitador_namespace) or sum by (user, subscription, model) consistently) and
then use a safe one-to-one match (keep group_left only when left has higher
cardinality) so authorized_calls, limited_calls, authorized_hits and the
increase(...) expressions share the same label set for matching.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro Plus

Run ID: b9c61553-c149-4a9b-9576-7acc3e8d775d

📥 Commits

Reviewing files that changed from the base of the PR and between 91eac54 and 3513ada.

📒 Files selected for processing (1)

deployment/components/observability/observability/dashboards/usage-dashboard.yaml

coderabbitai

♻️ Duplicate comments (3)

deployment/components/observability/observability/dashboards/usage-dashboard.yaml (3)

108-142: ⚠️ Potential issue | 🟠 Major

Zero-fill the authorized side before computing Success Rate.

When authorized_calls has no matching series but limited_calls does, the unfilled authorized sum(...) makes both the numerator and denominator empty, so the query falls through to or vector(1) and reports 100% instead of 0%.

Proposed fix

                   query: |-
                     (
                       (
-                        sum(
-                          sum by (user, subscription, limitador_namespace) (
-                            increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+                        (
+                          sum(
+                            sum by (user, subscription, limitador_namespace) (
+                              increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+                            )
+                            * on(user, subscription, limitador_namespace)
+                            (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
                           )
-                          * on(user, subscription, limitador_namespace)
-                          (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
+                          or vector(0)
                         )
                       )
                       /
                       (
                         (
-                          sum(
-                            sum by (user, subscription, limitador_namespace) (
-                              increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+                          (
+                            sum(
+                              sum by (user, subscription, limitador_namespace) (
+                                increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])
+                              )
+                              * on(user, subscription, limitador_namespace)
+                              (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
                             )
-                            * on(user, subscription, limitador_namespace)
-                            (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1)
+                            or vector(0)
                           )
                           +
                           (

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`
around lines 108 - 142, The authorized side of the Success Rate expression can
be empty when only limited_calls exist, causing both numerator and denominator
to be missing and the whole expression to fall through to "or vector(1)"; modify
the authorized sum/increase expression(s) that reference authorized_calls (and
the grouped sum by user, subscription, limitador_namespace) to zero-fill when
empty (e.g., append an "or vector(0)" analogous to the limited_calls block) so
the numerator and denominator evaluate to 0 instead of empty; ensure the
zero-fill is applied consistently to the
sum(increase(authorized_calls{...}[$__range]) ...) terms and preserves the
existing on(...) and max_over_time(authorized_hits{model=~"$model"}) logic.

217-225: ⚠️ Potential issue | 🟠 Major

Preserve limited-only request rows in the table aggregation.

(authorized + limited) or authorized recovers authorized-only rows, but still drops rows that only have limited_calls. Add the limited side as a fallback, or zero-fill both operands before adding.

Proposed fix

                         (
                           (
                             sum by (user, subscription, limitador_namespace) (increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]))
                             +
                             sum by (user, subscription, limitador_namespace) (increase(limited_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]))
                           )
                           or
                           sum by (user, subscription, limitador_namespace) (increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]))
+                          or
+                          sum by (user, subscription, limitador_namespace) (increase(limited_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]))
                         )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`
around lines 217 - 225, The current PromQL expression "(authorized_calls +
limited_calls) or authorized_calls" still drops rows that only have
limited_calls; update the expression to preserve limited-only rows by using "(
(sum by (user, subscription, limitador_namespace)
(increase(authorized_calls{user!=\"\", user=~\"$user\",
subscription=~\"$subscription\"}[$__range])) + sum by (user, subscription,
limitador_namespace) (increase(limited_calls{user!=\"\", user=~\"$user\",
subscription=~\"$subscription\"}[$__range])) ) or sum by (user, subscription,
limitador_namespace) (increase(authorized_calls{user!=\"\", user=~\"$user\",
subscription=~\"$subscription\"}[$__range])) or sum by (user, subscription,
limitador_namespace) (increase(limited_calls{user!=\"\", user=~\"$user\",
subscription=~\"$subscription\"}[$__range]))" so that authorized-only,
limited-only, and combined rows are all retained; alternatively zero-fill both
operands with or-zero (e.g., using or vector(0)) before adding to avoid dropped
series when summing. Ensure you update the aggregation lines that reference
increase(authorized_calls...) and increase(limited_calls...) grouped by user,
subscription, limitador_namespace.

278-279: ⚠️ Potential issue | 🟠 Major

These totals are still tuple-scoped, not model-scoped.

The authorized_hits{model=~"$model"} multiplier only proves that a (user, subscription, limitador_namespace) tuple has selected-model hits; it does not partition authorized_calls or limited_calls by model. Mixed-model tuples will include non-selected model requests/errors, while rate-limited-only model traffic can still be dropped. Use model-labeled request/error counters directly, or add model-scoped recording metrics before presenting these as per-model totals.

#!/usr/bin/env bash
set -euo pipefail

# Read-only runtime check. Set PROM_URL to the Prometheus base URL, e.g. https://prometheus.example.com
: "${PROM_URL:?Set PROM_URL to the Prometheus base URL}"

start="$(date -u -d '7 days ago' +%s)"
end="$(date -u +%s)"

echo "Check whether call counters carry a model label:"
for metric in authorized_calls limited_calls authorized_hits; do
  curl -fsG "$PROM_URL/api/v1/series" \
    --data-urlencode "match[]=${metric}{user!=\"\",subscription!=\"\"}" \
    --data-urlencode "start=${start}" \
    --data-urlencode "end=${end}" \
  | jq --arg metric "$metric" '{metric: $metric, series: (.data | length), model_label_values: ([.data[]?.model] | map(select(. != null)) | unique)}'
done

echo "Check for tuples that have more than one model in the selected time window:"
curl -fsG "$PROM_URL/api/v1/query" \
  --data-urlencode 'query=count by (user, subscription, limitador_namespace) (max by (user, subscription, limitador_namespace, model) (max_over_time(authorized_hits{model!=""}[7d]))) > 1' \
| jq '.data.result'

Expected: either authorized_calls/limited_calls expose model and the dashboard should filter those metrics directly, or the mixed-model tuple query returns no series. If neither is true, these totals cannot be accurately model-scoped with the current query.

Also applies to: 312-324

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`
around lines 278 - 279, The dashboard's PromQL currently multiplies tuple-scoped
authorized_hits by totals, so per-model totals remain incorrect for tuples that
carry multiple model values; update queries to use model-labeled counters (e.g.
authorized_calls{model=~"$model"}, limited_calls{model=~"$model"},
authorized_hits{model=~"$model"}) directly or add recording rules that
pre-aggregate model-scoped sums (e.g. sum by (model, user, subscription,
limitador_namespace) (...) ) and switch the dashboard to those recording metric
names; locate and replace uses of the current expression that references
authorized_hits and the tuple-multiplier (the expression with
max_over_time(authorized_hits{model=~"$model"}[$__range]) and any similar blocks
referencing authorized_calls/limited_calls) to ensure the queries are truly
model-scoped.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`:
- Around line 108-142: The authorized side of the Success Rate expression can be
empty when only limited_calls exist, causing both numerator and denominator to
be missing and the whole expression to fall through to "or vector(1)"; modify
the authorized sum/increase expression(s) that reference authorized_calls (and
the grouped sum by user, subscription, limitador_namespace) to zero-fill when
empty (e.g., append an "or vector(0)" analogous to the limited_calls block) so
the numerator and denominator evaluate to 0 instead of empty; ensure the
zero-fill is applied consistently to the
sum(increase(authorized_calls{...}[$__range]) ...) terms and preserves the
existing on(...) and max_over_time(authorized_hits{model=~"$model"}) logic.
- Around line 217-225: The current PromQL expression "(authorized_calls +
limited_calls) or authorized_calls" still drops rows that only have
limited_calls; update the expression to preserve limited-only rows by using "(
(sum by (user, subscription, limitador_namespace)
(increase(authorized_calls{user!=\"\", user=~\"$user\",
subscription=~\"$subscription\"}[$__range])) + sum by (user, subscription,
limitador_namespace) (increase(limited_calls{user!=\"\", user=~\"$user\",
subscription=~\"$subscription\"}[$__range])) ) or sum by (user, subscription,
limitador_namespace) (increase(authorized_calls{user!=\"\", user=~\"$user\",
subscription=~\"$subscription\"}[$__range])) or sum by (user, subscription,
limitador_namespace) (increase(limited_calls{user!=\"\", user=~\"$user\",
subscription=~\"$subscription\"}[$__range]))" so that authorized-only,
limited-only, and combined rows are all retained; alternatively zero-fill both
operands with or-zero (e.g., using or vector(0)) before adding to avoid dropped
series when summing. Ensure you update the aggregation lines that reference
increase(authorized_calls...) and increase(limited_calls...) grouped by user,
subscription, limitador_namespace.
- Around line 278-279: The dashboard's PromQL currently multiplies tuple-scoped
authorized_hits by totals, so per-model totals remain incorrect for tuples that
carry multiple model values; update queries to use model-labeled counters (e.g.
authorized_calls{model=~"$model"}, limited_calls{model=~"$model"},
authorized_hits{model=~"$model"}) directly or add recording rules that
pre-aggregate model-scoped sums (e.g. sum by (model, user, subscription,
limitador_namespace) (...) ) and switch the dashboard to those recording metric
names; locate and replace uses of the current expression that references
authorized_hits and the tuple-multiplier (the expression with
max_over_time(authorized_hits{model=~"$model"}[$__range]) and any similar blocks
referencing authorized_calls/limited_calls) to ensure the queries are truly
model-scoped.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro Plus

Run ID: 0ebab8fe-ec58-4156-8f8d-44ec51a23149

📥 Commits

Reviewing files that changed from the base of the PR and between 3513ada and 6473edd.

📒 Files selected for processing (1)

deployment/components/observability/observability/dashboards/usage-dashboard.yaml

The top-levels values of Total Requests, Total Errors, Success Rate, and Active Users, are now calculated also based on the selected model(s). Signed-off-by: Arik Hadas <ahadas@redhat.com>

rhods-ci-bot · 2026-04-20T15:34:34Z

@ahadas: The following test has Succeeded:

OCI Artifact Browser URL

View in Artifact Browser

Inspecting Test Artifacts Manually

To inspect your test artifacts manually, follow these steps:

Install ORAS (see the ORAS installation guide).
Download artifacts with the following commands:

mkdir -p oras-artifacts
cd oras-artifacts
oras pull quay.io/opendatahub/odh-ci-artifacts:maas-group-test-5h5ff

jland-redhat

lgtm

openshift-ci · 2026-04-21T13:19:30Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahadas, jland-redhat

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [jland-redhat]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Automated promotion of **9 commit(s)** from `main` to `stable`. ``` fb2ea25 feat: add tenant CRD to e2e artifact collection and debug report (#787) 1b8f212 chore: restrict rbac for db secret (#779) e746008 docs: add/update documentation for Maas Tenant (#773) 147eaa2 fix: per-model(s) top-level values in usage dashboard (#772) b327b34 feat: add OIDC token support for model discovery via /v1/models (#703) dbf6d03 fix: validate token rate limits and skip invalid subs in TRLP aggregation (#752) fae753e chore: add .worktrees/ to .gitignore (#774) c01dc5b fix: minor updates for external model (#771) 65ca551 fix: add explicit command to v0.8.2 simulator models to prevent bash … (#765) ```

Automated promotion of **11 commit(s)** from `stable` to `rhoai`. ``` fb2ea25 feat: add tenant CRD to e2e artifact collection and debug report (#787) 1b8f212 chore: restrict rbac for db secret (#779) e746008 docs: add/update documentation for Maas Tenant (#773) 147eaa2 fix: per-model(s) top-level values in usage dashboard (#772) b327b34 feat: add OIDC token support for model discovery via /v1/models (#703) dbf6d03 fix: validate token rate limits and skip invalid subs in TRLP aggregation (#752) 89fba29 chore: promote main to stable (#770) fae753e chore: add .worktrees/ to .gitignore (#774) c01dc5b fix: minor updates for external model (#771) 65ca551 fix: add explicit command to v0.8.2 simulator models to prevent bash … (#765) ```

openshift-ci Bot requested review from dmytro-zaharnytskyi and jland-redhat April 19, 2026 20:06

coderabbitai Bot reviewed Apr 19, 2026

View reviewed changes

Comment thread deployment/components/observability/observability/dashboards/usage-dashboard.yaml

coderabbitai Bot reviewed Apr 19, 2026

View reviewed changes

Comment thread deployment/components/observability/observability/dashboards/usage-dashboard.yaml

coderabbitai Bot reviewed Apr 20, 2026

View reviewed changes

Comment thread deployment/components/observability/observability/dashboards/usage-dashboard.yaml

coderabbitai Bot reviewed Apr 20, 2026

View reviewed changes

Comment thread deployment/components/observability/observability/dashboards/usage-dashboard.yaml Outdated

ahadas force-pushed the usage_per_models branch from 3513ada to 6473edd Compare April 20, 2026 10:47

coderabbitai Bot reviewed Apr 20, 2026

View reviewed changes

ahadas marked this pull request as draft April 20, 2026 11:19

openshift-ci Bot added the do-not-merge/work-in-progress label Apr 20, 2026

ahadas force-pushed the usage_per_models branch from 6473edd to d6c3728 Compare April 20, 2026 14:10

fix: per-model(s) top-level values in usage dashboard

c3d731f

The top-levels values of Total Requests, Total Errors, Success Rate, and Active Users, are now calculated also based on the selected model(s). Signed-off-by: Arik Hadas <ahadas@redhat.com>

ahadas force-pushed the usage_per_models branch from d6c3728 to c3d731f Compare April 20, 2026 14:35

ahadas marked this pull request as ready for review April 20, 2026 14:39

openshift-ci Bot removed the do-not-merge/work-in-progress label Apr 20, 2026

openshift-ci Bot requested review from SB159 and noyitz April 20, 2026 14:39

jland-redhat approved these changes Apr 21, 2026

View reviewed changes

openshift-ci Bot assigned jland-redhat Apr 21, 2026

openshift-ci Bot added the lgtm label Apr 21, 2026

openshift-ci Bot added the approved label Apr 21, 2026

openshift-merge-bot Bot merged commit 147eaa2 into opendatahub-io:main Apr 21, 2026
11 checks passed

ahadas deleted the usage_per_models branch April 21, 2026 13:35

chaitanya1731 mentioned this pull request Apr 22, 2026

chore: sync rhoai-3.4 with main red-hat-data-services/models-as-a-service#392

Merged

Conversation

ahadas commented Apr 19, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

How Has This Been Tested?

Merge criteria:

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Estimated code review effort

Issues

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

rhods-ci-bot commented Apr 20, 2026

OCI Artifact Browser URL

Inspecting Test Artifacts Manually

Uh oh!

jland-redhat left a comment

Choose a reason for hiding this comment

Uh oh!

openshift-ci Bot commented Apr 21, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

ahadas commented Apr 19, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 19, 2026 •

edited

Loading