fix: per-model(s) top-level values in usage dashboard#772
fix: per-model(s) top-level values in usage dashboard#772openshift-merge-bot[bot] merged 1 commit intoopendatahub-io:mainfrom
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThis change rewrites Prometheus queries in the Usage dashboard: Active Users now counts distinct users from Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Issues
🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`:
- Around line 308-329: The current join uses "* on(user, subscription,
limitador_namespace) group_left(model)" which replicates left-side aggregates
per model and inflates totals; update the expression so the right-side
(authorized_hits / limited_hits) is reduced to the same join key before
multiplying (e.g. replace the group_left(model) join with a 1:1 multiplication
where the right operand is aggregated with max by (user, subscription,
limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range]))
or the equivalent for limited_calls), and apply the identical change for
totalErrors (limited_calls), successRate occurrences and activeUsers spots
called out in the comment to ensure no per-model duplication.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro Plus
Run ID: 44fd8fcb-f184-4259-89ba-778f35a752be
📒 Files selected for processing (1)
deployment/components/observability/observability/dashboards/usage-dashboard.yaml
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
deployment/components/observability/observability/dashboards/usage-dashboard.yaml (2)
228-229:⚠️ Potential issue | 🟡 Minor
group_left(model)still present intokenConsumptionByUser— intentional, but worth double-checking.The PR drops
group_left(model)from the four top-level panels, which correctly resolves the prior inflation bug. It is retained here (lines 228–229 and 246–247) because this table legitimately needs themodellabel fanned onto the call aggregates for per-model rows. That is the correct shape for this panel, but it means the same caveat as above applies: a user's call total is replicated to every model they touched in-range, so summing the Requests/Rate-Limited columns in this table across models will not equal the Total Requests/Total Errors stat panels. Confirm this divergence is acceptable UX, or add a note to the panel description.Also applies to: 246-247
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deployment/components/observability/observability/dashboards/usage-dashboard.yaml` around lines 228 - 229, The tokenConsumptionByUser table intentionally keeps group_left(model) (see the expression using "group_left(model)" and the metric authorized_hits) which fans the user's total onto each model row, causing per-model rows to replicate a user's total so summing Requests/Rate-Limited across models won't match the overall Total Requests/Total Errors panels; update the tokenConsumptionByUser panel description (or add a short note in its panel metadata) to explicitly state this behavior and warn users that per-model row sums will not equal the top-level totals.
84-146:⚠️ Potential issue | 🟠 MajorPer-model filtering attributes full call counts despite unlabeled metrics.
authorized_callsandlimited_callscarry nomodellabel (confirmed indeployment/components/observability/grafana/dashboard-platform-admin.yaml:627). The join at lines 90–91, 125–126, 136–137, 145–146 multiplies by(0 * max_over_time(authorized_hits{model=~"$model"}) + 1), which is a presence check: if the tuple had any hits for the selected model, the full count is included; otherwise zero.Consequence: A user with 80 calls under model A and 20 under model B (total 100) will contribute 100 to totals when
$model=Ais selected, not 80. Sums across individual models exceed the "all models" total.Either document this limitation in panel descriptions, or compute proportional attribution—multiply the tuple's call total by
sum by(user,subscription,limitador_namespace)(authorized_hits{model=~"$model"}) / sum by(user,subscription,limitador_namespace)(authorized_hits)so reported totals reflect actual model distribution.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deployment/components/observability/observability/dashboards/usage-dashboard.yaml` around lines 84 - 146, The panels (e.g., Success Rate and the user count query) currently use the presence-check multiplier "(0 * max_over_time(authorized_hits{model=~\"$model\"}[$__range]) + 1)" which causes full call counts from authorized_calls/limited_calls to be included when any hit exists for the model; replace each occurrence of that presence-check (the exact token appears in the Prometheus queries) with a proportional attribution expression that multiplies call totals by the model share, e.g. replace with "sum by(user,subscription,limitador_namespace)(authorized_hits{model=~\"$model\"}[$__range]) / sum by(user,subscription,limitador_namespace)(authorized_hits[$__range])" (apply the same pattern for limited_calls where the multiplier is used) and ensure you guard against divide-by-zero by coalescing the denominator (e.g., add a small zero-default or wrap with or vector(0)) so totals reflect actual per-model distribution rather than an existence check.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`:
- Around line 275-283: The current inner-join multiplier uses 0 * max by (user,
subscription, limitador_namespace)
(max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1 which drops
(user, subscription, limitador_namespace) tuples that have only limited_calls
and no authorized_hits; update that expression so the tuple selector is derived
from the union of authorized_hits and limited_calls series (e.g. replace the
max_over_time(authorized_hits{model=~"$model"}[$__range]) call with an OR/union
of max_over_time(authorized_hits{model=~"$model"}[$__range]) or
max_over_time(limited_calls{user!="", user=~"$user",
subscription=~"$subscription"}[$__range]) or use
max(max_over_time(...authorized_hits...), max_over_time(...limited_calls...)) so
rate-limited-only users remain in the join and their errors/success-rate
denominator are preserved).
---
Outside diff comments:
In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`:
- Around line 228-229: The tokenConsumptionByUser table intentionally keeps
group_left(model) (see the expression using "group_left(model)" and the metric
authorized_hits) which fans the user's total onto each model row, causing
per-model rows to replicate a user's total so summing Requests/Rate-Limited
across models won't match the overall Total Requests/Total Errors panels; update
the tokenConsumptionByUser panel description (or add a short note in its panel
metadata) to explicitly state this behavior and warn users that per-model row
sums will not equal the top-level totals.
- Around line 84-146: The panels (e.g., Success Rate and the user count query)
currently use the presence-check multiplier "(0 *
max_over_time(authorized_hits{model=~\"$model\"}[$__range]) + 1)" which causes
full call counts from authorized_calls/limited_calls to be included when any hit
exists for the model; replace each occurrence of that presence-check (the exact
token appears in the Prometheus queries) with a proportional attribution
expression that multiplies call totals by the model share, e.g. replace with
"sum
by(user,subscription,limitador_namespace)(authorized_hits{model=~\"$model\"}[$__range])
/ sum by(user,subscription,limitador_namespace)(authorized_hits[$__range])"
(apply the same pattern for limited_calls where the multiplier is used) and
ensure you guard against divide-by-zero by coalescing the denominator (e.g., add
a small zero-default or wrap with or vector(0)) so totals reflect actual
per-model distribution rather than an existence check.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro Plus
Run ID: d7fee6df-f04b-44bb-9b93-43af4a68e8ec
📒 Files selected for processing (1)
deployment/components/observability/observability/dashboards/usage-dashboard.yaml
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
deployment/components/observability/observability/dashboards/usage-dashboard.yaml (1)
132-172:⚠️ Potential issue | 🟠 MajorPrevent limited-only traffic from defaulting to 100% success.
When the authorized side is empty, the denominator addition becomes empty and the final
or vector(1)reports success as1. Wrap the authorized sums withor vector(0)before the division/addition.Proposed fix
- sum( - sum by (user, subscription, limitador_namespace) ( - increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]) - ) - * on(user, subscription, limitador_namespace) - (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1) - ) + ( + sum( + sum by (user, subscription, limitador_namespace) ( + increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]) + ) + * on(user, subscription, limitador_namespace) + (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1) + ) + or vector(0) + ) @@ - sum( - sum by (user, subscription, limitador_namespace) ( - increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]) - ) - * on(user, subscription, limitador_namespace) - (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1) - ) + ( + sum( + sum by (user, subscription, limitador_namespace) ( + increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]) + ) + * on(user, subscription, limitador_namespace) + (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1) + ) + or vector(0) + )Verify with a selected model/range where
limited_calls > 0andauthorized_callsis absent; expected success rate is0, not1.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deployment/components/observability/observability/dashboards/usage-dashboard.yaml` around lines 132 - 172, The current PromQL can return 1 when authorized_calls are absent because the authorized sums evaluate to empty; to fix this, wrap the authorized-side sum expressions (the sum by (...) (increase(authorized_calls{...}[$__range])) blocks and any compound sum that uses authorized_hits) with "or vector(0)" so they yield 0 instead of empty before the division/addition; update the expressions referencing authorized_calls and the authorized_hits-derived max_over_time aggregates (used with on(user, subscription, limitador_namespace)) to apply "or vector(0)" and ensure the denominator calculation includes those zero values so a case with only limited_calls returns 0 success rather than defaulting to 1.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`:
- Around line 87-95: The expression that adds summed authorized_calls and
limited_calls loses labelsets because PromQL + does an inner-join; update the
two summed subexpressions (the sum by (user, subscription, limitador_namespace)
over increase(authorized_calls...) and the sum by (...) over
increase(limited_calls...)) to zero-fill missing series by appending "or
vector(0)" to each sum so unmatched users are preserved before the addition;
apply the same zero-fill pattern to the similar tokenConsumption aggregation
noted in the review and keep totalRequests as-is since it already handles or
vector(0).
---
Duplicate comments:
In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`:
- Around line 132-172: The current PromQL can return 1 when authorized_calls are
absent because the authorized sums evaluate to empty; to fix this, wrap the
authorized-side sum expressions (the sum by (...)
(increase(authorized_calls{...}[$__range])) blocks and any compound sum that
uses authorized_hits) with "or vector(0)" so they yield 0 instead of empty
before the division/addition; update the expressions referencing
authorized_calls and the authorized_hits-derived max_over_time aggregates (used
with on(user, subscription, limitador_namespace)) to apply "or vector(0)" and
ensure the denominator calculation includes those zero values so a case with
only limited_calls returns 0 success rather than defaulting to 1.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro Plus
Run ID: 3f955c0b-9d46-4fa0-a7ad-88da58115c44
📒 Files selected for processing (1)
deployment/components/observability/observability/dashboards/usage-dashboard.yaml
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
deployment/components/observability/observability/dashboards/usage-dashboard.yaml (2)
141-182:⚠️ Potential issue | 🟠 MajorZero-fill the authorized side before the success-rate division.
For rate-limited-only traffic,
sum(authorized_calls...)is empty; PromQL vector arithmetic then drops the limited-only denominator, and the trailingor vector(1)reports 100% instead of 0%.Proposed fix
query: |- ( ( - sum( - sum by (user, subscription, limitador_namespace) ( - increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]) + ( + sum( + sum by (user, subscription, limitador_namespace) ( + increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]) + ) + * on(user, subscription, limitador_namespace) + (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1) ) - * on(user, subscription, limitador_namespace) - (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1) + or vector(0) ) ) / ( ( - sum( - sum by (user, subscription, limitador_namespace) ( - increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]) + ( + sum( + sum by (user, subscription, limitador_namespace) ( + increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]) + ) + * on(user, subscription, limitador_namespace) + (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1) ) - * on(user, subscription, limitador_namespace) - (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1) + or vector(0) ) + (Verify with a selected model/user that has
limited_calls > 0andauthorized_calls == 0; expected success rate is0, not1.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deployment/components/observability/observability/dashboards/usage-dashboard.yaml` around lines 141 - 182, The success-rate query returns 1 for limited-only traffic because the left-hand sum of authorized_calls is absent and PromQL drops that vector during division; update the query to zero-fill the authorized side by adding an explicit fallback (e.g., apply "or vector(0)" to the aggregate of sum by (user, subscription, limitador_namespace) (increase(authorized_calls...[$__range])) or otherwise wrap that authorized_calls sum with an "or vector(0)" before the multiplication and division so users/models with limited_calls>0 but authorized_calls==0 yield a 0% success rate instead of 100%; adjust the corresponding expressions that reference authorized_calls (and their multiplied form) to use this zero-filled result.
257-303:⚠️ Potential issue | 🟠 MajorFix the table joins before multi-model tuples break or misattribute rows.
The LHS is one series per
(user, subscription, limitador_namespace), while the RHS can be multiple series per same key due tomodel. PromQLgroup_leftrequires the left vector to have higher cardinality, so multi-model RHS matches can fail; switching togroup_rightwould still copy the same model-less call count to every model. Use a request/error metric that carriesmodel, or aggregate these columns withoutmodel. See Prometheus vector matching docs: https://prometheus.io/docs/prometheus/latest/querying/operators/#many-to-one-and-one-to-many-vector-matchesRead-only verification:
#!/bin/bash # Confirm whether the call counters actually carry a model label in instrumentation/usages. # Expected safe result: authorized_calls and limited_calls are emitted with a model label, # or this table should not present them as per-model values. rg -n -C4 '\b(authorized_calls|limited_calls|authorized_hits)\b'Runtime PromQL check:
count by (user, subscription, limitador_namespace) ( max by (user, subscription, limitador_namespace, model) ( max_over_time(authorized_hits{model=~"$model"}[$__range]) ) ) > 1Any returned series means this table join is unsafe for that selected model set.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deployment/components/observability/observability/dashboards/usage-dashboard.yaml` around lines 257 - 303, The joins are unsafe because authorized_calls/limited_calls are grouped by (user, subscription, limitador_namespace) while the RHS uses model, causing many-to-one mismatches with group_left/group_right; fix by either (A) using a request/error metric that includes the model label (replace usages of authorized_calls/limited_calls/authorized_hits in the expressions with the model-bearing metric) so vector matching aligns on model, or (B) remove model from the RHS aggregation (aggregate to sum by (user, subscription, limitador_namespace) or sum by (user, subscription, model) consistently) and then use a safe one-to-one match (keep group_left only when left has higher cardinality) so authorized_calls, limited_calls, authorized_hits and the increase(...) expressions share the same label set for matching.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`:
- Around line 108-112: The queries are incorrectly filtering by model because
they rely on tuple existence instead of joining the model label from
authorized_hits; update the affected panels to join the model label using a
group_left(model) pattern so that authorized_calls/limited_calls are multiplied
only for the selected model. Specifically, for metrics authorized_calls and
limited_calls, replace the current tuple-existence expression with a join like:
sum by (user,subscription,limitador_namespace)(increase(authorized_calls{...}))
* on(user,subscription,limitador_namespace) group_left(model) (0 * max by
(user,subscription,limitador_namespace,model)(max_over_time(authorized_hits{model=~"$model"}[$__range]))
+ 1) (same for limited_calls using limited_calls and limited_calls metric
names), and apply this change to the panels referenced (the blocks around the
shown diff and the other panels noted) so the model filter is enforced via
group_left(model).
---
Duplicate comments:
In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`:
- Around line 141-182: The success-rate query returns 1 for limited-only traffic
because the left-hand sum of authorized_calls is absent and PromQL drops that
vector during division; update the query to zero-fill the authorized side by
adding an explicit fallback (e.g., apply "or vector(0)" to the aggregate of sum
by (user, subscription, limitador_namespace)
(increase(authorized_calls...[$__range])) or otherwise wrap that
authorized_calls sum with an "or vector(0)" before the multiplication and
division so users/models with limited_calls>0 but authorized_calls==0 yield a 0%
success rate instead of 100%; adjust the corresponding expressions that
reference authorized_calls (and their multiplied form) to use this zero-filled
result.
- Around line 257-303: The joins are unsafe because
authorized_calls/limited_calls are grouped by (user, subscription,
limitador_namespace) while the RHS uses model, causing many-to-one mismatches
with group_left/group_right; fix by either (A) using a request/error metric that
includes the model label (replace usages of
authorized_calls/limited_calls/authorized_hits in the expressions with the
model-bearing metric) so vector matching aligns on model, or (B) remove model
from the RHS aggregation (aggregate to sum by (user, subscription,
limitador_namespace) or sum by (user, subscription, model) consistently) and
then use a safe one-to-one match (keep group_left only when left has higher
cardinality) so authorized_calls, limited_calls, authorized_hits and the
increase(...) expressions share the same label set for matching.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro Plus
Run ID: b9c61553-c149-4a9b-9576-7acc3e8d775d
📒 Files selected for processing (1)
deployment/components/observability/observability/dashboards/usage-dashboard.yaml
3513ada to
6473edd
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (3)
deployment/components/observability/observability/dashboards/usage-dashboard.yaml (3)
108-142:⚠️ Potential issue | 🟠 MajorZero-fill the authorized side before computing Success Rate.
When
authorized_callshas no matching series butlimited_callsdoes, the unfilled authorizedsum(...)makes both the numerator and denominator empty, so the query falls through toor vector(1)and reports 100% instead of 0%.Proposed fix
query: |- ( ( - sum( - sum by (user, subscription, limitador_namespace) ( - increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]) + ( + sum( + sum by (user, subscription, limitador_namespace) ( + increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]) + ) + * on(user, subscription, limitador_namespace) + (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1) ) - * on(user, subscription, limitador_namespace) - (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1) + or vector(0) ) ) / ( ( - sum( - sum by (user, subscription, limitador_namespace) ( - increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]) + ( + sum( + sum by (user, subscription, limitador_namespace) ( + increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range]) + ) + * on(user, subscription, limitador_namespace) + (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1) ) - * on(user, subscription, limitador_namespace) - (0 * max by (user, subscription, limitador_namespace) (max_over_time(authorized_hits{model=~"$model"}[$__range])) + 1) + or vector(0) ) + (🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deployment/components/observability/observability/dashboards/usage-dashboard.yaml` around lines 108 - 142, The authorized side of the Success Rate expression can be empty when only limited_calls exist, causing both numerator and denominator to be missing and the whole expression to fall through to "or vector(1)"; modify the authorized sum/increase expression(s) that reference authorized_calls (and the grouped sum by user, subscription, limitador_namespace) to zero-fill when empty (e.g., append an "or vector(0)" analogous to the limited_calls block) so the numerator and denominator evaluate to 0 instead of empty; ensure the zero-fill is applied consistently to the sum(increase(authorized_calls{...}[$__range]) ...) terms and preserves the existing on(...) and max_over_time(authorized_hits{model=~"$model"}) logic.
217-225:⚠️ Potential issue | 🟠 MajorPreserve limited-only request rows in the table aggregation.
(authorized + limited) or authorizedrecovers authorized-only rows, but still drops rows that only havelimited_calls. Add the limited side as a fallback, or zero-fill both operands before adding.Proposed fix
( ( sum by (user, subscription, limitador_namespace) (increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])) + sum by (user, subscription, limitador_namespace) (increase(limited_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])) ) or sum by (user, subscription, limitador_namespace) (increase(authorized_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])) + or + sum by (user, subscription, limitador_namespace) (increase(limited_calls{user!="", user=~"$user", subscription=~"$subscription"}[$__range])) )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deployment/components/observability/observability/dashboards/usage-dashboard.yaml` around lines 217 - 225, The current PromQL expression "(authorized_calls + limited_calls) or authorized_calls" still drops rows that only have limited_calls; update the expression to preserve limited-only rows by using "( (sum by (user, subscription, limitador_namespace) (increase(authorized_calls{user!=\"\", user=~\"$user\", subscription=~\"$subscription\"}[$__range])) + sum by (user, subscription, limitador_namespace) (increase(limited_calls{user!=\"\", user=~\"$user\", subscription=~\"$subscription\"}[$__range])) ) or sum by (user, subscription, limitador_namespace) (increase(authorized_calls{user!=\"\", user=~\"$user\", subscription=~\"$subscription\"}[$__range])) or sum by (user, subscription, limitador_namespace) (increase(limited_calls{user!=\"\", user=~\"$user\", subscription=~\"$subscription\"}[$__range]))" so that authorized-only, limited-only, and combined rows are all retained; alternatively zero-fill both operands with or-zero (e.g., using or vector(0)) before adding to avoid dropped series when summing. Ensure you update the aggregation lines that reference increase(authorized_calls...) and increase(limited_calls...) grouped by user, subscription, limitador_namespace.
278-279:⚠️ Potential issue | 🟠 MajorThese totals are still tuple-scoped, not model-scoped.
The
authorized_hits{model=~"$model"}multiplier only proves that a(user, subscription, limitador_namespace)tuple has selected-model hits; it does not partitionauthorized_callsorlimited_callsby model. Mixed-model tuples will include non-selected model requests/errors, while rate-limited-only model traffic can still be dropped. Use model-labeled request/error counters directly, or add model-scoped recording metrics before presenting these as per-model totals.#!/usr/bin/env bash set -euo pipefail # Read-only runtime check. Set PROM_URL to the Prometheus base URL, e.g. https://prometheus.example.com : "${PROM_URL:?Set PROM_URL to the Prometheus base URL}" start="$(date -u -d '7 days ago' +%s)" end="$(date -u +%s)" echo "Check whether call counters carry a model label:" for metric in authorized_calls limited_calls authorized_hits; do curl -fsG "$PROM_URL/api/v1/series" \ --data-urlencode "match[]=${metric}{user!=\"\",subscription!=\"\"}" \ --data-urlencode "start=${start}" \ --data-urlencode "end=${end}" \ | jq --arg metric "$metric" '{metric: $metric, series: (.data | length), model_label_values: ([.data[]?.model] | map(select(. != null)) | unique)}' done echo "Check for tuples that have more than one model in the selected time window:" curl -fsG "$PROM_URL/api/v1/query" \ --data-urlencode 'query=count by (user, subscription, limitador_namespace) (max by (user, subscription, limitador_namespace, model) (max_over_time(authorized_hits{model!=""}[7d]))) > 1' \ | jq '.data.result'Expected: either
authorized_calls/limited_callsexposemodeland the dashboard should filter those metrics directly, or the mixed-model tuple query returns no series. If neither is true, these totals cannot be accurately model-scoped with the current query.Also applies to: 312-324
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deployment/components/observability/observability/dashboards/usage-dashboard.yaml` around lines 278 - 279, The dashboard's PromQL currently multiplies tuple-scoped authorized_hits by totals, so per-model totals remain incorrect for tuples that carry multiple model values; update queries to use model-labeled counters (e.g. authorized_calls{model=~"$model"}, limited_calls{model=~"$model"}, authorized_hits{model=~"$model"}) directly or add recording rules that pre-aggregate model-scoped sums (e.g. sum by (model, user, subscription, limitador_namespace) (...) ) and switch the dashboard to those recording metric names; locate and replace uses of the current expression that references authorized_hits and the tuple-multiplier (the expression with max_over_time(authorized_hits{model=~"$model"}[$__range]) and any similar blocks referencing authorized_calls/limited_calls) to ensure the queries are truly model-scoped.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In
`@deployment/components/observability/observability/dashboards/usage-dashboard.yaml`:
- Around line 108-142: The authorized side of the Success Rate expression can be
empty when only limited_calls exist, causing both numerator and denominator to
be missing and the whole expression to fall through to "or vector(1)"; modify
the authorized sum/increase expression(s) that reference authorized_calls (and
the grouped sum by user, subscription, limitador_namespace) to zero-fill when
empty (e.g., append an "or vector(0)" analogous to the limited_calls block) so
the numerator and denominator evaluate to 0 instead of empty; ensure the
zero-fill is applied consistently to the
sum(increase(authorized_calls{...}[$__range]) ...) terms and preserves the
existing on(...) and max_over_time(authorized_hits{model=~"$model"}) logic.
- Around line 217-225: The current PromQL expression "(authorized_calls +
limited_calls) or authorized_calls" still drops rows that only have
limited_calls; update the expression to preserve limited-only rows by using "(
(sum by (user, subscription, limitador_namespace)
(increase(authorized_calls{user!=\"\", user=~\"$user\",
subscription=~\"$subscription\"}[$__range])) + sum by (user, subscription,
limitador_namespace) (increase(limited_calls{user!=\"\", user=~\"$user\",
subscription=~\"$subscription\"}[$__range])) ) or sum by (user, subscription,
limitador_namespace) (increase(authorized_calls{user!=\"\", user=~\"$user\",
subscription=~\"$subscription\"}[$__range])) or sum by (user, subscription,
limitador_namespace) (increase(limited_calls{user!=\"\", user=~\"$user\",
subscription=~\"$subscription\"}[$__range]))" so that authorized-only,
limited-only, and combined rows are all retained; alternatively zero-fill both
operands with or-zero (e.g., using or vector(0)) before adding to avoid dropped
series when summing. Ensure you update the aggregation lines that reference
increase(authorized_calls...) and increase(limited_calls...) grouped by user,
subscription, limitador_namespace.
- Around line 278-279: The dashboard's PromQL currently multiplies tuple-scoped
authorized_hits by totals, so per-model totals remain incorrect for tuples that
carry multiple model values; update queries to use model-labeled counters (e.g.
authorized_calls{model=~"$model"}, limited_calls{model=~"$model"},
authorized_hits{model=~"$model"}) directly or add recording rules that
pre-aggregate model-scoped sums (e.g. sum by (model, user, subscription,
limitador_namespace) (...) ) and switch the dashboard to those recording metric
names; locate and replace uses of the current expression that references
authorized_hits and the tuple-multiplier (the expression with
max_over_time(authorized_hits{model=~"$model"}[$__range]) and any similar blocks
referencing authorized_calls/limited_calls) to ensure the queries are truly
model-scoped.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro Plus
Run ID: 0ebab8fe-ec58-4156-8f8d-44ec51a23149
📒 Files selected for processing (1)
deployment/components/observability/observability/dashboards/usage-dashboard.yaml
6473edd to
d6c3728
Compare
The top-levels values of Total Requests, Total Errors, Success Rate, and Active Users, are now calculated also based on the selected model(s). Signed-off-by: Arik Hadas <ahadas@redhat.com>
d6c3728 to
c3d731f
Compare
|
@ahadas: The following test has Succeeded: OCI Artifact Browser URLInspecting Test Artifacts ManuallyTo inspect your test artifacts manually, follow these steps:
mkdir -p oras-artifacts
cd oras-artifacts
oras pull quay.io/opendatahub/odh-ci-artifacts:maas-group-test-5h5ff |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ahadas, jland-redhat The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
147eaa2
into
opendatahub-io:main
Automated promotion of **9 commit(s)** from `main` to `stable`. ``` fb2ea25 feat: add tenant CRD to e2e artifact collection and debug report (#787) 1b8f212 chore: restrict rbac for db secret (#779) e746008 docs: add/update documentation for Maas Tenant (#773) 147eaa2 fix: per-model(s) top-level values in usage dashboard (#772) b327b34 feat: add OIDC token support for model discovery via /v1/models (#703) dbf6d03 fix: validate token rate limits and skip invalid subs in TRLP aggregation (#752) fae753e chore: add .worktrees/ to .gitignore (#774) c01dc5b fix: minor updates for external model (#771) 65ca551 fix: add explicit command to v0.8.2 simulator models to prevent bash … (#765) ```
Automated promotion of **11 commit(s)** from `stable` to `rhoai`. ``` fb2ea25 feat: add tenant CRD to e2e artifact collection and debug report (#787) 1b8f212 chore: restrict rbac for db secret (#779) e746008 docs: add/update documentation for Maas Tenant (#773) 147eaa2 fix: per-model(s) top-level values in usage dashboard (#772) b327b34 feat: add OIDC token support for model discovery via /v1/models (#703) dbf6d03 fix: validate token rate limits and skip invalid subs in TRLP aggregation (#752) 89fba29 chore: promote main to stable (#770) fae753e chore: add .worktrees/ to .gitignore (#774) c01dc5b fix: minor updates for external model (#771) 65ca551 fix: add explicit command to v0.8.2 simulator models to prevent bash … (#765) ```
Description
The top-levels values of Total Requests, Total Errors, Success Rate, and Active Users, are now calculated also based on the selected model(s).
How Has This Been Tested?
Tested manually
Merge criteria:
Summary by CodeRabbit