What type of bug is this?
Unexpected error
What subsystems are affected?
Frontend
Minimal reproduce step
Description
Binary comparison operators (>=, >, <, etc.) and arithmetic operators (+, -) applied to histogram_quantile()
results fail with a DataFusion internal error. The histogram_quantile() function itself evaluates correctly, but any
binary operation on its output triggers a field resolution failure.
Environment
- GreptimeDB version: v1.0.0-beta.4
- Query interface: Prometheus-compatible API (/v1/prometheus/api/v1/query)
- Client: vmalert v1.102.0
Steps to Reproduce
Execute the following PromQL queries in order:
# Step 1 - Works ✅
rate(inference_time_per_output_token_seconds_bucket[1m])
# Step 2 - Works ✅
sum by (le, infr_svc_uid) (rate(inference_time_per_output_token_seconds_bucket[1m]))
# Step 3 - Works ✅
histogram_quantile(0.5, sum by (le, infr_svc_uid) (rate(inference_time_per_output_token_seconds_bucket[1m])))
# Step 4 - Fails ❌
histogram_quantile(0.5, sum by (le, infr_svc_uid) (rate(inference_time_per_output_token_seconds_bucket[1m]))) >= 0.02
# Step 5 - Also fails ❌
histogram_quantile(0.5, sum by (le, infr_svc_uid) (rate(inference_time_per_output_token_seconds_bucket[1m]))) + 0
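To make the steps easier to replay, here is a minimal sketch that builds the HTTP requests for steps 3 to 5 against the endpoint named above. The base URL (`http://localhost:4000`) and the helper names (`build_request`, `STEPS`) are assumptions for illustration, not part of the original report; adjust the address to your deployment.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the GreptimeDB HTTP endpoint is reachable at this address.
BASE_URL = "http://localhost:4000"
# Path taken from the Environment section of this report.
QUERY_PATH = "/v1/prometheus/api/v1/query"

INNER = (
    "histogram_quantile(0.5, sum by (le, infr_svc_uid) "
    "(rate(inference_time_per_output_token_seconds_bucket[1m])))"
)

# Step 3 works; steps 4 and 5 are the failing variants from the report.
STEPS = {
    "step3": INNER,
    "step4": INNER + " >= 0.02",
    "step5": INNER + " + 0",
}


def build_request(promql, base_url=BASE_URL):
    """Build (but do not send) a form-encoded POST carrying the PromQL query."""
    body = urlencode({"query": promql}).encode("ascii")
    return Request(
        base_url + QUERY_PATH,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

Passing `build_request(STEPS["step4"])` to `urllib.request.urlopen` sends the query. Depending on the HTTP status the server returns for a failed plan, `urlopen` may raise `urllib.error.HTTPError`, whose body then carries the JSON error.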
Error Messages
Steps 4 and 5 (direct comparison/arithmetic):
{
"status": "error",
"error": "Internal error during building DataFusion plan: No field named \"sum(prom_rate(time_range,value,time,Int64(60000)))\".",
"errorType": "PlanQuery"
}
When wrapped in a subquery (via vmalert):
count_over_time((
histogram_quantile(0.5, sum by (le, infr_svc_uid) (rate(inference_time_per_output_token_seconds_bucket[1m]))) >= 0.02
)[10m:1m])
{
"status": "error",
"error": "Internal error during building DataFusion plan: No field named inference_time_per_output_token_seconds_bucket.infr_svc_uid. Did you mean 'infr_svc_uid'?.",
"errorType": "PlanQuery"
}
Expected Behavior
histogram_quantile(...) >= threshold should return a filtered time series (only the samples that satisfy the comparison), consistent with standard PromQL behavior
(as in Prometheus/VictoriaMetrics).
Actual Behavior
DataFusion fails to resolve the output field name of histogram_quantile() when used as input to a binary operator. The
internal field reference appears to retain the raw function call signature or the source table name as a prefix, rather than
resolving to the aggregated output.
Workaround
Use cumulative bucket ratios to approximate percentile threshold checks:
# Approximates histogram_quantile(0.5, ...) >= 0.02 (requires an le="0.02" bucket)
sum by (infr_svc_uid) (rate(metric_bucket{le="0.02"}[1m]))
/ sum by (infr_svc_uid) (rate(metric_bucket{le="+Inf"}[1m])) < 0.5
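The following sketch illustrates why the ratio check tracks the quantile comparison. It uses a simplified version of Prometheus-style linear interpolation over cumulative buckets. The function names and the sample bucket counts are invented for illustration; this is not GreptimeDB's implementation.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate quantile q (0 < q < 1) from cumulative (upper_bound, count) pairs.

    Simplified sketch: buckets must be sorted by upper bound, end with +Inf,
    and contain at least one finite bucket.
    """
    if len(buckets) < 2 or not math.isinf(buckets[-1][0]):
        raise ValueError("need at least one finite bucket followed by +Inf")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Quantile falls in the +Inf bucket: report the highest finite bound.
                return buckets[-2][0]
            # Linear interpolation inside the bucket that contains the rank.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return math.nan


def fraction_at_or_below(threshold, buckets):
    """Cumulative share of observations in the bucket whose bound equals threshold."""
    total = buckets[-1][1]
    for bound, count in buckets:
        if bound == threshold:
            return count / total
    raise ValueError("threshold must be an existing bucket boundary")


# Hypothetical cumulative bucket counts for one series.
SLOW = [(0.01, 10), (0.02, 30), (0.05, 80), (math.inf, 100)]
FAST = [(0.01, 40), (0.02, 70), (0.05, 90), (math.inf, 100)]

for name, buckets in (("slow", SLOW), ("fast", FAST)):
    direct = histogram_quantile(0.5, buckets) >= 0.02
    workaround = fraction_at_or_below(0.02, buckets) < 0.5
    print(name, direct, workaround)
```

The two checks agree whenever the share of observations at or below the threshold is not exactly 0.5. At exactly 0.5 the interpolated quantile equals the threshold, so `>= 0.02` holds while the strict `< 0.5` ratio check does not. The workaround also only applies when the threshold is an actual bucket boundary.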
Use Case
We are using vmalert to evaluate SLA alerting rules against GreptimeDB. The rules need to check if latency percentiles (p50,
p75, p90, p99) exceed defined thresholds, which requires binary comparisons on hist
What did you expect to see?
histogram_quantile(0.5, sum by (le, infr_svc_uid) (rate(inference_time_per_output_token_seconds_bucket[1m]))) >= 0.02
should return a filtered time series (only samples where p50 >= 0.02s), consistent with standard PromQL behavior in
Prometheus and VictoriaMetrics.
What did you see instead?
DataFusion internal error when applying any binary operator (>=, >, +, -) to histogram_quantile() output:
- Direct comparison returns:
{"status":"error","error":"Internal error during building DataFusion plan: No field named \"sum(prom_rate(time_range,value,time,Int64(60000)))\".","errorType":"PlanQuery"}
- When wrapped in a subquery via vmalert:
{"status":"error","error":"Internal error during building DataFusion plan: No field named inference_time_per_output_token_seconds_bucket.infr_svc_uid. Did you mean 'infr_svc_uid'?.","errorType":"PlanQuery"}
Note: histogram_quantile(...) alone evaluates correctly. The error only occurs when its result is used as input to a binary
operator.
What operating system did you use?
GreptimeDB v1.0.0-beta.4 deployed on Kubernetes (Linux amd64)
What version of GreptimeDB did you use?
v1.0.0-beta.4
Relevant log output and stack trace
vmalert log showing the full query and error response from GreptimeDB:
2026-05-20T10:09:59.698Z error VictoriaMetrics/app/vmalert/rule/group.go:364
group "inference-sla-critical": rule "InferenceSloViolationCritical": failed to execute:
failed to execute query "(
(count_over_time((
sum by (infr_svc_uid) (rate(lm_gateway_request_errors_total{error_code=~\"5..|429\"}[1m]))
/ sum by (infr_svc_uid) (rate(lm_gateway_requests_total[1m])) >= 0.01
)[10m:1m]) > bool 0)
+
(count_over_time((
histogram_quantile(0.5, sum by (le, infr_svc_uid) (rate(inference_time_per_output_token_seconds_bucket[1m]))) >= 0.02
)[10m:1m]) > bool 0)
) >= 2
"
Response body:
{"status":"error","error":"Internal error during building DataFusion plan: No field named inference_time_per_output_token_seconds_bucket.infr_svc_uid. Did you mean 'infr_svc_uid'?.","errorType":"PlanQuery"}
# Simplified reproduction via GreptimeDB Prometheus API (no subquery):
# POST /v1/prometheus/api/v1/query
# query=histogram_quantile(0.5, sum by (le, infr_svc_uid) (rate(inference_time_per_output_token_seconds_bucket[1m]))) >= 0.02
#
# Response:
# {"status":"error","error":"Internal error during building DataFusion plan: No field named \"sum(prom_rate(time_range,value,time,Int64(60000)))\".","errorType":"PlanQuery"}