[pull] trunk from spiceai:trunk by pull[bot] · Pull Request #38 · TheRakeshPurohit/spiceai

pull · 2025-04-21T03:06:11Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.1)

Can you help keep this open source service alive? 💖 Please sponsor : )

* Test spice chat in e2e_test_spice_cli * remove temporary changes

…piceai#10831) * feat(task-history): capture distributed query observability For distributed Ballista queries, the task_history table now records the full parent + per-stage tree: sql_query (parent) duration, error, datasets, summary labels └── ballista_stage (child) stage_id, executors, task_count, stage timing, plan tree Previously the task_history span was created in submit_distributed_internal but never instrumented across the job's lifetime — it dropped at submission time with duration ≈ 0 and no error_message. Stage/executor detail was not recorded at all. Implementation: - QueryHandle owns the sql_query span behind Arc<Mutex<Option<Span>>>. spawn_finalize *takes* it (rather than cloning), so the OTel span closes when the spawned finalize future drops its instrumented clone — capturing the query's runtime, not arbitrary post-completion handle lifetime. - New stage_history module walks the in-process ExecutionGraph at job completion and emits one child ballista_stage span per stage via the existing OTel pipeline. Stage labels include executor_histogram, slowest_task_ms, partitions, attempt_num, total_executor_ms. Stage input is the plan rendered with ExplainFormat::Tree. - QueryHandle::cancel() now finalizes the tracker with JobCancelled before returning, so callers that cancel() then drop the handle record an accurate cancellation row instead of the Drop guard's client-disconnected fallback. - Drop guard finalizes orphaned handles: cancels the cancel_token sync, then spawns a finalize task that also calls scheduler.cancel_job so executors don't keep running an unobserved query. - Errors raised before QueryHandle creation (planning/validation/submission) emit tracing::error on the parent span so the row's error_message is populated (mirrors the sync run_internal path). Tests use a process-wide AsyncMutex (TEST_LOCK) to serialize against shared DF_SLOT, condition-driven wait_for_row_count polling instead of fixed sleeps, and parent.job_id-scoped assertions so other tests' task_history rows in the same binary can't pollute our row counts. Requires the matching Ballista fork PR (#38, merged onto spiceai-52.5 as 8afc1b74a605) which persists executor_id on terminal TaskInfo and bumps get_job_execution_graph to pub. * Bump ballista to pick up stuck-query detection (spiceai/datafusion-ballista#39) Updates the ballista pin from 8afc1b74 to 7e9872a5 (the merge commit of spiceai/datafusion-ballista#39) so the next cluster bench run picks up the new operator-facing warning when a distributed query stops making progress. Surgical lockfile update: only the three ballista source URLs change. Vortex stays pinned at the spiceai-52 commit (c536c9ae) the workspace was already using; without this, running cargo update -p ballista-* re-resolves the branch-tracked vortex transitive and bumps it to a newer rev whose arrow-rs version conflicts with the pinned datafusion-table-providers (CI symptom: no method as_list on dyn Array, Arc<Schema>: From<Schema> not satisfied, in adbc compilation). Verified: cargo check -p runtime --offline and cargo check -p spiced --features adbc --offline both clean. * feat(task-history): pluggable span middleware + Ballista stage timeline rendering Introduces two trait-based extension points on the task_history OTel exporter and removes the hardcoded `ballista_stage` branches that lived in the exporter: - `SpanTransform`: mutates a `SpanData` before conversion to a row. Implementors can adjust timestamps, inject attributes, redact fields, etc. - `SpanRetention`: declares a retention dependency between sibling spans in a batch. Returning `Some(parent_span_id)` from `parent_dependency(span)` means "keep this span iff that span is kept by base rules" — expressing the parent/child relationship directly instead of hardcoding it. `TaskHistoryExporter` now stores `Vec<Arc<dyn SpanTransform>>` and `Vec<Arc<dyn SpanRetention>>` with `.with_transform` / `.with_retention` builder methods. The exporter no longer mentions any specific span type — base retention checks PLAN_CAPTURE_LABEL and `min_sql_duration_ms`, retention rules then override per-span. Concrete consumer: `BallistaStageMiddleware` in `crates/runtime/src/datafusion/query/stage_history.rs` implements both traits: - As a `SpanTransform`, it rewrites `start_time` / `end_time` on `ballista_stage` spans using the `stage_started_at` / `stage_ended_at` attributes (millis since UNIX epoch), so the task_history row reflects the actual stage execution window. This is what makes per-stage execution visible on the timeline view. - As a `SpanRetention`, it declares that a `ballista_stage` row depends on its parent `sql_query` row, replacing the prior "if task.as_ref() == ballista_stage" branch in the exporter. `BallistaStageMiddleware::pair()` returns a single instance reused as both trait objects so callers register one Arc in two slots; the three exporter construction sites (production tracing, generic test util, distributed_task_history test) all use it. Includes unit tests covering both hooks: time override on the matching span name, no-op on non-matching names and missing/inverted/zero attributes; retention reports the parent dependency for stage spans and abstains for non-stage spans and orphan stages. * feat(observability): propagate request trace context across distributed query boundaries (spiceai#10896) Issue spiceai#10202. Distributed query executors no longer create a fresh internal request context; they inherit the originating request's protocol and W3C trace ids so executor-side metrics, telemetry dimensions, and any future task_history rows correlate end-to-end with the scheduler. * New `SpiceRequestContextConfig` (`spice_ctx` prefix) carries `protocol`, `trace_id`, and `span_id` through DataFusion's `ConfigExtension` mechanism — round-tripped opaquely by Ballista as `TaskDefinition` config props, with zero fork changes. * Registered as a default option extension at all 3 config_producer / session_builder sites in `runtime/src/cluster/mod.rs`. The scheduler session_builder now reads any `SpiceRequestContextConfig` set on the per-job session config and re-injects it on the built session config (it previously ignored its `_cfg` argument). * `Query::submit_distributed_internal` populates the extension from the current `RequestContext` before `create_or_update_session`. * `resolve_request_context(TaskContext)` shared helper establishes the canonical lookup order: typed `Arc<RequestContext>` extension → config extension → `Protocol::Internal` fallback (or `None`). * `BytesProcessedExec::execute` switched to the helper. Same panic behavior preserved when no context is found and `fallback_to_new_context` is off. * `FlightSqlExec` gains an optional `trace_parent: Option<String>` field with a `with_trace_parent` builder. `execute()` sets it as a `traceparent` gRPC metadata header (via `FlightSqlServiceClient::set_header`), falling back to formatting from the typed `RequestContext` extension on the `TaskContext` session config when not preset. `PartialAggregationFlightSqlExec` inherits and forwards on its own `execute()`. * W3C span semantics preserved: the propagated `span_id` is the sender's current span and becomes the parent of any new spans the receiver creates — task_history's existing `parent_span_id` chain extends naturally one level per executor hop. Tests: 4 config extension roundtrip + 4 resolver lookup-order + 1 `BytesProcessedExec` integration test asserting metric dimensions carry the correct protocol when only the config extension is present. * chore(deps): bump datafusion-ballista to c25e25b9 (spiceai-52.5) * fix(flightsql): propagate traceparent in pushed-down aggregate exec PartialAggregationFlightSqlExec was sending no traceparent header when its source FlightSqlExec was built by the aggregate-pushdown optimizer (which leaves trace_parent unset). Fall back to TaskContext like the non-pushed FlightSqlExec path does, so pushed-down aggregate calls preserve the same propagation behavior. Also picks up fmt drift in two trace-context files merged via spiceai#10896.

* fix: Update tpch benchmark snapshots for federated/file[parquet].yaml * fix: Update tpch benchmark snapshots for federated/mssql.yaml * fix: Update tpch benchmark snapshots for federated/glue[csv].yaml * fix: Update tpch benchmark snapshots for federated/oracle.yaml * fix: Update tpch benchmark snapshots for federated/iceberg[catalog].yaml * fix: Update tpch benchmark snapshots for federated/abfs_standard_versioned[parquet].yaml * fix: Update tpch benchmark snapshots for federated/spicecloud[catalog].yaml * fix: Update tpch benchmark snapshots for federated/iceberg[hadoop].yaml * fix: Update tpch benchmark snapshots for federated/mssql[catalog].yaml * fix: Update tpch benchmark snapshots for federated/glue[parquet].yaml * fix: Update tpch benchmark snapshots for federated/dynamodb.yaml * fix: Update tpch benchmark snapshots for federated/glue[catalog].yaml * fix: Update tpch benchmark snapshots for federated/s3[parquet].yaml * fix: Update tpch benchmark snapshots for federated/oracle[catalog].yaml * fix: Update tpch benchmark snapshots for accelerated/s3[parquet]-duckdb[file]-partitioned.yaml * fix: Update tpch benchmark snapshots for accelerated/file[parquet]-arrow.yaml * fix: Update tpch benchmark snapshots for accelerated/dynamodb-arrow.yaml * fix: Update tpch benchmark snapshots for federated/scylladb.yaml * fix: Update tpch benchmark snapshots for accelerated/indexes/file[parquet]-cayenne[file]-indexes.yaml * fix: Update tpch benchmark snapshots for accelerated/indexes/file[parquet]-arrow-indexes.yaml * fix: Update tpch benchmark snapshots for accelerated/file[parquet]-cayenne[file]turso.yaml * fix: Update tpch benchmark snapshots for accelerated/s3[parquet]-arrow-partitioned.yaml * fix: Update tpch benchmark snapshots for accelerated/dynamodb-duckdb[file].yaml * fix: Update tpch benchmark snapshots for accelerated/on_zero_results/file[parquet]-duckdb[file]-on_zero_results.yaml * fix: Update tpch benchmark snapshots for accelerated/on_zero_results/file[parquet]-duckdb[memory]-on_zero_results.yaml * fix: Update tpch benchmark snapshots for accelerated/on_zero_results/file[parquet]-cayenne[file]-on_zero_results.yaml * fix: Update tpch benchmark snapshots for accelerated/s3[parquet]-cayenne[file]-partitioned.yaml * fix: Update tpch benchmark snapshots for accelerated/s3[parquet]-turso[file].yaml * fix: Update tpch benchmark snapshots for accelerated/indexes/file[parquet]-turso[file]-indexes.yaml * fix: Update tpch benchmark snapshots for accelerated/spicecloud-arrow.yaml * fix: Update tpch benchmark snapshots for accelerated/file[parquet]-cayenne[file].yaml * feat(task-history): record Ballista stages for distributed queries (spiceai#10831) * feat(task-history): capture distributed query observability For distributed Ballista queries, the task_history table now records the full parent + per-stage tree: sql_query (parent) duration, error, datasets, summary labels └── ballista_stage (child) stage_id, executors, task_count, stage timing, plan tree Previously the task_history span was created in submit_distributed_internal but never instrumented across the job's lifetime — it dropped at submission time with duration ≈ 0 and no error_message. Stage/executor detail was not recorded at all. Implementation: - QueryHandle owns the sql_query span behind Arc<Mutex<Option<Span>>>. spawn_finalize *takes* it (rather than cloning), so the OTel span closes when the spawned finalize future drops its instrumented clone — capturing the query's runtime, not arbitrary post-completion handle lifetime. - New stage_history module walks the in-process ExecutionGraph at job completion and emits one child ballista_stage span per stage via the existing OTel pipeline. Stage labels include executor_histogram, slowest_task_ms, partitions, attempt_num, total_executor_ms. Stage input is the plan rendered with ExplainFormat::Tree. - QueryHandle::cancel() now finalizes the tracker with JobCancelled before returning, so callers that cancel() then drop the handle record an accurate cancellation row instead of the Drop guard's client-disconnected fallback. - Drop guard finalizes orphaned handles: cancels the cancel_token sync, then spawns a finalize task that also calls scheduler.cancel_job so executors don't keep running an unobserved query. - Errors raised before QueryHandle creation (planning/validation/submission) emit tracing::error on the parent span so the row's error_message is populated (mirrors the sync run_internal path). Tests use a process-wide AsyncMutex (TEST_LOCK) to serialize against shared DF_SLOT, condition-driven wait_for_row_count polling instead of fixed sleeps, and parent.job_id-scoped assertions so other tests' task_history rows in the same binary can't pollute our row counts. Requires the matching Ballista fork PR (#38, merged onto spiceai-52.5 as 8afc1b74a605) which persists executor_id on terminal TaskInfo and bumps get_job_execution_graph to pub. * Bump ballista to pick up stuck-query detection (spiceai/datafusion-ballista#39) Updates the ballista pin from 8afc1b74 to 7e9872a5 (the merge commit of spiceai/datafusion-ballista#39) so the next cluster bench run picks up the new operator-facing warning when a distributed query stops making progress. Surgical lockfile update: only the three ballista source URLs change. Vortex stays pinned at the spiceai-52 commit (c536c9ae) the workspace was already using; without this, running cargo update -p ballista-* re-resolves the branch-tracked vortex transitive and bumps it to a newer rev whose arrow-rs version conflicts with the pinned datafusion-table-providers (CI symptom: no method as_list on dyn Array, Arc<Schema>: From<Schema> not satisfied, in adbc compilation). Verified: cargo check -p runtime --offline and cargo check -p spiced --features adbc --offline both clean. * feat(task-history): pluggable span middleware + Ballista stage timeline rendering Introduces two trait-based extension points on the task_history OTel exporter and removes the hardcoded `ballista_stage` branches that lived in the exporter: - `SpanTransform`: mutates a `SpanData` before conversion to a row. Implementors can adjust timestamps, inject attributes, redact fields, etc. - `SpanRetention`: declares a retention dependency between sibling spans in a batch. Returning `Some(parent_span_id)` from `parent_dependency(span)` means "keep this span iff that span is kept by base rules" — expressing the parent/child relationship directly instead of hardcoding it. `TaskHistoryExporter` now stores `Vec<Arc<dyn SpanTransform>>` and `Vec<Arc<dyn SpanRetention>>` with `.with_transform` / `.with_retention` builder methods. The exporter no longer mentions any specific span type — base retention checks PLAN_CAPTURE_LABEL and `min_sql_duration_ms`, retention rules then override per-span. Concrete consumer: `BallistaStageMiddleware` in `crates/runtime/src/datafusion/query/stage_history.rs` implements both traits: - As a `SpanTransform`, it rewrites `start_time` / `end_time` on `ballista_stage` spans using the `stage_started_at` / `stage_ended_at` attributes (millis since UNIX epoch), so the task_history row reflects the actual stage execution window. This is what makes per-stage execution visible on the timeline view. - As a `SpanRetention`, it declares that a `ballista_stage` row depends on its parent `sql_query` row, replacing the prior "if task.as_ref() == ballista_stage" branch in the exporter. `BallistaStageMiddleware::pair()` returns a single instance reused as both trait objects so callers register one Arc in two slots; the three exporter construction sites (production tracing, generic test util, distributed_task_history test) all use it. Includes unit tests covering both hooks: time override on the matching span name, no-op on non-matching names and missing/inverted/zero attributes; retention reports the parent dependency for stage spans and abstains for non-stage spans and orphan stages. * feat(observability): propagate request trace context across distributed query boundaries (spiceai#10896) Issue spiceai#10202. Distributed query executors no longer create a fresh internal request context; they inherit the originating request's protocol and W3C trace ids so executor-side metrics, telemetry dimensions, and any future task_history rows correlate end-to-end with the scheduler. * New `SpiceRequestContextConfig` (`spice_ctx` prefix) carries `protocol`, `trace_id`, and `span_id` through DataFusion's `ConfigExtension` mechanism — round-tripped opaquely by Ballista as `TaskDefinition` config props, with zero fork changes. * Registered as a default option extension at all 3 config_producer / session_builder sites in `runtime/src/cluster/mod.rs`. The scheduler session_builder now reads any `SpiceRequestContextConfig` set on the per-job session config and re-injects it on the built session config (it previously ignored its `_cfg` argument). * `Query::submit_distributed_internal` populates the extension from the current `RequestContext` before `create_or_update_session`. * `resolve_request_context(TaskContext)` shared helper establishes the canonical lookup order: typed `Arc<RequestContext>` extension → config extension → `Protocol::Internal` fallback (or `None`). * `BytesProcessedExec::execute` switched to the helper. Same panic behavior preserved when no context is found and `fallback_to_new_context` is off. * `FlightSqlExec` gains an optional `trace_parent: Option<String>` field with a `with_trace_parent` builder. `execute()` sets it as a `traceparent` gRPC metadata header (via `FlightSqlServiceClient::set_header`), falling back to formatting from the typed `RequestContext` extension on the `TaskContext` session config when not preset. `PartialAggregationFlightSqlExec` inherits and forwards on its own `execute()`. * W3C span semantics preserved: the propagated `span_id` is the sender's current span and becomes the parent of any new spans the receiver creates — task_history's existing `parent_span_id` chain extends naturally one level per executor hop. Tests: 4 config extension roundtrip + 4 resolver lookup-order + 1 `BytesProcessedExec` integration test asserting metric dimensions carry the correct protocol when only the config extension is present. * chore(deps): bump datafusion-ballista to c25e25b9 (spiceai-52.5) * fix(flightsql): propagate traceparent in pushed-down aggregate exec PartialAggregationFlightSqlExec was sending no traceparent header when its source FlightSqlExec was built by the aggregate-pushdown optimizer (which leaves trace_parent unset). Fall back to TaskContext like the non-pushed FlightSqlExec path does, so pushed-down aggregate calls preserve the same propagation behavior. Also picks up fmt drift in two trace-context files merged via spiceai#10896. * Revert "fix(tests): stabilize flaky SQL search snapshots with score normalization (spiceai#10585)" (spiceai#10891) This reverts commit efc8d00. * Add '#[deny(clippy::missing_trait_methods)]' to wrapper/delegation trait impls (spiceai#10795) * Add '#[deny(clippy::missing_trait_methods)]' to wrapper/delegation trait impls * implement missing * fix: implement missing delegation methods caught by missing_trait_methods lint * fix: rustfmt get_logical_plan signature in indexed.rs * fix: implement missing TableProvider delegation methods in PartitionTableProvider * fix: correct ScanResult return type and IndexedMemTable::scan_with_args delegation - runtime-table-partition/src/provider.rs:490: use datafusion::error::Result instead of bare Result<T> (no type alias in scope; std::result::Result requires two generic args, causing E0107 compile error on all CI jobs) - data_components/src/arrow/indexed.rs: unpack ScanArgs and call self.scan() instead of forwarding to self.inner.scan_with_args(), preserving the index fast-path logic that IndexedMemTable::scan implements * fix: replace redundant closure with method reference (clippy pedantic) * fix: replace redundant closure with method reference in scan_with_args * fix: add missing trait method forwarding impls for #[deny(clippy::missing_trait_methods)] Adds explicit implementations for all provided/default trait methods across wrapper and stub types so the deny attribute compiles cleanly: TableProvider wrappers (delegate to inner): - AcceleratedTable: get_table_definition, get_logical_plan, get_column_default, scan_with_args, statistics, truncate - UpsertDedupTableProvider: same set - EmbeddingTable: scan_with_args, truncate - LocationPruningListingTable: get_logical_plan, get_column_default, scan_with_args, statistics, insert_into, delete_from TableProvider with no inner (None / NotImplemented stubs): - SwappableTableProvider: get_table_definition/get_logical_plan/get_column_default return None (can't borrow from current() temporary Arc); scan_with_args and truncate delegate via current() - DatasetTableProvider: all missing methods return Internal errors or None - RerankUDTFProvider: metadata returns None; DML returns NotImplemented - ReciprocalRankFusion: same pattern CatalogProvider wrapper: - RefreshingCatalogProvider: register_schema, deregister_schema forwarded to inner DataConnector wrappers: - DeferredConnector: all missing methods added; stream methods return false/None (connector is a pre-initialization placeholder); setup/metrics methods delegate to self.inner - EmbeddingConnector: resolve_refresh_mode, initialization_for_dataset - FullTextConnector: initialization_for_dataset Embed wrapper: - TaskEmbed: cache, get_cached_embed, put_cached_embed delegated to self.inner * fix: remove #[deny(clippy::missing_trait_methods)] from AcceleratedTable, DeferredConnector, DatasetTableProvider * linting * fix: implement missing TableProvider trait methods for LocationPruningListingTable and EmbeddingTable - Add and to LocationPruningListingTable, delegating to inner - Add and to EmbeddingTable, delegating to base_table - Add missing imports: std::borrow::Cow and datafusion::logical_expr::LogicalPlan * fix AcceleratedTable * comments * clippy * remove nasa --------- Co-authored-by: Jeadie <jeadie@users.noreply.github.com> * fix: Update tpcds benchmark snapshots for federated/file[parquet].yaml * fix: Update tpcds benchmark snapshots for federated/abfs[parquet].yaml * fix: Update tpcds benchmark snapshots for federated/databricks[delta_lake].yaml * fix: Update tpcds benchmark snapshots for federated/s3[parquet].yaml * fix: Update tpcds benchmark snapshots for accelerated/file[parquet]-arrow.yaml * fix(cayenne): fix Vortex panic on highly compressible data (spiceai#10855) * fix(cayenne): fix Vortex panic on highly compressible columns * Update SHA * Add test * Add test * Fix lint --------- Co-authored-by: Luke Kim <80174+lukekim@users.noreply.github.com> * fix: Update tpcds benchmark snapshots for accelerated/s3[parquet]-arrow-partitioned.yaml * fix: Update tpcds benchmark snapshots for accelerated/on_zero_results/file[parquet]-duckdb[file]-on_zero_results.yaml * fix: Update tpcds benchmark snapshots for accelerated/on_zero_results/file[parquet]-duckdb[memory]-on_zero_results.yaml * fix: Update tpcds benchmark snapshots for accelerated/on_zero_results/file[parquet]-cayenne[file]-on_zero_results.yaml * fix: Update tpcds benchmark snapshots for accelerated/databricks[delta_lake]-arrow.yaml * fix: Update tpcds benchmark snapshots for accelerated/s3[parquet]-cayenne[file].yaml * fix: Update tpcds benchmark snapshots for accelerated/spicecloud-arrow.yaml * fix: Update tpcds benchmark snapshots for accelerated/file[parquet]-cayenne[file].yaml * fix: Update tpcds benchmark snapshots for accelerated/s3[parquet]-arrow.yaml * fix: Update tpcds benchmark snapshots for accelerated/postgres-arrow.yaml * fix: Update clickbench benchmark snapshots for accelerated/s3[parquet]-turso[file].yaml * fix: Update tpch benchmark snapshots for accelerated/s3[parquet]-arrow.yaml * fix: Update tpch benchmark snapshots for accelerated/s3[parquet]-cayenne[file].yaml * fix: Update tpcds benchmark snapshots for accelerated/s3[parquet]-duckdb[file].yaml * fix: Update tpch benchmark snapshots for accelerated/mysql-arrow.yaml * fix: Update tpch benchmark snapshots for accelerated/postgres-arrow.yaml * Update cargo.lock * fix: Update tpch benchmark snapshots for accelerated/mongodb-arrow.yaml * fix: Update tpch benchmark snapshots for federated/mongodb.yaml --------- Co-authored-by: Spice Benchmark Snapshot Update Bot <spiceaibot@spice.ai> Co-authored-by: Phillip LeBlanc <879445+phillipleblanc@users.noreply.github.com> Co-authored-by: Jack Eadie <jack@spice.ai> Co-authored-by: Jeadie <jeadie@users.noreply.github.com> Co-authored-by: Sergei Grebnov <sergei.grebnov@gmail.com> Co-authored-by: Luke Kim <80174+lukekim@users.noreply.github.com>

sgrebnov and others added 2 commits April 20, 2025 23:12

Upgrade DataFusion table providers (#5443)

89b251f

Test spice chat in e2e_test_spice_cli (#5447)

3db8832

* Test spice chat in e2e_test_spice_cli * remove temporary changes

pull Bot added the ⤵️ pull label Apr 21, 2025

pull Bot merged commit 3db8832 into TheRakeshPurohit:trunk Apr 21, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[pull] trunk from spiceai:trunk#38

[pull] trunk from spiceai:trunk#38
pull[bot] merged 2 commits into
TheRakeshPurohit:trunkfrom
spiceai:trunk

pull Bot commented Apr 21, 2025 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

pull Bot commented Apr 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

pull Bot commented Apr 21, 2025 •

edited

Loading