Skip to content

[pull] trunk from spiceai:trunk#751

Merged
pull[bot] merged 3 commits into
TheRakeshPurohit:trunkfrom
spiceai:trunk
Apr 18, 2026
Merged

[pull] trunk from spiceai:trunk#751
pull[bot] merged 3 commits into
TheRakeshPurohit:trunkfrom
spiceai:trunk

Conversation

@pull
Copy link
Copy Markdown

@pull pull Bot commented Apr 18, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

lukekim and others added 3 commits April 18, 2026 11:16
…ion_by` (#10352)

* fix(spicepod): schema now accepts string or {name: expr} for partition_by

The custom partition_by deserializer accepts either a plain expression string
(e.g. "bucket(10, user_id)") or a single-entry object mapping a name to an
expression (e.g. {year: "YEAR(created_at)"}), but the generated JSON schema
derived from the PartitionedBy struct only accepted objects with explicit
'name' and 'expression' fields. This caused editors (yaml-schema) to report
'Incorrect type. Expected PartitionedBy' for valid spicepod configs.

Add a schema-only PartitionedBySchema helper describing both accepted shapes
and apply schemars(with = ...) on the acceleration and vector partition_by
fields, then regenerate .schema/spicepod.schema.json.

* fix(spicepod): schema accepts serde aliases (datasets.mode, eviction_policy)

Serde '#[serde(alias = "...")]' attributes are not reflected in the JSON
schema generated by schemars, so spicepod configs that used an accepted
alias (e.g. 'datasets[].mode' instead of 'datasets[].access') were flagged
as unknown properties by editors when the containing struct uses
'deny_unknown_fields'.

Add a small post-processing pass in the spicepodschema tool that walks the
generated schema and, for each known canonical field, injects an alias
property pointing at the same schema value. Currently covers:
- datasets[].mode -> datasets[].access
- eviction_policy -> caching_policy (cache configs)

Regenerate .schema/spicepod.schema.json.

* fix(spicepod): constrain partition_by named mapping to single entry

Address review feedback: the Named variant of PartitionedBySchema now sets
minProperties=1 / maxProperties=1 so editors flag invalid multi-entry
mappings instead of silently accepting them.

* fix(spicepod): update acceleration parameters to sort and deduplicate accelerator names

* Update docs comments for partition_by, to be correctly generated in
spicepod json schema

* remove schema-only descriptions

* keep only valida partition_by examples

* unify partition_by commebts

* fix(spicepod): address PR review feedback for partition_by schema

- Tighten deserialize_partition_by to error on multi-entry maps,
  non-string values, and unsupported scalar items instead of silently
  accepting them
- Add serialize_with = "serialize_partition_by" to VectorStore.partition_by
  so serialization matches the accepted YAML shapes
- Update partition_by doc comments on Acceleration and VectorStore to
  describe both accepted shapes (string | single-entry map)
- Sort and dedup data connectors and accelerators at collection time for
  deterministic schema output (removes duplicates like duckdb/arrow and
  stabilizes x-spice-connectors ordering)
- Add unit tests for inject_field_aliases covering match, mismatch,
  no-overwrite, recursive injection, and no-op cases
- Add negative tests for deserialize_partition_by rejecting invalid shapes
- Regenerate .schema/spicepod.schema.json

* refactor(tests): simplify error handling and improve readability in partition_by tests

* refactor(collector): improve sorting and deduplication logic for connector registrations

---------

Co-authored-by: ewgenius <hey@ewgenius.me>
…ixes #9872) (#10360)

The TursoBetweenVisitor rewrote numeric BETWEEN expressions to
CAST(... AS REAL) comparisons, but this didn't fix the TPCH Q6 data
correctness bug. The root cause is float arithmetic precision:
0.06 + 0.01 in float64 = 0.06999... which is less than the stored
0.07, so l_discount = 0.07 rows were incorrectly excluded from the
BETWEEN filter.

ROUND(..., 10) normalizes both sides to 10 decimal places, which
eliminates the float arithmetic edge cases. ROUND(0.06 + 0.01, 10)
evaluates to the same float64 as ROUND(0.07, 10), so the comparison
now returns the correct result.

Co-authored-by: Luke Kim <80174+lukekim@users.noreply.github.com>
@pull pull Bot locked and limited conversation to collaborators Apr 18, 2026
@pull pull Bot added the ⤵️ pull label Apr 18, 2026
@pull pull Bot merged commit cf2e98e into TheRakeshPurohit:trunk Apr 18, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants