[pull] trunk from spiceai:trunk#751
Merged
Merged
Conversation
…ion_by` (#10352) * fix(spicepod): schema now accepts string or {name: expr} for partition_by The custom partition_by deserializer accepts either a plain expression string (e.g. "bucket(10, user_id)") or a single-entry object mapping a name to an expression (e.g. {year: "YEAR(created_at)"}), but the generated JSON schema derived from the PartitionedBy struct only accepted objects with explicit 'name' and 'expression' fields. This caused editors (yaml-schema) to report 'Incorrect type. Expected PartitionedBy' for valid spicepod configs. Add a schema-only PartitionedBySchema helper describing both accepted shapes and apply schemars(with = ...) on the acceleration and vector partition_by fields, then regenerate .schema/spicepod.schema.json. * fix(spicepod): schema accepts serde aliases (datasets.mode, eviction_policy) Serde '#[serde(alias = "...")]' attributes are not reflected in the JSON schema generated by schemars, so spicepod configs that used an accepted alias (e.g. 'datasets[].mode' instead of 'datasets[].access') were flagged as unknown properties by editors when the containing struct uses 'deny_unknown_fields'. Add a small post-processing pass in the spicepodschema tool that walks the generated schema and, for each known canonical field, injects an alias property pointing at the same schema value. Currently covers: - datasets[].mode -> datasets[].access - eviction_policy -> caching_policy (cache configs) Regenerate .schema/spicepod.schema.json. * fix(spicepod): constrain partition_by named mapping to single entry Address review feedback: the Named variant of PartitionedBySchema now sets minProperties=1 / maxProperties=1 so editors flag invalid multi-entry mappings instead of silently accepting them. * fix(spicepod): update acceleration parameters to sort and deduplicate accelerator names * Update docs comments for partition_by, to be correctly generated in spicepod json schema * remove schema-only descriptions * keep only valida partition_by examples * unify partition_by commebts * fix(spicepod): address PR review feedback for partition_by schema - Tighten deserialize_partition_by to error on multi-entry maps, non-string values, and unsupported scalar items instead of silently accepting them - Add serialize_with = "serialize_partition_by" to VectorStore.partition_by so serialization matches the accepted YAML shapes - Update partition_by doc comments on Acceleration and VectorStore to describe both accepted shapes (string | single-entry map) - Sort and dedup data connectors and accelerators at collection time for deterministic schema output (removes duplicates like duckdb/arrow and stabilizes x-spice-connectors ordering) - Add unit tests for inject_field_aliases covering match, mismatch, no-overwrite, recursive injection, and no-op cases - Add negative tests for deserialize_partition_by rejecting invalid shapes - Regenerate .schema/spicepod.schema.json * refactor(tests): simplify error handling and improve readability in partition_by tests * refactor(collector): improve sorting and deduplication logic for connector registrations --------- Co-authored-by: ewgenius <hey@ewgenius.me>
…ixes #9872) (#10360) The TursoBetweenVisitor rewrote numeric BETWEEN expressions to CAST(... AS REAL) comparisons, but this didn't fix the TPCH Q6 data correctness bug. The root cause is float arithmetic precision: 0.06 + 0.01 in float64 = 0.06999... which is less than the stored 0.07, so l_discount = 0.07 rows were incorrectly excluded from the BETWEEN filter. ROUND(..., 10) normalizes both sides to 10 decimal places, which eliminates the float arithmetic edge cases. ROUND(0.06 + 0.01, 10) evaluates to the same float64 as ROUND(0.07, 10), so the comparison now returns the correct result. Co-authored-by: Luke Kim <80174+lukekim@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)
Can you help keep this open source service alive? 💖 Please sponsor : )