Commit 4eefd5d

lukekim and ewgenius authored
fix(spicepod): JSON schema accepts string or {name: expr} for partition_by (spiceai#10352)
* fix(spicepod): schema now accepts string or {name: expr} for partition_by

The custom partition_by deserializer accepts either a plain expression string (e.g. "bucket(10, user_id)") or a single-entry object mapping a name to an expression (e.g. {year: "YEAR(created_at)"}), but the generated JSON schema derived from the PartitionedBy struct only accepted objects with explicit 'name' and 'expression' fields. This caused editors (yaml-schema) to report 'Incorrect type. Expected PartitionedBy' for valid spicepod configs.

Add a schema-only PartitionedBySchema helper describing both accepted shapes and apply schemars(with = ...) on the acceleration and vector partition_by fields, then regenerate .schema/spicepod.schema.json.

* fix(spicepod): schema accepts serde aliases (datasets.mode, eviction_policy)

Serde '#[serde(alias = "...")]' attributes are not reflected in the JSON schema generated by schemars, so spicepod configs that used an accepted alias (e.g. 'datasets[].mode' instead of 'datasets[].access') were flagged as unknown properties by editors when the containing struct uses 'deny_unknown_fields'.

Add a small post-processing pass in the spicepodschema tool that walks the generated schema and, for each known canonical field, injects an alias property pointing at the same schema value. Currently covers:

- datasets[].mode -> datasets[].access
- eviction_policy -> caching_policy (cache configs)

Regenerate .schema/spicepod.schema.json.

* fix(spicepod): constrain partition_by named mapping to single entry

Address review feedback: the Named variant of PartitionedBySchema now sets minProperties=1 / maxProperties=1 so editors flag invalid multi-entry mappings instead of silently accepting them.

* fix(spicepod): update acceleration parameters to sort and deduplicate accelerator names

* Update doc comments for partition_by so they are generated correctly in the spicepod JSON schema

* remove schema-only descriptions

* keep only valid partition_by examples

* unify partition_by comments

* fix(spicepod): address PR review feedback for partition_by schema

- Tighten deserialize_partition_by to error on multi-entry maps, non-string values, and unsupported scalar items instead of silently accepting them
- Add serialize_with = "serialize_partition_by" to VectorStore.partition_by so serialization matches the accepted YAML shapes
- Update partition_by doc comments on Acceleration and VectorStore to describe both accepted shapes (string | single-entry map)
- Sort and dedup data connectors and accelerators at collection time for deterministic schema output (removes duplicates like duckdb/arrow and stabilizes x-spice-connectors ordering)
- Add unit tests for inject_field_aliases covering match, mismatch, no-overwrite, recursive injection, and no-op cases
- Add negative tests for deserialize_partition_by rejecting invalid shapes
- Regenerate .schema/spicepod.schema.json

* refactor(tests): simplify error handling and improve readability in partition_by tests

* refactor(collector): improve sorting and deduplication logic for connector registrations

---------

Co-authored-by: ewgenius <hey@ewgenius.me>
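For readers who want a concrete picture of the two accepted partition_by shapes and the tightened error cases described above, here is a minimal, std-only sketch. It is not the code from this commit: the `Value` enum is a hypothetical stand-in for a parsed YAML value, `parse_partition_item` is an illustrative name, and representing the plain-string form with `name: None` is an assumption (the message only says PartitionedBy has 'name' and 'expression' fields).

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a parsed YAML value (the real code uses serde).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Str(String),
    Int(i64),
    Map(BTreeMap<String, Value>),
}

/// Illustrative target type; `name: Option<String>` is an assumption.
#[derive(Debug, Clone, PartialEq)]
struct PartitionedBy {
    name: Option<String>,
    expression: String,
}

/// Accepts a plain expression string or a single-entry {name: expr} map.
/// Rejects multi-entry (or empty) maps, non-string map values, and other scalars.
fn parse_partition_item(item: &Value) -> Result<PartitionedBy, String> {
    match item {
        // Shape 1: "bucket(10, user_id)"
        Value::Str(expr) => Ok(PartitionedBy {
            name: None,
            expression: expr.clone(),
        }),
        // Shape 2: {year: "YEAR(created_at)"}
        Value::Map(map) => {
            let mut entries = map.iter();
            // Pull at most two entries so we can tell "exactly one" from "more".
            match (entries.next(), entries.next()) {
                (Some((name, Value::Str(expr))), None) => Ok(PartitionedBy {
                    name: Some(name.clone()),
                    expression: expr.clone(),
                }),
                (Some(_), None) => {
                    Err("partition_by mapping value must be a string".to_string())
                }
                _ => Err("partition_by mapping must have exactly one entry".to_string()),
            }
        }
        // Any other scalar is rejected rather than silently accepted.
        Value::Int(_) => Err("unsupported partition_by item".to_string()),
    }
}
```

Under these assumptions, the single-entry check plays the same role at parse time that minProperties=1 / maxProperties=1 plays in the JSON schema, so an editor and the deserializer would agree on which configs are valid.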
1 parent ccf11a1 commit 4eefd5d

7 files changed

Lines changed: 2088 additions & 1710 deletions
