Do not provide primary key constraints when no mutations #241

Draft
Jeadie wants to merge 21 commits into trunk from jeadie/26-04-02/no-constraints-on-events

@Jeadie Jeadie commented Apr 2, 2026

No description provided.

* feat: Add custom_image input to debug Spice Cloud workflow

Add a custom_image workflow input to run_spicebench_debug_spice_cloud
that allows specifying a custom runtime container image (e.g.
ghcr.io/spiceai/spiceai-dev:spicebench-sf10) instead of the default
nightly image.

When set, the image reference is parsed into registry, image name, and
tag components and passed through to spidapter as SPIDAPTER_IMAGE_REGISTRY,
SPIDAPTER_IMAGE_NAME, and SPIDAPTER_IMAGE_TAG env vars. The channel is
automatically switched to internal.

Also adds an executor_memory_limit input and fixes NUM_QUERY_CLIENTS to
match the main workflow (2 instead of 8).
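
As an illustration of the image-reference parsing described above, here is a minimal Python sketch. It is not the workflow's actual implementation: the function name `split_image_ref` is hypothetical, and it assumes that the first path component is the registry host and that the reference always carries an explicit tag (digest references are not handled).

```python
def split_image_ref(ref: str) -> dict[str, str]:
    """Split 'registry/name:tag' into the three SPIDAPTER_IMAGE_* values.

    Hypothetical helper; the real workflow may parse the reference differently.
    """
    last_slash = ref.rfind("/")
    colon = ref.rfind(":")
    # A colon that appears before the last slash is a registry port,
    # not a tag separator, so the reference has no tag in that case.
    if colon <= last_slash:
        raise ValueError(f"image reference has no tag: {ref!r}")
    path, tag = ref[:colon], ref[colon + 1:]
    # Assumption: the first path component is the registry host.
    registry, sep, name = path.partition("/")
    if not sep or not registry or not name or not tag:
        raise ValueError(f"expected registry/name:tag, got {ref!r}")
    return {
        "SPIDAPTER_IMAGE_REGISTRY": registry,
        "SPIDAPTER_IMAGE_NAME": name,
        "SPIDAPTER_IMAGE_TAG": tag,
    }


if __name__ == "__main__":
    env = split_image_ref("ghcr.io/spiceai/spiceai-dev:spicebench-sf10")
    for key, value in env.items():
        print(f"{key}={value}")
```

Rejecting untagged references, rather than defaulting to some tag, is a choice made for this sketch only.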

* fix: Run row count validation first in checkpoint validation

Move table row count validation to run as the first phase (Phase 0),
before the probe query. Row count queries are cheap (SELECT COUNT(*))
and immediately surface data loss or duplication without waiting for
expensive analytical queries to converge.
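
The ordering described above can be sketched roughly as follows. This is a Python illustration against an in-memory SQLite database, not the project's validation code: `row_count_mismatches`, `validate_checkpoint`, and the `later_phases` callbacks are hypothetical names, and skipping the later phases when Phase 0 fails is an assumption of this sketch.

```python
import sqlite3
from typing import Callable


def row_count_mismatches(conn: sqlite3.Connection,
                         expected: dict[str, int]) -> dict[str, tuple[int, int]]:
    """Return {table: (actual, expected)} for every table whose count differs."""
    mismatches = {}
    for table, want in expected.items():
        # Quote the identifier so unusual table names cannot break the query.
        quoted = '"' + table.replace('"', '""') + '"'
        (got,) = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()
        if got != want:
            mismatches[table] = (got, want)
    return mismatches


def validate_checkpoint(conn: sqlite3.Connection,
                        expected: dict[str, int],
                        later_phases: list[Callable[[sqlite3.Connection], None]]) -> str:
    """Run the cheap row count check (Phase 0) before any other phase."""
    mismatches = row_count_mismatches(conn, expected)
    if mismatches:
        # Assumption: a count mismatch short-circuits the expensive phases.
        return f"phase 0 failed: {mismatches}"
    for phase in later_phases:
        phase(conn)  # e.g. the probe query and analytical queries
    return "ok"


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER)")
    conn.executemany("INSERT INTO customer VALUES (?)", [(1,), (2,), (3,)])
    print(validate_checkpoint(conn, {"customer": 3}, []))
    print(validate_checkpoint(conn, {"customer": 4}, []))
```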

* chore: Add per-batch operation row count logging for data reconciliation

Log insert/update/delete row counts for each batch written through
write_segments_for_batch. This covers both the initialization phase
and the main ETL run pipeline.

Example log output:
  INFO etl: Writing segments for batch table=customer batch_id=5
    segments=3 insert_rows=8192 update_rows=512 delete_rows=128

This allows post-hoc reconciliation: summing insert_rows - delete_rows
per table should match the expected row count at each checkpoint.
If there is a mismatch, the per-batch logs pinpoint which batch_id
had unexpected operation counts.
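
The reconciliation arithmetic above can be sketched as follows. This is a Python illustration, not code from the PR: the `BatchCounts` record and both helper functions are hypothetical, and the sketch assumes the per-batch counts have already been extracted from the logs. Updates are carried in the record but do not change the net row count.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchCounts:
    """One per-batch log entry (hypothetical structure mirroring the log fields)."""
    table: str
    batch_id: int
    insert_rows: int
    update_rows: int
    delete_rows: int


def net_rows_per_table(batches: list[BatchCounts]) -> dict[str, int]:
    """Sum insert_rows - delete_rows per table; updates leave the count unchanged."""
    totals: dict[str, int] = defaultdict(int)
    for batch in batches:
        totals[batch.table] += batch.insert_rows - batch.delete_rows
    return dict(totals)


def mismatched_tables(batches: list[BatchCounts],
                      expected: dict[str, int]) -> dict[str, tuple[int, int]]:
    """Return {table: (net_from_logs, expected)} where the two disagree."""
    actual = net_rows_per_table(batches)
    return {
        table: (actual.get(table, 0), want)
        for table, want in expected.items()
        if actual.get(table, 0) != want
    }


if __name__ == "__main__":
    batches = [
        BatchCounts("customer", 5, insert_rows=8192, update_rows=512, delete_rows=128),
        BatchCounts("customer", 6, insert_rows=100, update_rows=0, delete_rows=64),
    ]
    print(net_rows_per_table(batches))
    print(mismatched_tables(batches, {"customer": 8000}))
```

When a table shows up as mismatched, the per-batch entries for that table are the place to look for the batch_id with unexpected counts.
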
Jeadie self-assigned this Apr 2, 2026
Jeadie force-pushed the jeadie/26-04-02/no-constraints-on-events branch from 84389f4 to 1cd69d4 April 2, 2026 21:34
Jeadie marked this pull request as draft April 2, 2026 21:47

Jeadie commented Apr 2, 2026

Also contains SPIDAPTER_QUERY_MEMORY_LIMIT=500Gi for testing purposes
