[iris] Compress checkpoints with zstd, prune old ones, improve restart UX #4143

Merged
rjpower merged 1 commit into main from claude/trusting-proskuriakova
Mar 25, 2026

@rjpower rjpower commented Mar 25, 2026

Checkpoint sqlite3 files are now compressed with zstandard (level 3) before
upload to remote storage. Downloads prefer .zst but fall back to uncompressed
files for backward compatibility with checkpoints written before this change.
Checkpoints older than 3 days are pruned best-effort after each write.
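
A minimal sketch of the restore fallback and the pruning rule described above, not the code from this PR. The helper names, the key layout, and the idea of passing in a set of existing object keys are all hypothetical; only the `.zst` preference and the 3-day retention come from the description.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional, Set

# Retention window taken from the PR description.
RETENTION = timedelta(days=3)


def pick_checkpoint_key(base_key: str, existing: Set[str]) -> Optional[str]:
    """Prefer the compressed object; fall back to the legacy uncompressed one.

    `base_key` and `existing` are hypothetical stand-ins for a remote
    storage path and a listing of that storage.
    """
    compressed = base_key + ".zst"
    if compressed in existing:
        return compressed
    if base_key in existing:
        return base_key
    return None


def prune_candidates(written_at: Dict[str, datetime], now: datetime) -> List[str]:
    """Return keys of checkpoints strictly older than the retention window."""
    cutoff = now - RETENTION
    return sorted(key for key, ts in written_at.items() if ts < cutoff)


def prune_best_effort(written_at: Dict[str, datetime], delete, now: datetime) -> int:
    """Try to delete each stale checkpoint; a failure never aborts the caller."""
    deleted = 0
    for key in prune_candidates(written_at, now):
        try:
            delete(key)  # hypothetical storage delete callback
            deleted += 1
        except Exception:
            # Best-effort: skip this key and keep going.
            continue
    return deleted
```

Keeping the prune step inside a try/except per key is one way to keep it off the critical path: a storage error leaves stale files behind but does not fail the checkpoint write.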

The controller restart command gains two flags: --skip-checkpoint and
--checkpoint-timeout (default 5 minutes; the timeout was previously hardcoded
to 60s). Progress feedback is printed before the RPC call.
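
For illustration, the flag behavior could look roughly like the sketch below. It uses `argparse` from the standard library, which may not be what the iris CLI uses; the program name, the `send_restart` callback, the messages, and the assumption that the timeout is given in seconds are all hypothetical. Only the two flag names and the 5-minute default come from the description.

```python
import argparse
from typing import Callable, List, Optional

# 5 minutes, per the PR description; the unit (seconds) is an assumption.
DEFAULT_CHECKPOINT_TIMEOUT_S = 300.0


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="controller-restart")
    parser.add_argument(
        "--skip-checkpoint",
        action="store_true",
        help="restart without writing a checkpoint first",
    )
    parser.add_argument(
        "--checkpoint-timeout",
        type=float,
        default=DEFAULT_CHECKPOINT_TIMEOUT_S,
        help="seconds to wait for the checkpoint before giving up",
    )
    return parser


def restart(argv: List[str], send_restart: Callable[[Optional[float]], None]) -> None:
    """Parse flags, print progress, then issue the (hypothetical) restart RPC."""
    args = build_parser().parse_args(argv)
    if args.skip_checkpoint:
        timeout = None
        print("Restarting controller without writing a checkpoint...")
    else:
        timeout = args.checkpoint_timeout
        print(f"Writing checkpoint before restart (timeout: {timeout:.0f}s)...")
    # Progress is printed before the call so the user sees it while waiting.
    send_restart(timeout)
```

Passing `None` for the timeout when the checkpoint is skipped is a design choice of this sketch, not something the PR states.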

@rjpower rjpower added the agent-generated Created by automation/agent label Mar 25, 2026
@rjpower rjpower requested a review from yonromai March 25, 2026 18:00
@yonromai yonromai left a comment

Approved.

I did not find a clear correctness regression in this change. The checkpoint format migration stays backward-compatible on restore, the pruning path is best-effort rather than on the critical path, and the focused checkpoint test file passed locally.

Residual risk is mostly around the new restart CLI flags being validated by static review rather than dedicated CLI tests in this PR.

Generated with Codex.

@rjpower rjpower merged commit 040c1f2 into main Mar 25, 2026
45 of 47 checks passed
@rjpower rjpower deleted the claude/trusting-proskuriakova branch March 25, 2026 18:09
Helw150 pushed a commit that referenced this pull request Apr 8, 2026
…t UX (#4143)
