UDF Checkpoints cleanup#1454

Closed
ilongin wants to merge 188 commits into main from ilongin/1453-checkpoint-udf-cleanup

ilongin (Contributor) commented Nov 9, 2025

Implements checkpoint cleanup with TTL-based garbage collection:

  • cleanup_checkpoints() method in Catalog to remove old checkpoints
  • _remove_checkpoint() helper to clean checkpoint and its UDF tables
  • list_checkpoints() enhancement to support time-based filtering
  • get_descendant_job_ids() to find child jobs for dependency checking
  • remove_checkpoint() in metastore to delete checkpoint records
  • CLI integration in garbage_collect command
  • Comprehensive tests for cleanup scenarios

This builds on top of the core checkpoint functionality to provide automatic cleanup of outdated checkpoints and their associated tables.
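The TTL-based flow described above can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not the code from this PR: `ToyMetastore`, the `Checkpoint` dataclass, the `udf_tables` mapping, the `now` parameter, and the `DEFAULT_TTL` value are all hypothetical, and only the method names `cleanup_checkpoints`, `list_checkpoints`, and `remove_checkpoint` and the `created_before`/`created_after` filters are taken from the description.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

# Hypothetical default; the PR description does not state the real TTL value.
DEFAULT_TTL = timedelta(hours=4)


@dataclass
class Checkpoint:
    id: int
    job_id: str
    created_at: datetime


@dataclass
class ToyMetastore:
    """In-memory stand-in for the metastore (an assumption, not the real class)."""

    checkpoints: dict = field(default_factory=dict)  # checkpoint id -> Checkpoint
    udf_tables: dict = field(default_factory=dict)   # checkpoint id -> table names

    def list_checkpoints(self, created_before=None, created_after=None):
        """Return checkpoints, optionally restricted to a creation-time window."""
        result = []
        for ch in self.checkpoints.values():
            if created_before is not None and ch.created_at >= created_before:
                continue
            if created_after is not None and ch.created_at <= created_after:
                continue
            result.append(ch)
        return result

    def remove_checkpoint(self, checkpoint_id):
        """Delete the checkpoint record if it exists."""
        self.checkpoints.pop(checkpoint_id, None)


def cleanup_checkpoints(store, ttl=DEFAULT_TTL, now=None):
    """Remove checkpoints older than `ttl` together with their UDF tables."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - ttl
    removed = []
    # list_checkpoints returns a new list, so deleting while looping is safe.
    for ch in store.list_checkpoints(created_before=cutoff):
        store.udf_tables.pop(ch.id, None)  # drop the associated UDF tables first
        store.remove_checkpoint(ch.id)
        removed.append(ch.id)
    return removed


# Usage: one checkpoint is older than the TTL, one is recent.
now = datetime(2025, 11, 9, 12, 0, tzinfo=timezone.utc)
store = ToyMetastore()
store.checkpoints[1] = Checkpoint(1, "job-a", now - timedelta(hours=10))
store.checkpoints[2] = Checkpoint(2, "job-a", now - timedelta(minutes=5))
store.udf_tables[1] = ["udf_job_a_1"]
removed = cleanup_checkpoints(store, now=now)
```

In this sketch only checkpoint 1 falls before the cutoff, so it and its table entry are dropped while checkpoint 2 is kept. The dependency checks mentioned in the description are deliberately left out here.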

Related to #1453

Summary by Sourcery

Add TTL-based garbage collection for UDF checkpoints to prevent storage bloat while preserving dependencies.

New Features:

  • Implement Catalog.cleanup_checkpoints to automatically remove outdated checkpoints and their UDF tables based on a configurable TTL with branch pruning logic.
  • Add Catalog._remove_checkpoint helper to delete individual checkpoints and associated job-specific UDF tables.
  • Extend metastore with get_descendant_job_ids, remove_checkpoint, and time-based list_checkpoints filters (created_before and created_after).
  • Integrate checkpoint cleanup into the CLI garbage_collect command.

Tests:

  • Add functional tests for default TTL cleanup, custom TTL, no-op when no old checkpoints exist, preservation of checkpoints with active descendants, partial job cleanup within a single job, full branch pruning of outdated lineages, and descendant job ID retrieval.
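The dependency check behind "preservation of checkpoints with active descendants" might look roughly like the sketch below. It assumes job lineage is available as a plain `parent_of` mapping and that a job counts as "active" when it still has a checkpoint newer than the TTL cutoff; both are assumptions made for illustration. `prunable_jobs` is a hypothetical helper, and this `get_descendant_job_ids` takes a mapping argument that the real metastore method may not.

```python
from collections import deque


def get_descendant_job_ids(parent_of, job_id):
    """Return all descendants of `job_id`, breadth-first.

    `parent_of` maps each job id to its parent job id (or None for a root).
    """
    children = {}
    for child, parent in parent_of.items():
        if parent is not None:
            children.setdefault(parent, []).append(child)

    found, seen, queue = [], {job_id}, deque([job_id])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:  # guard against malformed (cyclic) lineage
                seen.add(child)
                found.append(child)
                queue.append(child)
    return found


def prunable_jobs(parent_of, active_jobs):
    """Jobs whose whole subtree (the job plus its descendants) has no active job."""
    result = []
    for job_id in parent_of:
        subtree = [job_id, *get_descendant_job_ids(parent_of, job_id)]
        if not any(j in active_jobs for j in subtree):
            result.append(job_id)
    return result


# Usage: "b" is still active, so its ancestors "a" and "root" are preserved,
# while the unrelated outdated branch "c" can be pruned.
parent_of = {"root": None, "a": "root", "b": "a", "c": "root"}
descendants = get_descendant_job_ids(parent_of, "root")
prunable = prunable_jobs(parent_of, active_jobs={"b"})
```

Under these assumptions a lineage is pruned only when nothing below it is still in use, which is one way to read the "full branch pruning of outdated lineages" scenario.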

Base automatically changed from ilongin/1392-udf-checkpoints to main February 15, 2026 03:06
ilongin (Contributor, Author) commented Feb 16, 2026

Closed, as I opened a new one (it was easier to handle conflicts that way after rebasing).

ilongin closed this Feb 16, 2026