feat: dynamic partition conversion #57

Draft
ykrevnyi wants to merge 2 commits into thanos-io:main from ykrevnyi:feature/dynamic-partitioning

Two-loop Partition Conversion

This PR refactors the converter to support flexible sub-daily partitions alongside daily historical conversion, so that recent metrics become available sooner.

Key Changes

  • Dual conversion loops: Separate loops for historical (daily) and partition (sub-daily) conversion that run independently
  • Configurable partition duration: Sub-daily partitions (e.g., 2h) for fresher recent data
  • New partition path format: 2025/01/19/parts/00-02 for sub-daily, 2025/01/19 for daily
  • Automatic cleanup: Once a daily block exists, the partitions it supersedes are removed from object storage, serve-pod memory, and local disk

Partition Loop

Handles recent data within the lookback window (by default, the 24h leading up to now).

Behavior:

  • Converts TSDB blocks into sub-daily partition files (e.g., 2h chunks)
  • Only processes days that don't yet have daily parquet blocks
  • Runs more frequently (default 30m) for fresher data availability
  • Path format: YYYY/MM/DD/parts/HH-HH (e.g., 2025/01/19/parts/00-02)

Flags:

| Flag | Default | Description |
|------|---------|-------------|
| --convert.partition.run-interval | 30m | How often to run |
| --convert.partition.run-timeout | 2h | Max duration per cycle |
| --convert.partition.duration | 2h | Size of each partition |
| --convert.partition.lookback | 24h | Window from now to consider |
| --convert.partition.max-steps | 1 | Max partitions per cycle |

Historical Loop

Handles completed days that are fully in the past.

Behavior:

  • Converts TSDB blocks into daily (24h) parquet files
  • Runs less frequently (default 1h) since data is not time-sensitive
  • Path format: YYYY/MM/DD (e.g., 2025/01/19)

Flags:

| Flag | Default | Description |
|------|---------|-------------|
| --convert.historical.run-interval | 1h | How often to run |
| --convert.historical.run-timeout | 24h | Max duration per cycle |
| --convert.historical.max-steps | 1 | Max days per cycle |

Running Modes

The --convert.mode flag controls which loops run:

| Mode | Description |
|------|-------------|
| both (default) | Runs both partition and historical loops concurrently |
| partition | Only runs the partition loop (fresh data only) |
| historical | Only runs the historical loop (daily blocks only) |

Each mode creates a separate discoverer per loop, so the loops never access the same map concurrently.


Serve

Query-time filtering prevents duplicate data when both partitions and daily blocks exist:

  • filterOverlappingPartitions() removes partition blocks from queries if a daily block exists for the same date
  • Example: If both 2025/01/19 and 2025/01/19/parts/00-02 exist, only the daily block is queried
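The filtering rule can be sketched on plain path strings. This is an illustration of the rule only: the PR's filterOverlappingPartitions() presumably works on block metadata rather than strings, and the names `dayOf` and `filterSuperseded` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// dayOf returns the YYYY/MM/DD prefix of a block path and reports whether
// the path is a sub-daily partition (YYYY/MM/DD/parts/HH-HH).
func dayOf(path string) (day string, isPartition bool) {
	if i := strings.Index(path, "/parts/"); i >= 0 {
		return path[:i], true
	}
	return path, false
}

// filterSuperseded drops every partition whose day also has a daily block.
// Input order is preserved for the paths that are kept.
func filterSuperseded(paths []string) []string {
	daily := make(map[string]struct{})
	for _, p := range paths {
		if day, isPart := dayOf(p); !isPart {
			daily[day] = struct{}{}
		}
	}
	kept := make([]string, 0, len(paths))
	for _, p := range paths {
		day, isPart := dayOf(p)
		if _, ok := daily[day]; isPart && ok {
			continue // superseded by the daily block for the same date
		}
		kept = append(kept, p)
	}
	return kept
}

func main() {
	blocks := []string{
		"2025/01/19",
		"2025/01/19/parts/00-02",
		"2025/01/20/parts/00-02",
	}
	fmt.Println(filterSuperseded(blocks))
	// [2025/01/19 2025/01/20/parts/00-02]
}
```

Partitions for a day without a daily block pass through untouched, which is what keeps recent data queryable until the historical loop catches up.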

Cleanup

Once a daily block is created, partitions for that day become redundant. Three layers ensure they're cleaned up:

| Layer | Where | When | What |
|-------|-------|------|------|
| Bucket | S3/GCS | After historical conversion | Deletes partition files from object storage |
| Syncer | Serve pod memory + disk | On sync interval | Evicts blocks from memory, deletes local label files |
| Query-time | Query execution | Every query | Filters out partitions if daily block exists (fallback) |

Bucket Cleanup

Runs after historical conversion via cleanupStalePartitions(). Scans for partitions that have a corresponding daily block and deletes all files under YYYY/MM/DD/parts/. Only days before yesterday are cleaned, which leaves a 24-48h safety window so that serve pods have discovered the daily block before its partitions disappear.

Syncer Cleanup

Runs on each sync cycle in serve pods. Detects partitions superseded by daily blocks, removes them from the in-memory block map, and deletes cached label files from local disk. Without this, serve pods would accumulate stale block references and orphaned label files, wasting memory and disk.

Query-time Filtering

filterOverlappingPartitions() runs at query time as a safety net. Excludes partition blocks if a daily block exists for the same date, preventing duplicate data during the transition window. This handles the race condition between daily block creation and syncer discovery, since different pods sync at different times.


Latest Evaluation

Partition duration: 2h
Run interval: 30m

Converted Partitions (2026/02/10)

2026/02/10/parts/00-02  ✓
2026/02/10/parts/02-04  ✓
2026/02/10/parts/04-06  ✓
2026/02/10/parts/06-08  ✓
2026/02/10/parts/08-10  ✓
2026/02/10/parts/10-12  ✓
2026/02/10/parts/12-14  ✓
2026/02/10/parts/14-16  ✓
2026/02/10/parts/16-18  ✓
2026/02/10/parts/18-20  ✓
2026/02/10/parts/20-22  ← converting
2026/02/10/parts/22-24  (pending, grace period)

@ykrevnyi force-pushed the feature/dynamic-partitioning branch from baca0ca to c8f4f64 on February 11, 2026 19:24
@ykrevnyi force-pushed the feature/dynamic-partitioning branch from c8f4f64 to 84c6d7d on February 11, 2026 21:38