Commit d0f5887

ravwojdyla and claude committed
Validate levanter_batch_size is positive
Reject batch_size < 1 in write_levanter_cache to prevent silent data loss (batch_size=0 would drop all records after the exemplar).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
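
The failure mode described in the message can be illustrated with a minimal sketch. This is not the code from writers.py: `write_batches` is a hypothetical stand-in, and it assumes the writer consumes the first record as an exemplar and then pulls the remainder with `itertools.islice(it, batch_size)`. Under that assumption, a batch size of 0 yields an empty batch on the first pull, which looks the same as exhausted input.

```python
from itertools import islice
from typing import Iterable, Iterator


def write_batches(records: Iterable[dict], batch_size: int) -> list[list[dict]]:
    """Hypothetical writer loop: returns the batches it would flush."""
    it: Iterator[dict] = iter(records)
    try:
        exemplar = next(it)  # first record is consumed as the exemplar
    except StopIteration:
        return []
    flushed = [[exemplar]]
    while True:
        batch = list(islice(it, batch_size))  # empty when batch_size == 0
        if not batch:
            break  # indistinguishable from "input exhausted", so no error
        flushed.append(batch)
    return flushed


records = [{"input_ids": [i]} for i in range(5)]
written_ok = sum(len(b) for b in write_batches(records, batch_size=2))
written_lossy = sum(len(b) for b in write_batches(records, batch_size=0))
```

With `batch_size=2` all five records are flushed; with `batch_size=0` only the exemplar is, and nothing signals the loss.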
1 parent 9647757 commit d0f5887

1 file changed

Lines changed: 4 additions & 0 deletions

lib/zephyr/src/zephyr/writers.py

@@ -381,7 +381,11 @@ def write_levanter_cache(
         records: Tokenized records (iterable of dicts with array values)
         output_path: Path to output cache directory
         metadata: Metadata for the cache
+        batch_size: Number of records to accumulate before flushing to disk.
     """
+    if batch_size < 1:
+        raise ValueError(f"batch_size must be >= 1, got {batch_size}")
+
     from levanter.store.cache import CacheMetadata, SerialCacheWriter
 
     ensure_parent_dir(output_path)
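
The guard added above could be exercised in isolation roughly as follows. This is a sketch only: `check_batch_size` and `rejects` are hypothetical helpers that copy the condition and message from the diff, since calling `write_levanter_cache` itself would require levanter and a real output path.

```python
def check_batch_size(batch_size: int) -> int:
    """Hypothetical helper mirroring the guard added in this commit."""
    if batch_size < 1:
        raise ValueError(f"batch_size must be >= 1, got {batch_size}")
    return batch_size


def rejects(value: int) -> bool:
    """Return True if the guard raises ValueError for this value."""
    try:
        check_batch_size(value)
    except ValueError:
        return True
    return False
```

Failing fast at the top of the function, before any imports or directory creation, means an invalid value is reported before any output is written.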
