Version
Current main (0.3.x as of 2026-05-28)
Platform
Any — reproducible on all platforms
Description
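NumberChunk::format_item hardcodes {:0>8} for block numbers in output filenames. In Rust's formatter the width is a minimum, not a maximum, so longer values are never truncated: from block 100,000,000 onward the number is silently written with 9 digits and the filenames are no longer fixed-width: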
bsc_mainnet__transactions__99995000_to_99999999.parquet ← 8 digits ✓
bsc_mainnet__transactions__100000000_to_100004999.parquet ← 9 digits ✗
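This breaks lexicographic sort. Any pipeline that sorts output files by name (S3 prefix listings, ls, glob) will process the 9-digit files out of order: 100000000_to_100004999 sorts before 99995000_to_99999999, and in general the block 100M files are mixed in among the 8-digit files for blocks 10M and up instead of following them.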
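The ordering problem can be reproduced in isolation with a minimal sketch. The chunk_name helper below is hypothetical: it only imitates the filename pattern shown above and is not the project's actual NumberChunk::format_item code.

```rust
/// Hypothetical stand-in for the filename pattern in this report,
/// using the same hardcoded `{:0>8}` width.
fn chunk_name(start: u64, end: u64) -> String {
    format!(
        "bsc_mainnet__transactions__{:0>8}_to_{:0>8}.parquet",
        start, end
    )
}

/// Builds the two filenames around the 100M boundary and sorts them
/// the way `ls` or an S3 prefix listing would (byte-wise).
fn sorted_names() -> Vec<String> {
    let mut names = vec![
        chunk_name(99_995_000, 99_999_999),
        chunk_name(100_000_000, 100_004_999),
    ];
    names.sort();
    names
}

fn main() {
    // The 9-digit name is printed first because '1' sorts before '9'.
    for name in sorted_names() {
        println!("{name}");
    }
}
```

With a width of 8 the later chunk is listed first, which is the misordering described above.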
BSC crossed block 100,000,000 in May 2026 and is currently affected. Polygon is approaching ~70M blocks and will be next.
Why changing the constant is wrong
Replacing {:0>8} with {:0>9} or {:0>16} only swaps one hardcoded limit for another: it changes the filenames of every existing user on upgrade, and it requires another migration when the next chain overflows.
Proposed fix
Add a --block-number-pad-width <N> CLI argument. The default of 8 preserves current behaviour exactly (no breaking change for existing users). Operators on high-block-number chains opt in by passing --block-number-pad-width 9.
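As a rough sketch of the idea, Rust's formatter accepts the width as a runtime argument via the `width$` syntax, so the pad width can come from a parsed option instead of a literal. The format_block function and its width parameter are assumptions for illustration only; they do not reflect the real signature of NumberChunk::format_item, and CLI parsing is omitted.

```rust
/// Hypothetical helper: left-pads a block number with zeros to at least
/// `width` characters. `width` would come from --block-number-pad-width.
fn format_block(n: u64, width: usize) -> String {
    format!("{:0>width$}", n, width = width)
}

fn main() {
    // Default width 8: output identical to the current hardcoded format.
    println!("{}", format_block(99_995_000, 8)); // 99995000
    println!("{}", format_block(100_000_000, 8)); // 100000000

    // Opt-in width 9: both numbers have the same length, so string order
    // matches numeric order again.
    let before = format_block(99_995_000, 9); // 099995000
    let after = format_block(100_000_000, 9); // 100000000
    println!("{before} < {after}: {}", before < after);
}
```

Because the width is still a minimum, a value that exceeds the configured width is written in full, exactly as it is today.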
I have an implementation ready if the approach sounds good.