Archive dump filenames contain colons, breaking cross-platform compatibility and backup integrity #1867

@andreypfau

Summary

Current TON node releases generate archive dump filenames containing a colon (:) in shard identifiers.
Example:

packages/arch0000/archive.00000.0:8000000000000000.pack

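The colon is a reserved character in Windows filenames and needs special handling on several other storage targets, so this format is invalid or non-portable across common environments (Windows, S3, OCI volumes, NTFS-mounted containers). This breaks interoperability between the environments that node operators use.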

Environment

  • Affected components: validator-engine, all archive dump utilities
  • Version: all node versions since per-shard archive segmentation was introduced (starting from commit b8999be)
  • Reproducible: always

Steps to Reproduce

  1. Start validator-engine.
  2. Observe the output filenames under the archive directory.
  3. Attempt to:
      • sync the archives to S3 or GCS;
      • store them on Windows NTFS, Apple File System (APFS), or a CIFS mount;
      • restore them using standard tooling (rsync, scp, backup scripts).

Actual Result

Filenames contain colons and are rejected or silently mangled:

  • : is replaced with _ or %3A, depending on the tool;
  • on Windows and S3, the object write fails outright;
  • on macOS, : is replaced with /;
  • incremental backup scripts (tar, zip) skip affected files.

Expected Result

Archive naming should be POSIX- and S3-compatible, portable across environments and reversible back to shard identifiers without lossy transformations.
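
One way to make the "POSIX-compatible" requirement checkable is to test names against the POSIX portable filename character set (letters, digits, `.`, `_`, `-`, with no leading hyphen). The sketch below is only an illustration; `is_portable_name` is a hypothetical helper, not something that exists in the TON codebase.

```python
import re

# POSIX portable filename character set: A-Z, a-z, 0-9, '.', '_', '-'.
# A portable name should also not start with a hyphen.
_PORTABLE = re.compile(r"[A-Za-z0-9._][A-Za-z0-9._-]*")


def is_portable_name(name: str) -> bool:
    """Return True if `name` (one path component) uses only portable characters."""
    return _PORTABLE.fullmatch(name) is not None


print(is_portable_name("archive.00000.0:8000000000000000.pack"))  # False
print(is_portable_name("archive.00000.0_8000000000000000.pack"))  # True
```

A check like this could run in CI or in a backup pre-flight step to catch non-portable names before they reach storage.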

Impact

  • Cross-platform incompatibility: Windows, macOS, S3, and container volumes cannot store files with colons.
  • Non-reversible backups: Restoring archives changes shard identifiers, breaking deterministic reconstruction.
  • Toolchain breakage: rsync, tar, zstd archivers, and CI pipelines fail on invalid paths.
  • Ecosystem risk: third-party operators and automated tooling (indexers, explorers, validators) become environment-dependent.

Proposed Fix

Backward-compatible normalization of filenames:

  1. get_package_file_name: replace : with _ when writing new archive packages.
    Example: archive.00000.0_8000000000000000.pack
  2. ArchiveSlice::add_package: accept both forms.
      • If : is found, load the package as before.
      • If _ is found, treat it identically.
  3. Migration script: optional one-time rename for existing archives.
  4. Guideline update: document the safe character set for all file names produced in the future.
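
The first three steps can be sketched as follows. This is a Python illustration of the idea, not the actual C++ change: the name layout `archive.<index>.<workchain><sep><16-hex shard>.pack` is inferred from the single example above, the upper-case hex formatting is an assumption, and all three function names are hypothetical.

```python
import os
import re
from typing import List, Optional, Tuple

# Assumed layout, inferred from the example filename in this issue:
#   archive.<index>.<workchain><sep><16 hex digits>.pack
# where <sep> is ':' (legacy) or '_' (proposed).
_NAME = re.compile(r"archive\.(\d+)\.(-?\d+)([:_])([0-9a-fA-F]{16})\.pack")


def package_file_name(index: int, workchain: int, shard: int) -> str:
    """Build a name in the proposed portable form (hex case is an assumption)."""
    return f"archive.{index:05d}.{workchain}_{shard:016X}.pack"


def parse_package_file_name(name: str) -> Optional[Tuple[int, int, int]]:
    """Accept both separators; return (index, workchain, shard) or None."""
    m = _NAME.fullmatch(name)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2)), int(m.group(4), 16)


def migrate(root: str) -> List[Tuple[str, str]]:
    """Rename legacy ':' packages under `root`; return the (old, new) path pairs."""
    renamed = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            m = _NAME.fullmatch(name)
            if m is None or m.group(3) != ":":
                continue  # not a per-shard package, or already migrated
            # Swap only the separator so padding and hex case stay untouched.
            new_name = name[: m.start(3)] + "_" + name[m.end(3):]
            old = os.path.join(dirpath, name)
            new = os.path.join(dirpath, new_name)
            if os.path.exists(new):
                continue  # never overwrite an existing package
            os.rename(old, new)
            renamed.append((old, new))
    return renamed
```

Because the parser maps both separators to the same tuple, a node could read old and new archives side by side, and the rename is safe to re-run: a second pass finds nothing left to migrate.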

Justification

This is not a stylistic concern. Colons in filenames violate the invariants of reproducible archival and cross-platform determinism:

  • POSIX and S3 are baseline targets for archive node operators.
  • File naming must satisfy both, and a colon in a filename fails that requirement.
