Support IPIP-499 UnixFS CID Profiles #941

@lidel

Summary

Add support for IPIP-499 CID profiles (unixfs-v1-2025 and unixfs-v0-2015) to enable deterministic CID generation across IPFS implementations.

Current State

Helia's @helia/unixfs already uses settings close to unixfs-v1-2025:

  • CIDv1, sha2-256, raw leaves
  • 1 MiB chunk size
  • 1024 links per node (DAG width)
  • HAMT fanout of 256

However, the HAMT sharding threshold estimation uses links-bytes (sum of link name + CID lengths) instead of block-bytes (full serialized dag-pb size). This causes CID mismatches when directories are near the 256 KiB threshold.

See: packages/unixfs/src/commands/utils/is-over-shard-threshold.ts

Required Changes

@achingbrain the items below are just broad-strokes prototypes. Feel free to adjust names and APIs to match what is idiomatic in Helia.

1. Update HAMT threshold estimation to block-bytes

The estimateNodeSize() function currently sums link name and CID byte lengths. For unixfs-v1-2025 compliance, it should use the full serialized block size:

// Current (links-bytes):
size += link.Name.length + link.Hash.bytes.byteLength

// Needed (block-bytes):
size = dagPb.encode(node).byteLength

This affects:

  • is-over-shard-threshold.ts
  • any other code checking directory size for HAMT conversion
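To make the difference concrete, here is a rough, self-contained sketch that computes both estimates from entry sizes alone, using plain protobuf size arithmetic for the dag-pb PBNode and PBLink layout. The EntrySizes shape and all function names are hypothetical and are not Helia's API; real code would more likely call dagPb.encode(node).byteLength as shown above. It assumes every link carries a Tsize and the node carries a Data field.

```typescript
// Sizes of one directory entry. All names in this sketch are hypothetical.
interface EntrySizes {
  nameByteLength: number // UTF-8 byte length of the link name
  cidByteLength: number // byte length of the binary CID
  tsize: number // value stored in the link's Tsize field
}

// Number of bytes a non-negative integer occupies as a protobuf varint.
function varintLength (value: number): number {
  let length = 1
  while (value >= 0x80) {
    value = Math.floor(value / 0x80)
    length++
  }
  return length
}

// Size of a length-delimited protobuf field with a one-byte tag.
function delimitedFieldLength (payloadLength: number): number {
  return 1 + varintLength(payloadLength) + payloadLength
}

// links-bytes: sum of name and CID lengths only.
function linksBytes (entries: EntrySizes[]): number {
  let size = 0
  for (const e of entries) {
    size += e.nameByteLength + e.cidByteLength
  }
  return size
}

// block-bytes: size of the whole serialized dag-pb node.
function blockBytes (entries: EntrySizes[], dataByteLength: number): number {
  let size = delimitedFieldLength(dataByteLength) // PBNode.Data
  for (const e of entries) {
    const link =
      delimitedFieldLength(e.cidByteLength) + // PBLink.Hash
      delimitedFieldLength(e.nameByteLength) + // PBLink.Name
      1 + varintLength(e.tsize) // PBLink.Tsize (tag + varint)
    size += delimitedFieldLength(link) // one PBNode.Links entry
  }
  return size
}

// 4500 entries: 20-byte names, 36-byte CIDs (CIDv1 + sha2-256), Tsize 1000.
const entries: EntrySizes[] = []
for (let i = 0; i < 4500; i++) {
  entries.push({ nameByteLength: 20, cidByteLength: 36, tsize: 1000 })
}

const threshold = 256 * 1024 // 262144
console.log(linksBytes(entries), linksBytes(entries) > threshold) // 252000 false
console.log(blockBytes(entries, 2), blockBytes(entries, 2) > threshold) // 292504 true
```

With these made-up sizes the same directory stays basic under links-bytes but is sharded under block-bytes, which is the kind of disagreement that produces different root CIDs near the threshold.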

2. Add profile option

Add a single profile option that applies all relevant settings internally:

// simple usage - just pick a profile
const cidV1 = await fs.addBytes(data, { profile: 'unixfs-v1-2025' })

// or use the legacy profile for CIDv0 compatibility
const cidV0 = await fs.addBytes(data, { profile: 'unixfs-v0-2015' })

Users who need custom behavior can still override individual settings:

// start with a profile, then tweak specific knobs
const cid = await fs.addBytes(data, {
  profile: 'unixfs-v1-2025',
  chunker: fixedSize({ chunkSize: 512 * 1024 })  // override just this one setting
})
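One possible way to implement "profile first, overrides second" is to expand the profile name into a full settings object and then layer any explicitly set options on top. The sketch below is only an illustration: ImportSettings, PROFILES, resolveSettings and every field name are hypothetical, and chunking is reduced to a plain chunkSize number rather than a chunker instance. The values are the ones listed in the Profile Comparison table of this issue.

```typescript
type ProfileName = 'unixfs-v0-2015' | 'unixfs-v1-2025'

// Hypothetical flattened settings; field names are not Helia's API.
interface ImportSettings {
  cidVersion: 0 | 1
  rawLeaves: boolean
  hash: 'sha2-256'
  chunkSize: number // bytes
  maxChildrenPerNode: number // DAG width
  hamtFanout: number
  shardSplitThresholdBytes: number
  hamtSizeEstimation: 'links-bytes' | 'block-bytes'
}

const KiB = 1024
const MiB = 1024 * KiB

const PROFILES: Record<ProfileName, ImportSettings> = {
  'unixfs-v0-2015': {
    cidVersion: 0,
    rawLeaves: false,
    hash: 'sha2-256',
    chunkSize: 256 * KiB,
    maxChildrenPerNode: 174,
    hamtFanout: 256,
    shardSplitThresholdBytes: 256 * KiB,
    hamtSizeEstimation: 'links-bytes'
  },
  'unixfs-v1-2025': {
    cidVersion: 1,
    rawLeaves: true,
    hash: 'sha2-256',
    chunkSize: 1 * MiB,
    maxChildrenPerNode: 1024,
    hamtFanout: 256,
    shardSplitThresholdBytes: 256 * KiB,
    hamtSizeEstimation: 'block-bytes'
  }
}

// Start from the profile, then apply only the overrides that are actually set.
function resolveSettings (profile: ProfileName, overrides: Partial<ImportSettings> = {}): ImportSettings {
  const resolved: ImportSettings = { ...PROFILES[profile] }
  for (const key of Object.keys(overrides) as Array<keyof ImportSettings>) {
    const value = overrides[key]
    if (value !== undefined) {
      Object.assign(resolved, { [key]: value })
    }
  }
  return resolved
}

const tweaked = resolveSettings('unixfs-v1-2025', { chunkSize: 512 * KiB })
console.log(tweaked.chunkSize, tweaked.cidVersion) // 524288 1
```

Note that any override moves the output away from the profile's deterministic CIDs, so it may be worth documenting that a tweaked profile is no longer the profile.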

3. Expose individual knobs for advanced users

The shardSplitThresholdBytes option exists but there's no way to choose between links-bytes and block-bytes estimation. Add a hamtSizeEstimation option so advanced users can control this independently of profiles.
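As a rough idea of the shape this could take, the sketch below models hamtSizeEstimation as a string union and picks the matching measurement. Only the option name comes from this issue; the DirectorySizes type and the selectEstimate helper are hypothetical, and the option is kept required here to avoid guessing what the default should be.

```typescript
// The two estimation modes named in this issue.
type HamtSizeEstimation = 'links-bytes' | 'block-bytes'

// Hypothetical container for both measurements of one directory.
interface DirectorySizes {
  linksBytes: number // sum of link name + CID lengths
  blockBytes: number // full serialized dag-pb block size
}

// Return the measurement that the chosen mode compares against the threshold.
function selectEstimate (sizes: DirectorySizes, mode: HamtSizeEstimation): number {
  switch (mode) {
    case 'links-bytes':
      return sizes.linksBytes
    case 'block-bytes':
      return sizes.blockBytes
  }
}

const sizes: DirectorySizes = { linksBytes: 252000, blockBytes: 292504 }
console.log(selectEstimate(sizes, 'links-bytes')) // 252000
console.log(selectEstimate(sizes, 'block-bytes')) // 292504
```

In practice the implementation would probably compute only the measurement it needs, since encoding the full block is the more expensive of the two.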

4. Add IPIP-499 test vectors

Add tests that verify CIDs match the spec fixtures for both profiles. See the Test fixtures section in IPIP-499 for reference CIDs covering small files, multi-level DAGs, and HAMT threshold boundary cases.

5. Verify HAMT threshold comparison uses >

The threshold comparison should be strictly greater than (>), not >=. A directory exactly at 262144 bytes remains a basic directory.

The current code in is-over-shard-threshold.ts (line 31) uses size > threshold, which is correct as far as I can tell. Double-check whether we flip-flopped on this and decided that Go was the one in the wrong and JS did the right thing.
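A boundary check along these lines could pin the behavior down in a unit test. This is a minimal sketch: the function name and constant are hypothetical and do not mirror the actual signature in is-over-shard-threshold.ts.

```typescript
// 256 KiB threshold from the profile definitions.
const SHARD_THRESHOLD_BYTES = 256 * 1024 // 262144

// Strictly greater than: a directory exactly at the threshold stays basic.
function exceedsShardThreshold (size: number, threshold: number = SHARD_THRESHOLD_BYTES): boolean {
  return size > threshold
}

console.log(exceedsShardThreshold(262143)) // false
console.log(exceedsShardThreshold(262144)) // false, exactly at the threshold
console.log(exceedsShardThreshold(262145)) // true
```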

Profile Comparison

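| Setting         | unixfs-v0-2015 | unixfs-v1-2025 |
|-----------------|----------------|----------------|
| CID Version     | 0              | 1              |
| Raw Leaves      | false          | true           |
| Hash            | sha2-256       | sha2-256       |
| Chunk size      | 256 KiB        | 1 MiB          |
| DAG width       | 174            | 1024           |
| HAMT fanout     | 256            | 256            |
| HAMT threshold  | 256 KiB        | 256 KiB        |
| HAMT estimation | links-bytes    | block-bytes    |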

Labels

  • P1: High: Likely tackled by core team if no one steps up
  • dif/expert: Extensive knowledge (implications, ramifications) required
  • effort/days: Estimated to take multiple days, but less than a week
  • iteration/2026-q1: On maintainer radar for Q1 2026
