new package hash format: $name-$semver-$hash #20178

Open
@andrewrk

Description

Make package hashes generally more user-friendly, so that it is more practical to interact with package directories on the file system, as well as interact with stack traces, debuggers, and other tooling that uses source code paths.

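The current hash format is a hex-encoded multihash SHA-256. The leading 1220 on every entry is the multihash prefix: 0x12 identifies SHA-256 and 0x20 is the digest length (32 bytes).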

It looks like this:

andy@bark ~> ls ~/.cache/zig/p/
122004fa7e2ff0b3d472049743358f8fdf065cdf63bc0e5e3d54c6bb8d81d93e40da
1220060f743248be7cb57396b491a92e63403afb1d28fff6d1ff5fb06124b008a25e
1220138f4aba0c01e66b68ed9e1e1e74614c06e4743d88bc58af4f1c3dd0aae5fea7
122032707cdf94da394e309978146ee33c61a285300eeb916928af376ec1638a95f1
122048db601b6da2c69d0d783b0b19ff132e9a6d69b77820351d11c6e57553ac9433
12204cfebcccb9fb8a5c7b4a6ec663aea691d180f7d346d36f213b4e154a6be1f823
122050e58ca4d57f5e2cc3d6404691d3040bbe41e76e4ef93b52f2105f1157f7d429
122074e0bf09c3622780e697c11c6744e763dd63777e480baf2b583ee3ab6a02ff14
12207c40cefa38fe90e4230dfba2e5c76b37e1ee36602512cad8ff0501f892002a65
12207d353609d95cee9da7891919e6d9582e97b7aa2831bd50f33bf523a582a08547
1220884c1636f0e6dc92b6e74b97a2d25fe240a77bab9fed3af3e1581f80c3e7256f
12208b3c98b4dcc88608e65889abc853f625a06edbb835da90d902bed1ade4da0ac8
12209083b0c43d0f68a26a48a7b26ad9f93b22c9cff710c78ddfebb47b89cfb9c7a4
1220958bf550739591e62cd55fcd2009e72f9bd6c8168ceb7ad7dd8f92dda0b58a4d
122098b31c5b4412780898de969f7014f5c7d693f10acc8168bff86a811061d829da
1220b3e1fb33317c92f9ead09630f6b4be59e80d0a8780754f8aa4ee7da61cb7b47a
1220bee0fcf98bf6ad75b7bb09ff1f873ca38547a15b1e7a4532d20d94107d8d330a
1220c4a15f871f0784113c34e92e57b2862e7f678a467e5d246a6f2ebfadfca8d116
1220d9c400445c9c3ed46f71ebdbc364b7b349473231884c2f6e540817d7b68553ae
1220db11bb50364857ec6047cfcdf0938dea6af3f24d360c6b6a6103364c8e353679
1220dd6f0bbf4614f338d632473e4b0a879ec26eca445ed305dcdbc6b5cb6405e3cd
1220e783088aadba2eb7324e8dce8c6146c888a6835148dbbdc017ec2b6996a7dab8
1220e920d74980c0794a969e1fc0647c863023acbe935ed244a79ff8ec65f2875023
1220f9bd108d1e7097b27d388a7a65effd503598df61e34a2af02be00b22af567fc7
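
For tooling that still has to read these directory names, the current format can be unpacked in a few lines. This is a minimal Python sketch, not the Zig implementation; the function name parse_legacy_hash is hypothetical, and it assumes the multihash prefix is always 0x12 0x20 (SHA-256, 32-byte digest), as in the listing above.

```python
def parse_legacy_hash(text: str) -> bytes:
    """Return the raw 32-byte SHA-256 digest from a hex-encoded multihash."""
    raw = bytes.fromhex(text)  # raises ValueError on non-hex input
    # 0x12 = SHA-256 function code, 0x20 = digest length in bytes (32).
    if len(raw) != 34 or raw[0] != 0x12 or raw[1] != 0x20:
        raise ValueError("not a SHA-256 multihash: " + text)
    return raw[2:]


digest = parse_legacy_hash(
    "122004fa7e2ff0b3d472049743358f8fdf065cdf63bc0e5e3d54c6bb8d81d93e40da"
)
print(len(digest), digest.hex()[:8])
```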

After this proposal, it would look like this instead:

andy@bark ~> ls ~/.cache/zig/p/
nasm-2.16.1-2-BWdcABvF_jM1
libsoundio-2.0.1-7-BmEKAAr47fud
zlib-1.3.1-3-IQwAAPXlgi9M
libffmpeg-7.0.1-3-ReEHBD4IapnL
StaticHttpFileServer-0.0.0-iDYAACr46GhU
mime-1.0.0-TSAAAANL2H_R
EiBQ5Yyk1X9eLMPWQEaR0wQLvkHnbk75O1LyEF8RV_fU
libvorbis-1.3.8-3-NQ8kAD5eWxrE
openssl-3.3.1-1-KLdkAMs-vt5n
EiB9NTYJ2VzunaeJGRnm2Vgul7eqKDG9UPM79SOlgqCF
libvorbis-1.3.8-2-Hw8kALYtGBJ0
mime-2.0.0-jiAAAL-BobCs
mime-2.0.1-hCAAAC1FfNe4
StaticHttpFileServer-0.0.0-ozYAAOnhf9Zq
cpython-3.11.4-cW8vBMOZSHPt
EiCz4fszMXyS-erQljD2tL5Z6A0Kh4B1T4qk7n2mHLe0
EiC-4Pz5i_atdbe7Cf8fhzyjhUehWx56RTLSDZQQfY0z
pulseaudio-16.1.1-2-kVA2ABuZh0op
EiDZxABEXJw-1G9x69vDZLezSUcyMYhML25UCBfXtoVT
StaticHttpFileServer-0.0.0-eTYAAFRBXp0H
libffmpeg-7.0.1-3-7dgHBCZFa3DD
libsoundio-2.0.1-7-7VwKAIRNMw_X
EiDpINdJgMB5SpaeH8BkfIYwI6y+k17SRKef+Oxl8odQ
nasm-2.16.1-2-J2lcAPu-2VWT

This proposal is to change the hash format to $name-$semver-$sizedhash where:

  • name is the name field from build.zig.zon, limited to 32 bytes based on new rules outlined below
  • semver is the version field from build.zig.zon, limited to 32 bytes based on new rules outlined below
  • sizedhash is a 9-byte array (a 4-byte size followed by 5 hash bytes, laid out as described further below), base64 encoded using -_ to make it filesystem safe
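
To illustrate how such a string could be taken apart again, here is a hedged Python sketch. It relies on two properties of the proposal: a name contains no hyphen (it must be a Zig identifier), and 9 bytes always encode to exactly 12 base64 characters. The function name parse_package_hash is hypothetical, the sizedhash layout is the one given further down (4-byte little-endian size, then 5 hash bytes), and only the named form is handled, not the variant for packages without build.zig.zon.

```python
import base64


def parse_package_hash(text: str):
    """Split '$name-$semver-$sizedhash' into (name, semver, size, hash_prefix)."""
    # The name cannot contain '-', so the first hyphen ends it.
    name, sep, rest = text.partition("-")
    # The sizedhash is always the last 12 characters, preceded by a hyphen.
    if not sep or len(rest) < 14 or rest[-13] != "-":
        raise ValueError("unexpected hash shape: " + text)
    semver = rest[:-13]          # may itself contain '-', e.g. "2.16.1-2"
    raw = base64.urlsafe_b64decode(rest[-12:])
    size = int.from_bytes(raw[:4], "little")
    return name, semver, size, raw[4:]


print(parse_package_hash("nasm-2.16.1-2-BWdcABvF_jM1"))
```

Splitting from both ends matters because both the version and the base64 alphabet may contain a hyphen, so a plain split on "-" would be ambiguous.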

Package names gain new rules:

  • Limited to 32 bytes
  • Must be a legal unquoted identifier in Zig source code (/[A-Za-z_][A-Za-z0-9_]*/)
    • Package names therefore will be represented as an enum literal in .zon format. Using a string will be deprecated.
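
The two naming rules can be checked mechanically. The following Python sketch uses the regular expression quoted above; the helper name is_valid_package_name is hypothetical, and the check covers only the length limit and the identifier pattern stated in this proposal.

```python
import re

# Pattern copied from the rule above: a legal unquoted Zig identifier.
_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
MAX_NAME_BYTES = 32


def is_valid_package_name(name: str) -> bool:
    """True if the name fits in 32 bytes and matches the identifier pattern."""
    if len(name.encode("utf-8")) > MAX_NAME_BYTES:
        return False
    return _NAME_RE.fullmatch(name) is not None


for candidate in ["zlib", "StaticHttpFileServer", "my-package", "9lives"]:
    print(candidate, is_valid_package_name(candidate))
```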

The version field gains new rules:

  • Limited to 32 bytes

Packages gain new rules:

  • Limited to total file bytes of 4 GiB or less
    • ...or, should the size field saturate for packages bigger than this?
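
The two options in this open question differ only in what happens when the total exceeds what a u32 can hold (2^32 - 1 bytes, one byte short of 4 GiB). A small Python sketch of both behaviours, with hypothetical function names; neither is presented as the chosen design:

```python
import struct

U32_MAX = 0xFFFF_FFFF  # largest value the 4-byte size field can store


def size_field_strict(total_bytes: int) -> bytes:
    """Option 1: reject packages whose size does not fit in the field."""
    if not 0 <= total_bytes <= U32_MAX:
        raise ValueError("package too large for a u32 size field")
    return struct.pack("<I", total_bytes)


def size_field_saturating(total_bytes: int) -> bytes:
    """Option 2: clamp oversized packages to the maximum value."""
    return struct.pack("<I", min(total_bytes, U32_MAX))


print(size_field_saturating(5 * 1024**3).hex())
```

With saturation, every package above the limit carries the same size bytes, so the size part stops distinguishing them and only the 5 hash bytes do.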

Packages which lack a build.zig.zon file will have a $hashiname-P-$sizedhash scheme instead:

  • hashiname is bytes 5 through 28 of the SHA-256 (the slice [5..][0..24]), encoded with filesystem-safe base64 (fss-base64, i.e. the -_ alphabet), for a total of 32 bytes encoded
  • the semver section is replaced with a hardcoded P which stands for "Pristine Tarball" or whatever you want, really. It acts as a version number so that any future updates to the hash format can tell this hash format apart. Note that "P" is an invalid semver.
  • sizedhash is the following 9-byte array, fss-base64-encoded
    • (4 bytes) LE u32 total decompressed size in bytes
    • (5 bytes) first 5 bytes of the SHA-256 of hashed files of the package

The hash is broken up this way so that "sizedhash" can be calculated exactly the same way in both cases, and so that "name" and "hashiname" can be used interchangeably in both cases.
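
The byte layout above can be sketched as follows. This is an illustrative Python version under stated assumptions, not the Zig implementation: the helper names (fss_b64, sized_hash, hashiname, pristine_dir_name) are hypothetical, and the digest is taken as an input because how the package's files are hashed is outside the scope of this sketch. The demo digest is a stand-in computed from a fixed string.

```python
import base64
import hashlib
import struct


def fss_b64(data: bytes) -> str:
    """Filesystem-safe base64: the '-' and '_' alphabet, without padding."""
    return base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")


def sized_hash(total_size: int, digest: bytes) -> str:
    """4-byte little-endian u32 size, then the first 5 digest bytes."""
    return fss_b64(struct.pack("<I", total_size) + digest[:5])


def hashiname(digest: bytes) -> str:
    """Digest bytes [5..][0..24], i.e. bytes 5 through 28 (32 chars encoded)."""
    return fss_b64(digest[5:29])


def pristine_dir_name(total_size: int, digest: bytes) -> str:
    """Name for a package that lacks build.zig.zon."""
    return f"{hashiname(digest)}-P-{sized_hash(total_size, digest)}"


# Stand-in digest for demonstration only; not a real package hash.
demo_digest = hashlib.sha256(b"example package contents").digest()
print(f"zlib-1.3.1-{sized_hash(123456, demo_digest)}")
print(pristine_dir_name(123456, demo_digest))
```

Because sized_hash uses digest bytes 0 to 4 and hashiname uses bytes 5 to 28, the two parts of a pristine name never repeat the same hash bytes.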

Related Future Work

Compatibility

Let's try to keep compatibility with the old hash format for at least 1 release cycle, so that there is 1 release cycle that supports both the old and new format at the same time.
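
One way a tool could accept both formats during that overlap is to classify a hash string by its shape before interpreting it. The sketch below is a hypothetical Python helper, not the planned implementation. It relies on the observation that an old-format hash is exactly 68 lowercase hex characters starting with 1220 and contains no hyphen, while every new-format hash contains at least two hyphens.

```python
import re

# Old format: multihash prefix 1220 followed by 64 hex digits of SHA-256.
_LEGACY_RE = re.compile(r"1220[0-9a-f]{64}")


def hash_format(text: str) -> str:
    """Return 'legacy' for the hex multihash format, otherwise 'new'."""
    if _LEGACY_RE.fullmatch(text):
        return "legacy"
    return "new"


print(hash_format("1220f9bd108d1e7097b27d388a7a65effd503598df61e34a2af02be00b22af567fc7"))
print(hash_format("zlib-1.3.1-3-IQwAAPXlgi9M"))
```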

Labels

  • accepted: This proposal is planned.
  • proposal: This issue suggests modifications. If it also has the "accepted" label then it is planned.
  • zig build system: std.Build, the build runner, `zig build` subcommand, package management
