implement new package hash format: `$name-$semver-$hash` #22994

andrewrk · 2025-02-24T00:35:57Z

Legacy format is also supported.

closes #20178

Tested building ffmpeg with a cache of old packages
Tested with a wiped cache
Tested updating only one dependency to new hash format
Tested deleting hash field and getting the "expected hash" error
Tested name using string literal valid and invalid, enum literal with and without @
Tested zig fetch on legacy package

Although this change allows the previous hash format, it is breaking because it institutes the new package naming rules outlined in the linked issue. In practice, I expect this to affect only those who put "-" in the package name.

Followup Issues

investigate case sensitivity on Windows. I think this is justified to be a follow-up issue because even with case insensitivity, it only means 118 bits rather than 200 (200 * ((64-26) / 64)). Issue is fixable with an additional check at worst, marking the directory case sensitive at best.
allow modifying name, version, paths for naked package URLs (this will affect the hash)
allow omitting paths field, which makes the package unfetchable. Then disallow .paths = .{""}. Add basic globbing support.
allow omitting name, nonce, and version fields when paths field is omitted (unfetchable package). Do this for zig's own build.zig.zon

Rexicon226 · 2025-02-24T03:31:09Z

src/Package/Manifest.zig

+        if (p.allow_name_string and node_tags[node] == .string_literal) {
+            const name = try parseString(p, node);
+            if (!std.zig.isValidId(name))
+                return fail(p, main_token, "name must be a valid bare zig identifier (hint: switch from string to enum literal)", .{});


Maybe I'm missing something here, but why? We have @"" specifically for these use cases.

#20178 (comment)

Hmm, fair enough!

xdBronch · 2025-02-24T03:39:53Z

Legacy format is also supported.

indefinitely? maybe soft deprecate with a tool to migrate zon files automatically?

andrewrk · 2025-02-24T03:49:06Z

I will integrate it with #22898 when that patchset lands.

ifreund

I still think that using - as the separator here is a mistake. Since the semver spec also uses - as a separator, splitting the $name-$semver-$hash string on - is a bug while it wouldn't have to be if we chose a different separator that is neither allowed in bare zig identifiers nor semver versions.

One possible separator is the tilde character, ~ which is safe for use in file names on Linux, MacOS, and Windows at the very least (as long as a tilde is not the first character, which is not the case here).

Other options that come to mind are the comma ,, the octothorpe #, the percent sign %, and the caret ^.

Aesthetically, I personally find either the tilde ~ or the comma , most pleasing, though I would prefer any of the options I listed over - from a technical standpoint.

cryptocode · 2025-02-24T09:51:02Z

Since the semver spec also uses - as a separator, splitting the $name-$semver-$hash string on - is a bug while it wouldn't have to be if we chose a different separator that is neither allowed in bare zig identifiers nor semver versions.

The base64 hash part may also contain - and _

But what's the use-case for parsing this, especially since components are truncated? The ambiguity in the format might be a feature so people don't start depending on a specific format.

ifreund · 2025-02-24T10:38:26Z

src/Package.zig

+    /// * sizedhash is the following 9-byte array, base64 encoded using -_ to make
+    ///   it filesystem safe:
+    ///   - (4 bytes) LE u32 total decompressed size in bytes
+    ///   - (5 bytes) truncated SHA-256 of hashed files of the package
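
As a rough illustration of the layout in this doc comment, the Python sketch below packs a 4-byte little-endian size and the first 5 bytes of a SHA-256 digest, then encodes the 9 bytes with the URL-safe base64 alphabet (`-` and `_`). The function name `sized_hash_sketch` and the choice to hash a single byte string are assumptions made for illustration; the hunk does not show how the package's files are actually fed into SHA-256.

```python
import base64
import hashlib
import struct


def sized_hash_sketch(total_size: int, hashed_input: bytes) -> str:
    """Encode (LE u32 size, 5-byte truncated SHA-256) as URL-safe base64.

    `hashed_input` stands in for whatever bytes the real implementation
    hashes; that detail is an assumption in this sketch.
    """
    size_bytes = struct.pack("<I", total_size)           # 4 bytes, little-endian u32
    digest5 = hashlib.sha256(hashed_input).digest()[:5]  # truncate to 40 bits
    # 9 bytes is a multiple of 3, so the base64 output carries no '=' padding.
    return base64.urlsafe_b64encode(size_bytes + digest5).decode("ascii")


encoded = sized_hash_sketch(123_456, b"example package contents")
raw = base64.urlsafe_b64decode(encoded)
print(encoded, len(encoded), struct.unpack("<I", raw[:4])[0])
```

Nine bytes always encode to 12 characters, and only 40 of those 72 bits come from SHA-256.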


Is truncating the sha256 hash to 40 bits really ok? My understanding is that while truncating the result of a "good" hash function like sha256 does not reduce the "strength" of the hash (i.e. difficulty of finding the pre-image for a given hash), it does significantly increase the likelihood of collisions.

I am by no means an expert, but it seems like truncating to 40 bits puts a Birthday attack in the realm of possibility.

Either way, I think it is important to have someone who has a much smaller set of cryptography "unknown unknowns" than you or I think about this.

Regardless of whether or not it is ok to use only 40 bits, we should really document the reason of why it is ok or why it is not ok in this comment.

I agree with this. Even when considering name, semver and decompressed size as part of the hash, this doesn't make it much harder to find collisions.

Suppose someone generates 2^20 good versions of a package, e.g. by modifying code comments, changing variable names or whatever, but keeping name, semver and total size the same. With 40 bit hashes, there will be few collisions, so there will be about 2^20 distinct hashes. Now they make a malicious change to the package again while keeping name, semver, decompressed size the same. They generate 2^20 malicious versions and now the chance of a good version and a malicious version sharing the same hash is about 1-(1-2^(-20))^(2^20), which is around 63%. They could ship the good version first and switch to the malicious version after any code reviews are done.

It doesn't take that long to generate and compute the hashes of two million minor modifications of a package.
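
The estimate above can be checked numerically. This is a minimal sketch that assumes truncated hashes are uniformly distributed and that the good variants all have distinct hashes; the function name is made up for illustration.

```python
import math


def collision_probability(hash_bits: int, good_variants: int, bad_variants: int) -> float:
    """Chance that at least one bad variant shares a hash with some good variant.

    Assumes uniformly distributed hashes and pairwise-distinct good hashes.
    """
    # Probability that one particular bad variant hits any of the good hashes.
    p_single = good_variants / 2**hash_bits
    # 1 - (1 - p)^n, evaluated with log1p/expm1 so tiny probabilities survive.
    return -math.expm1(bad_variants * math.log1p(-p_single))


p40 = collision_probability(40, 2**20, 2**20)
p192 = collision_probability(192, 2**20, 2**20)
print(f"40-bit hash:  {p40:.3f}")
print(f"192-bit hash: {p192:.3e}")
```

With 40 bits the result is about 1 - 1/e, matching the "around 63%" figure. The 192-bit line uses the same 2^20 variants per side and comes out vanishingly small.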

I find these arguments convincing, and, thinking about it, now believe that we probably shouldn't be truncating the hash at all. It's an added risk for at best dubious benefit.

Also note that Windows is case-insensitive, so the likelihood of running into bugs from conflicting base64url-encoded hashes is even higher.

@castholm Case-insensitivity applies to access, but case information is still stored on all supported types of file systems, right?
So zig can specifically check the case-sensitive name of the directory in the path after opening the directory handle, safely retrieve/depend on that information, and error on case mismatch, right?
This does require extra logic (and potentially an extra syscall) though.

how about sha-256 truncated to 192 bits?

how about sha-256 truncated to 192 bits?

I think 192 bits should be sufficient to make the birthday attack described above infeasible. This changes the order of magnitude of the variants that would need to be brute-force checked from 2^20 to 2^96.

A few million is a small number for a computer, a few octillion is definitely not small. To put that in perspective, if a 5GHz CPU core checked one variant per cycle it would take 500 billion years to check 2^96 variants.
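
The arithmetic behind that figure, as a small sketch. It takes the comment's own assumption of one variant checked per cycle on a single 5 GHz core, which is far faster than hashing a real package would allow.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
CHECKS_PER_SECOND = 5e9  # one variant per cycle at 5 GHz (the comment's assumption)


def years_to_check(variants: int) -> float:
    """Wall-clock years needed to check `variants` candidates on one core."""
    return variants / CHECKS_PER_SECOND / SECONDS_PER_YEAR


years_96 = years_to_check(2**96)
seconds_20 = 2**20 / CHECKS_PER_SECOND
print(f"2^96 variants: {years_96:.2e} years")
print(f"2^20 variants: {seconds_20:.2e} seconds")
```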

A minor point, but if you still end up using Base32, then truncating to a multiple of 5 bits makes sense.

a multiple of 5 bits

For instance, 200 bits are a whole number of bytes and a multiple of 5.

src/Package.zig

castholm · 2025-02-24T17:06:00Z

I still think that using - as the separator here is a mistake. Since the semver spec also uses - as a separator, splitting the $name-$semver-$hash string on - is a bug while it wouldn't have to be if we chose a different separator that is neither allowed in bare zig identifiers nor semver versions.

This is only a problem if being able to split on a single character is an explicit design goal (e.g. using std.mem.splitScalar(u8, hash, '-')). Both hash formats introduced in this PR can be unambiguously told apart and broken down into their respective components:

If the hash contains ., it's a named hash, otherwise, it's naked.
For named hashes, everything up until the first - is the name, the last 9 bytes is the sizedhash, and hash[(name.len + 1)..(hash.len - 10)] is the semver.
For naked hashes, the last 9 bytes is the sizedhash, and hash[0..(hash.len - 12)] is the hashiname.

castholm · 2025-02-24T17:13:32Z

src/Package.zig

+    /// * name is the name field from build.zig.zon, truncated at 32 bytes and must
+    ///   be a valid zig identifier
+    /// * semver is the version field from build.zig.zon, truncated at 32 bytes


Why truncate the package name and version instead of hard-limiting them to 32 bytes and checking the lengths when validating the manifest?

andrewrk · 2025-02-24T19:51:50Z

Where were all these great points during the issue discussion in June! 🙂

nektro · 2025-02-24T20:24:59Z

#20178 (comment)
#20178 (comment)

andrewrk · 2025-02-24T21:08:10Z

ehh the disagreement about - is not important. This is a hash - a one way function, by definition - it is not meant to be parsed, as pointed out by @cryptocode. Even so, if it makes @ifreund happy then I don't see any reason not to use ~. That's a trivial change that does not make me do any extra work, really.

I think the main challenge here is the file system case insensitivity on Windows. I can't believe I didn't consider that until now. Yeesh.

cbilz · 2025-02-24T21:20:18Z

I think the main challenge here is the file system case insensitivity on Windows.

Base 32 encoding is just 20% longer, but I'm not sure about what other constraints there are.

andrewrk · 2025-02-24T21:20:43Z

I feel like I tried this once but unfortunately I don't remember the results of my investigation. It looks like you can omit OBJ_CASE_INSENSITIVE when dealing with files on Windows.

andrewrk · 2025-02-24T22:24:16Z

>>> from math import ceil
>>> 4 * 'm' + 64 * 'h' # status quo length
'mmmmhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh'
>>> 32 * 'n' + '-' + 32 * 'v' + '-' + ceil((32 * 8) / 4) * 'h' # new max length with hex-encoded hash, no size
'nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn-vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv-hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh'
>>> 32 * 'n' + '-' + 32 * 'v' + '-' + ceil((32 * 8) / 5) * 'h' # new max length with base32-encoded hash, no size
'nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn-vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv-hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh'
>>> 32 * 'n' + '-' + 32 * 'v' + '-' + ceil((32 * 8) / 6) * 'h' # new max length with base64-encoded hash, no size
'nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn-vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv-hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh'
>>> 32 * 'n' + '-' + 32 * 'v' + '-' + ceil(((32+4) * 8) / 4) * 'h' # new max length with hex-encoded hash, size included
'nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn-vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv-hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh'
>>> 32 * 'n' + '-' + 32 * 'v' + '-' + ceil(((32+4) * 8) / 5) * 'h' # new max length with base32-encoded hash, size included
'nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn-vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv-hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh'
>>> 32 * 'n' + '-' + 32 * 'v' + '-' + ceil(((32+4) * 8) / 6) * 'h' # new max length with base64-encoded hash, size included
'nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn-vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv-hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh'
>>> "nasm-2.16.1-3-" + ceil(((32+4) * 8) / 4) * 'h' # real world example, hex-encoded hash, size included
'nasm-2.16.1-3-hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh'
>>> "nasm-2.16.1-3-" + ceil(((32+4) * 8) / 5) * 'h' # real world example, base32-encoded hash, size included
'nasm-2.16.1-3-hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh'
>>> "nasm-2.16.1-3-" + ceil(((32+4) * 8) / 6) * 'h' # real world example, base64-encoded hash, size included
'nasm-2.16.1-3-hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh'
>>> "nasm-2.16.1-3-" + ceil((192 + 32 + 16) / 4) * 'h' # real world example, hex-encoded hash, 192bit truncate, size and id included
'nasm-2.16.1-3-hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh'
>>> "nasm-2.16.1-3-" + ceil((192 + 32 + 16) / 5) * 'h' # real world example, base32-encoded hash, 192bit truncate, size and id included
'nasm-2.16.1-3-hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh'
>>> "nasm-2.16.1-3-" + ceil((192 + 32 + 16) / 6) * 'h' # real world example, base64-encoded hash, 192bit truncate, size and id included
'nasm-2.16.1-3-hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh'

Anything you can't see above without scrolling makes me real uncomfortable, even though technically all of these will easily fit in the 255 limit for path components.
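
For reference, the same comparison expressed as numbers rather than strings. This sketch assumes the worst case of a 32-byte name and a 32-byte version for every row (the transcript above uses the shorter real-world `nasm` prefix for the 240-bit rows), and the helper name is hypothetical.

```python
from math import ceil

MAX_NAME = 32
MAX_VERSION = 32
PATH_COMPONENT_LIMIT = 255
BITS_PER_CHAR = {"hex": 4, "base32": 5, "base64": 6}


def max_length(payload_bits: int, encoding: str) -> int:
    """Longest possible `$name-$semver-$hash` string for a given hash payload."""
    hash_chars = ceil(payload_bits / BITS_PER_CHAR[encoding])
    return MAX_NAME + 1 + MAX_VERSION + 1 + hash_chars


# 256 = full SHA-256, 288 = SHA-256 plus 32-bit size, 240 = 192 + 32 + 16 bits.
lengths = {
    (bits, enc): max_length(bits, enc)
    for bits in (256, 288, 240)
    for enc in BITS_PER_CHAR
}
for (bits, enc), n in lengths.items():
    print(f"{bits:3d} bits, {enc:6s}: {n} chars")
```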

andrewrk · 2025-02-24T22:53:21Z

I'm going to push back against the anti-dash argument.

these are hashes, they do not need to be parsed
even so, they can always be correctly parsed by parsing the name first, then the hash which has a fixed size, then the semver last.
tilde is a pain in the ass to type on some keyboards. dash is always easy for everybody.
$name-$version is a commonly seen combination in software development. users will experience cognitive dissonance if this pattern is broken.
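
The parsing order in the second point can be sketched in Python as follows. The function name and the `hash_len` parameter are assumptions for illustration; the sketch relies only on the name being a bare identifier (so it contains no `-`) and on the hash having a known fixed length.

```python
def split_package_hash(package_hash: str, hash_len: int) -> tuple[str, str, str]:
    """Split `$name-$semver-$hash` into its parts: name first, fixed-size hash next, semver last.

    `hash_len` is the fixed number of characters in the hash component;
    its concrete value depends on the format revision and is an input here.
    """
    # A bare identifier cannot contain '-', so the first dash ends the name.
    name, sep, rest = package_hash.partition("-")
    if not sep or len(rest) <= hash_len + 1 or rest[-hash_len - 1] != "-":
        raise ValueError(f"not a named package hash: {package_hash!r}")
    hash_part = rest[-hash_len:]     # fixed size, may itself contain '-' or '_'
    semver = rest[: -hash_len - 1]   # whatever remains, may contain '-'
    return name, semver, hash_part


parts = split_package_hash("nasm-2.16.1-3-AAAA-BBB_CCC", hash_len=12)
print(parts)
```

Because the semver is taken last, dashes inside either the version or the base64 hash do not cause ambiguity.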

castholm · 2025-02-25T02:24:42Z

I feel like I tried this once but unfortunately I don't remember the results of my investigation. It looks like you can omit OBJ_CASE_INSENSITIVE when dealing with files on Windows.

On Windows 10 version 1803 and later (April 2018) you can use NtSetInformationFile to specify FileCaseSensitiveInformation and mark a directory as case-sensitive. I have no idea how well this would work in practice, and seeing as the difference between base64 and base32 is only ~10 characters it's probably not worth the hassle. There's also the caveat that there are still some LTSC versions from before 1803 that are officially supported through 2026.

andrewrk · 2025-02-25T05:21:08Z

If you're participating in this discussion, please read the commit messages of the newly pushed commits, particularly the one that introduces id.

Here are all the discussion points thus far:

error instead of truncate to 32 bytes for name and version
- implemented suggestion
suggestion to use tilde or other separator rather than dash
- argument against: implement new package hash format: $name-$semver-$hash #22994 (comment)
case insensitivity on windows
- not addressed yet. I plan to experiment with OBJ_CASE_INSENSITIVE or maybe that FileCaseSensitiveInformation thing.
weakness of truncating SHA-256 to 40 bits
- bumped to 192 bits
- point of contention: it's still truncated
base32 suggestion
- argument against: based on implement new package hash format: $name-$semver-$hash #22994 (comment) I think these paths are uncomfortably long for showing up in compile errors

alexrp · 2025-02-25T05:58:00Z

doc/build.zig.zon.md

+### `id`
+
+Together with name, this represents a globally unique package identifier. This
+field should be initialized with a 16-bit random number when the package is


I think "should" is too vague when other fields have a definitive "required" or "optional".

Elsewhere, the words "must" and "required" indicate that an error will occur. Here, I think the word "should" is appropriate because there is no error for not doing the thing; instead there are social consequences.

prior art https://datatracker.ietf.org/doc/html/rfc2119

andrewrk · 2025-02-25T06:08:47Z

OBJ_CASE_INSENSITIVE

Does nothing. On Windows 10.0.19045, I tested creating and opening files via NtCreateFile without OBJ_CASE_INSENSITIVE enabled. In both cases, the file system operated in a case-insensitive manner. I observed the NtCreateFile call in procmon.exe and noted that the string used inside the application (i.e. "hello" vs "HELLO") was not reflected in the path reported by procmon.exe. procmon.exe always reported a path string matching what was already on disk, not what was sent to the NtCreateFile call, unless there was no file on disk, in which case the string would match the data from the application.

I did not experiment with FileCaseSensitiveInformation yet.

castholm · 2025-02-25T06:44:08Z

src/Package/Fetch.zig

        .allow_missing_paths_field = f.allow_missing_paths_field,
+        .allow_missing_id = f.allow_missing_paths_field,
+        .allow_name_string = f.allow_missing_paths_field,


I tried building the branch and running zig fetch to re-fetch and re-hash packages and it failed with an "expected enum literal" error.

cmdFetch sets .allow_missing_paths_field = false, so if the intention is for users to be able to fetch legacy packages until they are wholly deprecated, you might want to break this out into a separate field.

Good find; fixed in a commit to be pushed shortly.

castholm · 2025-02-25T06:46:24Z

src/main.zig

+        else => '_',
+    });
+    for (bytes[1..]) |byte| switch (byte) {
+        '_', 'a'...'z', 'A'...'Z', '0'...'9' => try result.append(arena, byte),


Suggested change

-        '_', 'a'...'z', 'A'...'Z', '0'...'9' => try result.append(arena, byte),
+        '_', 'a'...'z', 'A'...'Z', '0'...'9' => try result.append(arena, byte),
+        '-', '.' => '_',

- and . are somewhat common in repo/project names, so substituting them with _ instead of ignoring them probably gets you closer to what the user actually wants.
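
A Python sketch of the behaviour this suggestion is after. It is not the function from `src/main.zig`: the name `sanitize_name_sketch`, the handling of the first character, and the truncation to 32 bytes are assumptions pieced together from the hunk and the rules discussed in this PR, and keyword checks are left out.

```python
import string

_ID_START = set(string.ascii_letters + "_")
_ID_CONTINUE = _ID_START | set(string.digits)


def sanitize_name_sketch(raw: str, max_len: int = 32) -> str:
    """Turn a directory or repo name into something shaped like a bare identifier.

    Empty input yields an empty string, which a caller would need to reject.
    """
    out = []
    for i, ch in enumerate(raw):
        if i == 0:
            # An identifier cannot start with a digit or punctuation.
            out.append(ch if ch in _ID_START else "_")
        elif ch in _ID_CONTINUE:
            out.append(ch)
        elif ch in "-.":
            out.append("_")  # the substitution proposed in this comment
        # Any other character is dropped.
    return "".join(out)[:max_len]


print(sanitize_name_sketch("my-cool.pkg"))
print(sanitize_name_sketch("9lives!"))
```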

castholm · 2025-02-25T06:49:56Z

src/Package.zig

+    /// * name is the name field from build.zig.zon, truncated at 32 bytes and must
+    ///   be a valid zig identifier
+    /// * semver is the version field from build.zig.zon, truncated at 32 bytes


Suggested change

-    /// * name is the name field from build.zig.zon, truncated at 32 bytes and must
-    ///   be a valid zig identifier
-    /// * semver is the version field from build.zig.zon, truncated at 32 bytes
+    /// * name is the name field from build.zig.zon, limited to 32 bytes and must
+    ///   be a valid zig identifier
+    /// * semver is the version field from build.zig.zon, limited to 32 bytes

Thanks, I'm changing it to use the words "assert" and "assume" in accordance with the doc comment guidance

castholm · 2025-02-25T07:23:32Z

Regarding id, it might be worth revisiting @mlugg's comments in #20183 (comment) about unintentional id duplication.

Inevitably, inexperienced users are going to copy/paste ids from references or tutorials, or have LLMs generate them, or hand-pick "cute" values like 0x0420 or 0xcafe, so there's a non-zero chance that conflicting package names and ids will happen in the wild even without any malicious intent from the authors. Baking a checksum into the id as suggested in the discussion in the linked issue helps a bit, e.g. a user following a tutorial will likely change the name field, which will trigger an error and force them to refresh the id.

ifreund · 2025-02-25T09:43:12Z

I'm going to push back against the anti-dash argument.

I've been thinking about this more and I'm OK with dashes now, I hereby retract the anti-dash argument, let's focus on the more important things.

One thing I think is important that I haven't seen brought up yet is forwards-compatibility/extensibility. I think we can spare a single character somewhere in the format to act as a "format version number." I'd argue there's a non-zero chance that we end up wanting to change something about this format in the future, and I'm sure we would be very happy to have a version number if/when that happens.

andrewrk · 2025-02-26T00:09:49Z

Regarding ID checksum: it took me a while to understand the workflow @mlugg is trying to protect against, since changing the name does create a new logical id (logical id is name + id).

The problematic workflow is:

Someone creates a template with build.zig.zon, id field included (note that zig init does not create this problem since it generates fresh id every time it runs).
User A uses the template, changing package name to "example" but not id field.
User B uses the same template, changing package name also to "example", also not changing the id field.

Now both packages have unintentional conflicting logical ids.

The checksum idea is reasonable, if I understand it correctly. It would only possibly cause a manifest validation error, while the actual ID bytes would still be a u16.

Given that this field of the manifest is technically not an id, but in fact:

one component of the tuple that composes the id
includes a checksum mixed in

I think it could potentially be named better. Maybe "nonce" actually works pretty well?

Example error message:

error: invalid nonce: 0xcafebabe
note: if this is a new or forked package, use this nonce: 0xda8b2180
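
To make the checksum idea concrete, here is a minimal Python sketch of a field that combines random bits with a checksum of the package name. The bit widths, the use of CRC32, the placement of the checksum in the high bits, and the function names are all assumptions for illustration; the thread does not specify them at this point.

```python
import secrets
import zlib

ID_BITS = 32                  # assumed width of the random component
ID_MASK = (1 << ID_BITS) - 1


def new_value(name: str) -> int:
    """Generate a fresh value: name checksum in the high bits, random id in the low bits."""
    while True:
        rand_id = secrets.randbits(ID_BITS)
        # All-zeros and all-ones are treated as reserved, mirroring the
        # 0x0000 / 0xffff rule described for the 16-bit id.
        if rand_id not in (0, ID_MASK):
            break
    return (zlib.crc32(name.encode("utf-8")) << ID_BITS) | rand_id


def is_valid(name: str, value: int) -> bool:
    """Check that the stored value was generated for this package name."""
    rand_id = value & ID_MASK
    checksum = value >> ID_BITS
    return rand_id not in (0, ID_MASK) and checksum == zlib.crc32(name.encode("utf-8"))


template_value = new_value("example")
print(is_valid("example", template_value))   # value generated for this name
print(is_valid("exampl3", template_value))   # name changed, value copied from a template
```

In the template workflow described above, renaming the package without regenerating the value makes validation fail, which is the point at which the tooling can suggest a fresh value.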

andrewrk · 2025-02-26T02:01:51Z

forwards-compatibility/extensibility. I think we can spare a single character somewhere in the format to act as a "format version number."

I thought about this, and concluded that the future format can add such character with equal utility. For instance, it could put a character outside the base64 charset at package_hash[package_hash.len - 43].

ifreund · 2025-02-26T09:38:39Z

I hate to open another bikeshed, but I don't think "nonce" is an accurate name. It has a very specific meaning in cryptography (wikipedia). The defining characteristic of a nonce is that it is ephemeral and can only be used once.

Since this package identifier is persistent and used repeatedly to identify the package, it is not a nonce.

I personally don't see any issue with using "id" here, it's definitely more accurate than nonce. If you want to convey the fact that it is a unique id in the name, perhaps "uid" or similar would make sense.

Note that the existing and widely used UUID format is named universally unique identifier not universally unique nonce.

andrewrk · 2025-02-26T10:41:45Z

I agree it's important to consider seriously these names that would require breakage to change.

I agree that a cryptographic nonce has a very specific meaning, and that it has to do with being ephemeral and only used once. Regardless of any arguments I make here, I recognize that even just the fact that it seemed wrong to you is a fault against the name. However, I think there is actually a solid argument to make that the word is appropriate.

a nonce is an arbitrary number that can be used just once in a cryptographic communication

If you consider the "communication" to be the entire lifetime of the package, then it's a good fit. But I get that "communication" is typically measured in seconds, minutes, or days at most. Furthermore, it's partially composed of a checksum of the name. I'm afraid this is a new concept that does not have a premade nice name for us.

In this case it might be justified to create one. Cryptography had to create the word "nonce", "salt", "hash", etc.

The reason I don't want to use "id" - although I'm not strictly vetoing the name - is that there is already the concept of a package id. It's whatever this thing is combined with name. It's what I called a "logical id" earlier. However, the Zig programmer does not specify package id directly; they specify the name component, and then rely on tooling to autogenerate the random bits component. It would be nice when talking about package ids to not have to disambiguate between id field and logical package id.

In conclusion, I'm fully open to name counter-proposals, however, I also think that, while not perfect, "nonce" is defensible.

Edit: some names that my graphics card came up with:

token
entropy
fingerprint
salt (I mentioned this above; not technically in agreement with wikipedia one-sentence summary of "random data fed as an additional input to a one-way function that hashes data, a password or passphrase")

Final edit:

I still stand by nonce as a viable name. However, I offer this "new concept" name: lift

It's short for "liftoff" which is short for "takeoff every zig", and also matches @kristoff-it's book title ("Zig Liftoff").

The name indicates that it is appropriate when "launching". I.e. new project, or a fork.

This branch regressed from master by switching to binary rather than hex digest, allowing null bytes to end up in identifiers in the zig file. This commit fixes it by changing the "hash" to be literally equal to the sub_path (with a prefix '/' to indicate "global") if it can fit. If it is too long then it is actually hashed, and that value used instead.

Introduces the `id` field to `build.zig.zon`.

Together with name, this represents a globally unique package identifier. This field should be initialized with a 16-bit random number when the package is first created, and then *never change*. This allows Zig to unambiguously detect when one package is an updated version of another.

When forking a Zig project, this id should be regenerated with a new random number if the upstream project is still maintained. Otherwise, the fork is *hostile*, attempting to take control over the original project's identity.

`0x0000` is invalid because it obviously means a random number wasn't used.

`0xffff` is reserved to represent "naked" packages.

Tracking issue #14288

Additionally:

* Fix bad path in error messages regarding build.zig.zon file.
* Manifest validates that `name` and `version` field of build.zig.zon are maximum 32 bytes.
* Introduce error for root package to not switch to enum literal for name.
* Introduce error for root package to omit `id`.
* Update init template to generate `id`
* Update init template to populate `minimum_zig_version`.
* New package hash format changes:
  - name and version limited to 32 bytes via error rather than truncation
  - truncate sha256 to 192 bits rather than 40 bits
  - include the package id

This means that, given only the package hashes for a complete dependency tree, it is possible to perform version selection and know the final size on disk, without doing any fetching whatsoever. This prevents wasted bandwidth since package versions not selected do not need to be fetched.

mainly this addresses the following use case:

1. Someone creates a template with build.zig.zon, id field included (note that zig init does not create this problem since it generates fresh id every time it runs).
2. User A uses the template, changing package name to "example" but not id field.
3. User B uses the same template, changing package name also to "example", also not changing the id field.

Here, both packages have unintentional conflicting logical ids. By making the field a combination of name checksum + random id, this accident is avoided.

"nonce" is an OK name for this.

Also relaxes errors on remote packages when using `zig fetch`.

rohlem · 2025-02-26T12:00:51Z

I quite like salt, entropy, and fingerprint.
I personally think lift is the option that is the least self-explanatory, and will require the most people / everyone to look up and remember its meaning, which in my eyes is a slight negative.
There's also the option of a combination, like lift-entropy, but if id = name + x already, then "package's id salt" already seems descriptive enough.

ifreund · 2025-02-26T12:18:16Z

I'm quite happy with fingerprint, thanks for tolerating my bikeshed with an open mind :)

jedisct1 · 2025-02-26T12:56:20Z

I like fingerprint as well.

andrewrk force-pushed the newhash branch from e60edd9 to f1f71b7 Compare February 24, 2025 01:24

andrewrk added breaking Implementing this issue could cause existing code to no longer compile or have different behavior. release notes This PR should be mentioned in the release notes. labels Feb 24, 2025

alexrp added this to the 0.14.0 milestone Feb 24, 2025

Rexicon226 reviewed Feb 24, 2025

andrewrk changed the title ~~new package hash format~~ implement new package hash format: $name-$semver-$hash Feb 24, 2025

ifreund reviewed Feb 24, 2025

linusg reviewed Feb 24, 2025

castholm reviewed Feb 24, 2025

andrewrk mentioned this pull request Feb 25, 2025

zig init template: add compiler version to build.zig.zon by default #22698

Closed

andrewrk force-pushed the newhash branch from 0cf39b0 to 49b95b7 Compare February 25, 2025 05:07

alexrp reviewed Feb 25, 2025

castholm reviewed Feb 25, 2025

andrewrk force-pushed the newhash branch from 49b95b7 to 0ec03c0 Compare February 26, 2025 01:59

andrewrk added 14 commits February 26, 2025 03:24

std.ArrayList: delete unit test

271bc2c

tests should use the API, not only verify compilation succeeds.

Package: new hash format

1849f7a

legacy format is also supported. closes #20178

require package names to be valid zig identifiers

a1c0bd8

Package.Manifest: enforce name limit of 32

ecf35c5

Package.Manifest: enforce maximum version string length of 32

ab19033

zig init: sanitize generated name

37fc74a

Adhere to the new rules: 32 byte limit + must be a valid bare zig identifier

update zig's own manifest file to conform to new rules

d33cd4b

Package: update unit tests to new API

79661dd

CLI: add unit test and improve sanitizeExampleName

8692658

bump package id component to 32 bits

6a06e64

and to make the base64 round even, bump sha256 to 200 bits (up from 192)

zig init: adjust template lang to allow zig fmt passthrough

f9564b1

andrewrk force-pushed the newhash branch from 5150b22 to 92b1bb5 Compare February 26, 2025 12:01

rename "nonce" to "fingerprint"

018cf0a

andrewrk force-pushed the newhash branch from 92b1bb5 to 018cf0a Compare February 26, 2025 12:01
