enforce import/embed case sensitivity #23163

marler8997 · 2025-03-09T18:25:35Z

squeek502 · 2025-03-09T20:02:04Z

Prior art: #11655

lib/std/fs/path.zig

squeek502 · 2025-03-10T03:06:04Z

src/Zcu/PerThread.zig

+        if (!std.ascii.eqlIgnoreCase(import_component.name, real_path_component.name)) {
+            std.debug.panic("real path '{s}' does not end with import path '{s}'", .{ real_path, import_string });
+        }


This will incorrectly fail for non-ASCII paths that have different casing (i.e. кириллица vs КИРИЛЛИЦА). You can use std.os.windows.eqlIgnoreCaseWtf8 instead if you want to keep the check (it can technically be used on any system, but pulls in extra uppercasing data on non-Windows systems), but not sure if that'd be foolproof either since that case-insensitivity is Windows-specific.

Ok I've added an eqlIgnoreCase function that will forward to std.os.windows.eqlIgnoreCaseWtf8 for windows, otherwise, it uses a utf8 iterator to compare paths by codepoint. It only considers ascii as far as upper/lowercase equivalent....I didn't find a non-windows specific way check two codepoints for equivalence ignoring case outside of using ascii.

I'm curious what happens on linux if you mount a case insensitive filesystem like vfat/NTFS....does it use the same upper/lower case equivalence as windows or is it different?

Honestly, I'd suggest avoiding this entire comparison. As mentioned, it'd have to assume things about the details of the case-insensitivity to avoid false positives, which doesn't seem like a good idea.

I see your point. However, this check helps ensure our case sensitivity check doesn't have "false negatives". If there's a problem with what we're doing, this check will make sure the compiler crashes rather than have a false negative check (compiler doesn't detect the case mismatch and think everything is OK). I'd prefer the compiler to crash in this case so we can fix it rather than allow the bug to do nothing.

With this check in place we ensure that eventually the implementation will become robust, without it it's much less likely.

I believe this check will also panic when using @import/@embedFile on a symlink, e.g. this currently successfully imports link.zig which is a symlink to target.zig:

$ cat test.zig comptime { _ = @import("link.zig"); } ` $ ls -la lrwxrwxrwx 1 ryan ryan 10 Mar 11 16:37 link.zig -> target.zig -rw-rw-r-- 1 ryan ryan 19 Mar 11 16:38 target.zig -rw-rw-r-- 1 ryan ryan 127 Mar 11 16:38 test.zig $ cat target.zig comptime { @compileLog("foo"); } $ zig build-obj test.zig link.zig:2:5: error: found compile log statement @compileLog("foo"); ^~~~~~~~~~~~~~~~~~

But calling realpath on link.zig will return the path to target.zig, which will trip this panic.

(I'm unable to find a relevant issue about the intended behavior with regards to @import and following symlinks, so I'm not sure if the current behavior is intentional)

(note also that this even works if target.zig is outside the main package path, e.g. if link.zig is a symlink to ../target.zig; unsure if that's intended, too, but that would be a separate issue)

Here's an idea for a potential alternate implementation that should avoid symlink-related complications:

Call openDir(dirname(cur_sub_path) orelse ".", .{ .iterate = true })

Iterate the directory, checking each Entry.name for strict equality to the basename of cur_sub_path

If nothing matched, then this is a case mismatch error for the current basename component (but note that it technically also could be a TOCTOU error if a file with the canonical casing did exist but was deleted while this check was occuring)

Chop off the last component from the cur_sub_path and continue the loop

Break after opening . and checking the leftmost component of the cur_sub_path

(could also go left-to-right through the path instead)

I'm sure this could be optimized for certain systems, e.g. on Windows NtQueryDirectoryFile could probably be used to avoid the directory iteration.

(I'll be able to properly explore this idea sometime after the next ~2 weeks FWIW)

Ah...for some reason I had it in my head that zig's realpath function didn't actually resolve symlinks but I see this is false. I'll think on this and your suggestion to come up with a solution.

Another option would be to open a fd to sub_path without following symlinks and then call std.os.getFdPath on it, but:

Getting a fd of a symlink is not supported on every platform (there's nothing in std.fs that supports it, but here's a test that does it for a few platforms)

std.os.getFdPath is not supported on every platform (and cannot be on some, e.g. WASI) and Andrew doesn't particularly like that function, see this comment

I don't love any of the solutions so far. Here's some more thoughts I've had to share in the meantime.

On windows this is easy to implement because it has APIs to get the original case of a file path without resolving symlinks, I'm guessing macos might have similar APIs we could leverage. Linux can have case insensitive filesystems, but, there don't seem to be any APIs available that allow you to retrieve the original case of a file path, other than an exhaustive directory listing/compare and who knows what upper/lowercase scheme it's using. It feels a bit like we're fighting against how the system is intended to be used.

Your idea to verify each part of an import subpath by listing each directory and ensuring there's an exact match should work but seems expensive. We could make it even more complicated by doing a check that verifies whether that directory is case sensitive first and if not we can skip the directory listing. This could be done by opening the file/directory, which we would do anyway, then, swapping one of the characters in that basename for their upper/lowercase counterpart, open that file/directory again and compare their inode number. If it's a match, which I expect would be rare, then we continue with the directory listing.

To reiterate not sure if any of this is worth it but leaving the comment for future reference. I think I might play around with a windows-only implementation (maybe macOS too) and see how that looks. I expect just supporting those 2 platforms would cover over 99% of real world cases and this might be one of those times where the tradeoff for that extra 1% isn't worth it.

closes ziglang#9786

squeek502 · 2025-03-11T20:07:21Z

src/Zcu/PerThread.zig

+    var a_utf8_it = std.unicode.Utf8View.initUnchecked(a).iterator();
+    var b_utf8_it = std.unicode.Utf8View.initUnchecked(b).iterator();
+    while (true) {
+        const a_cp = a_utf8_it.nextCodepoint() orelse break;
+        const b_cp = b_utf8_it.nextCodepoint() orelse return false;
+        if (a_cp != b_cp) {
+            const a_upper = std.ascii.toUpper(std.math.cast(u8, a_cp) orelse return false);
+            const b_upper = std.ascii.toUpper(std.math.cast(u8, b_cp) orelse return false);
+            if (a_upper != b_upper)
+                return false;
+        }
+    }
+    if (b_utf8_it.nextCodepoint() != null) return false;


Note that this codepoint iteration loop has no practical difference to std.ascii.eqlIgnoreCase (besides doing a lot of extra work), since UTF-8 multibyte codepoints will never contain bytes within the ASCII range. In other words, its always safe to iterate over bytes with UTF-8 if you only care about the ASCII range.

(but see #23163 (comment))

marler8997 · 2025-04-11T19:45:05Z

Closing for now, maybe I'll come back to this later

squeek502 reviewed Mar 9, 2025

View reviewed changes

lib/std/fs/path.zig Outdated Show resolved Hide resolved

squeek502 reviewed Mar 10, 2025

View reviewed changes

marler8997 force-pushed the caseSensitiveImports branch from d82af61 to 61f383e Compare March 11, 2025 00:23

enforce import/embed case sensitivity

d94b4bb

closes ziglang#9786

marler8997 force-pushed the caseSensitiveImports branch from 61f383e to d94b4bb Compare March 11, 2025 00:59

squeek502 reviewed Mar 11, 2025

View reviewed changes

marler8997 marked this pull request as draft March 12, 2025 15:37

marler8997 closed this Apr 11, 2025

marler8997 mentioned this pull request May 25, 2025

avoid being tricked that Build.zig is build.zig #23984

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

enforce import/embed case sensitivity #23163

enforce import/embed case sensitivity #23163

Uh oh!

marler8997 commented Mar 9, 2025

Uh oh!

squeek502 commented Mar 9, 2025

Uh oh!

Uh oh!

squeek502 Mar 10, 2025 •

edited

Loading

Uh oh!

marler8997 Mar 11, 2025 •

edited

Loading

Uh oh!

squeek502 Mar 11, 2025 •

edited

Loading

Uh oh!

marler8997 Mar 11, 2025 •

edited

Loading

Uh oh!

squeek502 Mar 11, 2025 •

edited

Loading

Uh oh!

squeek502 Mar 12, 2025 •

edited

Loading

Uh oh!

marler8997 Mar 12, 2025

Uh oh!

squeek502 Mar 12, 2025 •

edited

Loading

Uh oh!

marler8997 Mar 12, 2025 •

edited

Loading

Uh oh!

squeek502 Mar 11, 2025 •

edited

Loading

Uh oh!

marler8997 commented Apr 11, 2025

Uh oh!

Uh oh!

Uh oh!

enforce import/embed case sensitivity #23163

enforce import/embed case sensitivity #23163

Uh oh!

Conversation

marler8997 commented Mar 9, 2025

Uh oh!

squeek502 commented Mar 9, 2025

Uh oh!

Uh oh!

squeek502 Mar 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

marler8997 Mar 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

squeek502 Mar 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

marler8997 Mar 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

squeek502 Mar 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

squeek502 Mar 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

marler8997 Mar 12, 2025

Choose a reason for hiding this comment

Uh oh!

squeek502 Mar 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

marler8997 Mar 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

squeek502 Mar 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

marler8997 commented Apr 11, 2025

Uh oh!

Uh oh!

squeek502 Mar 10, 2025 •

edited

Loading

marler8997 Mar 11, 2025 •

edited

Loading

squeek502 Mar 11, 2025 •

edited

Loading

marler8997 Mar 11, 2025 •

edited

Loading

squeek502 Mar 11, 2025 •

edited

Loading

squeek502 Mar 12, 2025 •

edited

Loading

squeek502 Mar 12, 2025 •

edited

Loading

marler8997 Mar 12, 2025 •

edited

Loading

squeek502 Mar 11, 2025 •

edited

Loading