Description
- Odin Version: 2025-12
The function utf8_to_wstring_buf from core/sys/windows does not null-terminate its result. utf8_to_utf16 is affected as well, although in that case it might be intentional.
The _alloc variant is also affected, but this only matters when the allocator in use returns non-zeroed memory.
Expected Behavior
The resulting string written to the specified []u16 buffer should be null-terminated.
Whether this is expected could be up for debate, since I couldn't find anything authoritative. However, from what I could read between the lines when I briefly researched this, the utf8_to_wstring procedure group appears to be the intended way to pass Odin strings to the win32 API (LP[C]WSTR). It would be strange if this omission were intentional, since the vast majority of string-related functions in the win32 API expect a null-terminated string.
It could be that the function expects the buffer to be zeroed by the caller, but I have found no evidence that that was the intention (nor would I agree that that is a reasonable API contract), so I concluded that this is most likely an oversight.
Current Behavior
The string is not null-terminated after conversion: no explicit zero is written to the u16 element that follows the last converted character in the buffer. The result only happens to work if the buffer is zeroed before calling the function.
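As a stopgap on the caller's side, zeroing the buffer right before the call avoids the problem. The following is only a minimal sketch of that workaround, not a proposed fix. It assumes the buffer has at least one element more than the converted text needs, and the loop that fills the buffer with stale data is purely illustrative. `buf = {}` is an ordinary Odin zero-value assignment, not something specific to core/sys/windows.

```odin
package main

import win "core:sys/windows"

main :: proc() {
	buf: [20]u16

	// Simulate a reused buffer that still holds non-zero data.
	for i in 0 ..< len(buf) {
		buf[i] = u16(i + 0x20)
	}

	// Workaround: reset every element to zero before converting, so the
	// element after the converted text acts as the terminator.
	buf = {}

	s := "Hello"
	converted := win.utf8_to_wstring(buf[:], s)
	_ = converted // pass this on to a win32 call that expects a null-terminated string
}
```

This relies on the conversion leaving the elements past the converted text untouched, which is the behavior described above. It does not help if the text fills the whole buffer.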
Failure Information (for bugs)
This is because of how MultiByteToWideChar is used in this procedure group. When called with an explicit length argument, MultiByteToWideChar does not null-terminate the result (as documented).
In C, this usually doesn't happen because the most frequent use case is to pass in a null-terminated string and let the function infer the length by passing -1. In that case specifically, the function ensures that the result is null-terminated.
Steps to Reproduce
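package main

import win "core:sys/windows"

main :: proc() {
	buf: [20]u16
	// Fill the buffer with non-zero values so a missing terminator is visible.
	for i in 0 ..< len(buf) {
		buf[i] = auto_cast i + 0x20
	}
	s := "Hello"
	converted := win.utf8_to_wstring(buf[:], s)
}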
The above code results in a string that is not properly zero-terminated.
I could implement a fix if wanted, as soon as it's clear what the intended behavior of utf8_to_utf16 and utf8_to_wstring is.