Description
Some code in IronRDP is incorrectly computing the size of UTF-16 strings.
Example:
// This is not the right way to compute the number of bytes for Unicode strings encoded in UTF-16.
// This is a time bomb: it returns the correct result sometimes (e.g., when the string is pure ASCII),
// but not always.
fn utf16_len(utf8_str: &str) -> usize {
    utf8_str.len() * 2
}
Both UTF-8 and UTF-16 are variable-length encodings, so a code point may be encoded using multiple code units. UTF-16 uses one or two 16-bit code units per code point, while UTF-8 uses between one and four 8-bit code units. A code point is therefore not always twice as big in UTF-16 as it is in UTF-8. For example, "é" (U+00E9) takes two bytes in both encodings, so the function above returns 4 where the correct answer is 2.
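To make the mismatch concrete, here is a minimal sketch comparing the naive estimate with the actual UTF-16 size for one code point from each UTF-8 length class. The function names (naive_utf16_len, actual_utf16_len, sample_sizes) are made up for illustration and are not part of IronRDP.

```rust
/// Naive estimate from the example above: assumes the UTF-16 size is
/// always twice the UTF-8 byte length.
fn naive_utf16_len(s: &str) -> usize {
    s.len() * 2
}

/// Actual UTF-16 size in bytes: number of 16-bit code units times 2.
fn actual_utf16_len(s: &str) -> usize {
    s.encode_utf16().count() * 2
}

/// Returns (naive, actual) byte counts for one sample per UTF-8 length class.
fn sample_sizes() -> Vec<(usize, usize)> {
    [
        "a",         // U+0061: 1 byte in UTF-8, 1 code unit in UTF-16
        "\u{e9}",    // U+00E9 (é): 2 bytes in UTF-8, 1 code unit in UTF-16
        "\u{20ac}",  // U+20AC (€): 3 bytes in UTF-8, 1 code unit in UTF-16
        "\u{1f600}", // U+1F600: 4 bytes in UTF-8, 2 code units (surrogate pair)
    ]
    .iter()
    .map(|s| (naive_utf16_len(s), actual_utf16_len(s)))
    .collect()
}
```

For these four samples the naive estimate is 2, 4, 6 and 8 bytes, while the actual sizes are 2, 2, 2 and 4 bytes. Only the ASCII case matches.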
This kind of erroneous code is present in multiple places. One such instance is ironrdp_pdu::rdp::client_info::string_len.
Instead, something like this should be used:
utf8_str.encode_utf16().count() * 2 // add 2 if we need to account for a null terminator (0x0000)
Refer to the ironrdp_pdu::pcb module for a correct implementation.
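As a rough sketch of what a shared helper could look like (this is not the ironrdp_pdu::pcb code, and the names utf16_byte_len and to_utf16_le_bytes are hypothetical), the length can be derived from the UTF-16 code unit count and checked against an actual encoding. Little-endian byte order is assumed here only so the check has something concrete to compare with.

```rust
/// Hypothetical helper (not part of IronRDP): size in bytes of `s` once
/// encoded as UTF-16, optionally counting a null terminator (0x0000).
fn utf16_byte_len(s: &str, with_null_terminator: bool) -> usize {
    let code_units = s.encode_utf16().count();
    let terminator = if with_null_terminator { 1 } else { 0 };
    (code_units + terminator) * 2
}

/// Hypothetical helper: encodes `s` as UTF-16 little-endian bytes so the
/// computed length can be compared with real output.
fn to_utf16_le_bytes(s: &str, with_null_terminator: bool) -> Vec<u8> {
    let mut bytes: Vec<u8> = s
        .encode_utf16()
        .flat_map(|unit| unit.to_le_bytes())
        .collect();
    if with_null_terminator {
        // Append the 16-bit null terminator.
        bytes.extend_from_slice(&[0, 0]);
    }
    bytes
}
```

Keeping the length computation next to the encoder makes it easy to assert that the two agree, which is the property the naive `len() * 2` version breaks for non-ASCII input.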