<locale>: std::collate<_Elem>::do_transform() should behave appropriately when _LStrxfrm() fails #5210
Description
This is mainly an annoyance I noticed while preparing #5209, but there are two related problems here:
std::collate<char>
When _Strxfrm() fails, it returns -1 (SIZE_MAX) as an error code. std::collate<char>::do_transform() passes this return value to basic_string<char>::resize(), which throws a length_error("string too long").
This is misleading, since the problem isn't that the result can't be represented in memory, but that there is no result at all because no sort key could be generated by _Strxfrm(). We should check whether the return value of _Strxfrm() indicates an error and then handle this appropriately (either by throwing an appropriate exception or returning a substitute key).
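A minimal sketch of the check described above, assuming a resize-and-retry loop around a strxfrm-like call. The names fake_xfrm, kXfrmError, and transform_checked are hypothetical stand-ins and are not part of the STL, and std::runtime_error is only one possible choice of exception:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical error sentinel, mirroring the -1 (SIZE_MAX) described above.
constexpr std::size_t kXfrmError = static_cast<std::size_t>(-1);

// Hypothetical stand-in for a strxfrm-like routine: returns the required key
// length, or kXfrmError on failure. It writes the key only if dst is big enough.
std::size_t fake_xfrm(char* dst, std::size_t dst_size, const std::string& src, bool fail) {
    if (fail) {
        return kXfrmError;
    }
    if (src.size() <= dst_size) {
        for (std::size_t i = 0; i < src.size(); ++i) {
            dst[i] = src[i]; // identity "sort key", for illustration only
        }
    }
    return src.size();
}

// Resize-and-retry loop that inspects the return value before resizing.
std::string transform_checked(const std::string& src, bool fail) {
    std::string key;
    std::size_t count = src.size();
    while (count > 0) {
        key.resize(count);
        count = fake_xfrm(&key[0], key.size(), src, fail);
        if (count == kXfrmError) {
            // Report the real problem instead of letting resize() throw length_error.
            throw std::runtime_error("sort key could not be generated");
        }
        if (count <= key.size()) {
            break; // the key fit into the buffer
        }
    }
    assert(count <= key.size());
    key.resize(count);
    return key;
}
```

With this shape, a failing call surfaces as an error about key generation rather than as "string too long". Returning a substitute key instead of throwing would only change the branch that handles kXfrmError.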
std::collate<wchar_t>
When _Wcsxfrm() fails (due to a failure in LCMapStringW), it returns INT_MAX as an error code. std::collate<wchar_t>::do_transform() passes this return value to basic_string<wchar_t>::resize(), which is likely to succeed on x64. If it does, do_transform() calls _Wcsxfrm() again, which returns INT_MAX once more, but because the string size now equals INT_MAX, this is treated as success and the contents of the string are returned. However, the contents of that string are unspecified and could be garbage.
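The failure mode above can be reproduced in a simplified model. This is a sketch under stated assumptions, not the STL's actual code: failing_xfrm and transform_unchecked are hypothetical, and kFakeIntMax is a deliberately small stand-in for INT_MAX so that the demo does not allocate gigabytes:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical small stand-in for INT_MAX (the real value would need a huge buffer).
constexpr std::size_t kFakeIntMax = 64;

// Hypothetical stand-in for a failing wide transform: it never writes to dst
// and always reports kFakeIntMax as its "error code".
std::size_t failing_xfrm(wchar_t* /*dst*/, std::size_t /*dst_size*/, int& calls) {
    ++calls;
    return kFakeIntMax;
}

// Simplified resize-and-retry loop that only compares the return value
// against the buffer size and never checks for an error code.
std::wstring transform_unchecked(std::size_t input_len, int& calls) {
    std::wstring key;
    std::size_t count = input_len;
    while (count > 0) {
        key.resize(count);
        count = failing_xfrm(&key[0], key.size(), calls);
        if (count <= key.size()) {
            break; // an error code equal to the buffer size looks like success
        }
    }
    assert(count <= key.size());
    key.resize(count);
    return key;
}
```

In this model the first call makes the loop grow the buffer to kFakeIntMax, and the second call is then accepted even though no key was ever written. Comparing the return value against the error code before resizing would break the cycle.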
Additional remarks
It seems that _Wcsxfrm() uses two error codes: SIZE_MAX when allocation fails and INT_MAX when LCMapStringW fails (e.g., because of encoding issues). I doubt that this is intentional; it should probably always have returned SIZE_MAX.
While _Strxfrm() always and _Wcsxfrm() sometimes return -1 (SIZE_MAX) on error, the comments above both functions claim that they return INT_MAX to designate failure:
Line 53 in 9082000