Incorrect usage of mb_substr breaks from PHP 8.3.2 on, resulting in e.g. corrupted invoice PDF files #522
Description
In CurlHttpClient.php::setIntuitResponse(), mb_substr is used to split the response into headers and body. This is an INCORRECT usage of mb_substr that happened to work before PHP 8.3.2, but fails from 8.3.2 on due to a change in the behavior of mb_substr.
public function setIntuitResponse($response){
    $headerSize = $this->basecURL->getInfo(CURLINFO_HEADER_SIZE);
    $rawHeaders = mb_substr($response, 0, $headerSize);
    $rawBody = mb_substr($response, $headerSize);
    $httpStatusCode = $this->basecURL->getInfo(CURLINFO_HTTP_CODE);
    $theIntuitResponse = new IntuitResponse($rawHeaders, $rawBody, $httpStatusCode, true);
    $this->intuitResponse = $theIntuitResponse;
}
The problem: $response is NOT necessarily a valid UTF-8 string. It is raw binary data that can contain byte sequences that look like multi-byte characters but are invalid UTF-8. This is the case, for example, when fetching an invoice PDF. In addition, CURLINFO_HEADER_SIZE is a size in bytes, whereas mb_substr takes its offsets in characters.
The old behavior of mb_substr would fall back to using substr if invalid characters were encountered, preserving the raw binary data, so the code worked only by coincidence. From PHP 8.3.2 on, however, mb_substr is stricter: it returns "?" for invalid UTF-8 characters, which corrupts the binary data and therefore the PDF file.
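The byte-versus-character mismatch can be shown on its own, independent of the invalid-byte handling. The following is a minimal sketch, assuming the mbstring extension is loaded; the header and body values are made up for illustration and are not taken from a real Intuit response. It uses a header block that contains one two-byte UTF-8 character, so its byte length and character length differ by one.

```php
<?php
// Illustrative values only: a header block containing "é" (2 bytes in UTF-8).
$headers  = "X-Name: \xC3\xA9\r\n\r\n";
$response = $headers . "BODY";

// cURL reports the header size in bytes, which is what strlen() measures.
$headerSize = strlen($headers);

// substr() works on bytes, so the byte offset lands exactly at the body.
$byteBody = substr($response, $headerSize);

// mb_substr() works on characters, so the same offset overshoots by one.
$charBody = mb_substr($response, $headerSize, null, 'UTF-8');

var_dump($headerSize);                  // int(14)
var_dump(mb_strlen($headers, 'UTF-8')); // int(13)
var_dump($byteBody);                    // string(4) "BODY"
var_dump($charBody);                    // string(3) "ODY"
```

Even with fully valid UTF-8 input, a byte offset passed to mb_substr can therefore cut at the wrong position.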
The solution is simple: the correct function to use is substr instead of mb_substr, because substr operates on bytes and leaves the binary data untouched.
    $rawHeaders = substr($response, 0, $headerSize);
    $rawBody = substr($response, $headerSize);
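As a quick sanity check of the substr-based split, here is a small standalone sketch. The helper name splitRawResponse is hypothetical and not part of the SDK, and the response is hand-built rather than fetched through cURL. The body contains bytes that are not valid UTF-8, similar to the binary marker near the start of many PDF files.

```php
<?php
// Hypothetical helper (not part of the SDK): split a raw HTTP response
// into header block and body using a byte offset.
function splitRawResponse(string $response, int $headerSize): array
{
    return [
        substr($response, 0, $headerSize), // first $headerSize bytes
        substr($response, $headerSize),    // remaining bytes, unmodified
    ];
}

// Hand-built example response; the body is deliberately not valid UTF-8.
$headers = "HTTP/1.1 200 OK\r\nContent-Type: application/pdf\r\n\r\n";
$body    = "%PDF-1.4\n%\xE2\xE3\xCF\xD3\n";

[$rawHeaders, $rawBody] = splitRawResponse($headers . $body, strlen($headers));

var_dump($rawHeaders === $headers); // bool(true)
var_dump($rawBody === $body);       // bool(true)
```

Because substr never interprets the bytes as characters, the body comes back byte for byte identical regardless of its content.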
The change in mb_substr behavior as of PHP 8.3.2 is discussed in this thread: php/php-src#14703. Making this change in my own local copy of the SDK fixed the corrupted (blank white) PDFs I was receiving.