Enhancement
FileResource.read() doesn't specify an encoding when calling read_text(), so it falls back to the system locale. On Windows (where the default is often cp1252), this causes a UnicodeDecodeError when reading UTF-8 files that contain non-ASCII characters like smart quotes or em dashes.
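A minimal sketch of the kind of change this asks for, assuming read() ultimately calls pathlib's Path.read_text(). The helper name read_text_utf8 and the sample file are hypothetical, used only for illustration; this is not the library's actual code.

```python
import tempfile
from pathlib import Path


def read_text_utf8(path: Path) -> str:
    # Hypothetical helper: pass the encoding explicitly so the result
    # does not depend on the system locale (e.g. cp1252 on Windows).
    return path.read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "notes.md"
    # Write a UTF-8 file containing smart quotes and an em dash.
    sample.write_text("She said \u201chello\u201d \u2014 twice.\n", encoding="utf-8")
    content = read_text_utf8(sample)
```

With the encoding given explicitly, the same file reads identically on any platform, regardless of locale.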
Use case: Registering a UTF-8 markdown file as a FileResource with mime_type="text/markdown" on a Windows machine with a cp1252 locale.
Expected: The file is read successfully.
Actual:
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 73832:
character maps to <undefined>
The byte 0x9d is part of the UTF-8 sequence for U+201D (RIGHT DOUBLE QUOTATION MARK). The file is valid UTF-8, but the system codec can't decode it.
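The failure can be reproduced at the byte level on any platform, without a Windows machine. The sketch below uses only Python's built-in codecs; the variable names are illustrative.

```python
# U+201D (RIGHT DOUBLE QUOTATION MARK) encodes to three bytes in UTF-8.
data = "\u201d".encode("utf-8")
assert data == b"\xe2\x80\x9d"

try:
    # cp1252 has no mapping for byte 0x9d, so decoding fails there.
    data.decode("cp1252")
    failed = False
except UnicodeDecodeError as exc:
    failed = True
    bad_byte = data[exc.start]

# Decoding the same bytes as UTF-8 succeeds.
decoded = data.decode("utf-8")
```

The first two bytes (0xe2 and 0x80) do have cp1252 mappings, which is why the error is reported on 0x9d and not at the start of the sequence.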