@Tronic Tronic commented Jun 25, 2025

Refactor content type handling to automatically append charset=UTF-8 to text/* MIME types when serving static files and file responses.

  • Add new guess_content_type() utility function that wraps mimetypes.guess_type()
  • Automatically append '; charset=UTF-8' to text content types
  • Replace direct mimetypes.guess_type() usage with new utility
  • Update static file serving and file() response functions to use new utility
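For illustration, the utility described in the list above could look roughly like the sketch below. This is a minimal sketch, not the code from the PR: the `fallback` parameter, the return type, and the exact charset check are assumptions made for the example. Only the function name `guess_content_type()` and the use of `mimetypes.guess_type()` come from the description.

```python
import mimetypes
from typing import Optional


def guess_content_type(path: str, fallback: Optional[str] = None) -> Optional[str]:
    """Guess a Content-Type for ``path`` and add a charset to text/* types.

    Hypothetical signature: the ``fallback`` argument is an assumption,
    used here so callers can choose their own default type.
    """
    # mimetypes.guess_type() returns a (type, encoding) tuple; the type is
    # None when the extension is unknown.
    content_type, _encoding = mimetypes.guess_type(path)
    if content_type is None:
        content_type = fallback
    # Append the charset only to text types that do not already carry one.
    if (
        content_type is not None
        and content_type.startswith("text/")
        and "charset" not in content_type.lower()
    ):
        content_type += "; charset=UTF-8"
    return content_type
```

With this sketch, `guess_content_type("index.html")` yields `text/html; charset=UTF-8`, while a binary type such as `image/png` is returned unchanged.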

Fix #2987

Tronic added 2 commits June 25, 2025 14:45
@Tronic Tronic requested a review from a team as a code owner June 25, 2025 20:51

Tronic commented Jun 25, 2025

Looks like the tests are (still) broken. I cannot run them on my own box either without plenty of (unrelated) failures. Further work may be required on this PR, particularly for any existing tests that the change may have broken.

Tronic commented Jun 26, 2025

I manually ran only the relevant tests, because so many things fail with the Sanic test server reporting "address already in use" that the full test results are unmanageable. Fixed a few issues and optimized the tests to run a bit faster by avoiding overlap.

I note that file() should probably fall back to the HTTP default (i.e. application/octet-stream) rather than text/plain, as it was hardcoded to do. This matters in particular when data files in some arbitrary format are sent (e.g. test.dat would be served as plain text), although one can see the reasoning for sending a bare README or similar (legacy) files as text, so that they are displayed in browsers rather than downloaded. This, however, conflicts with how Sanic behaves in the static file handler.

To maintain compatibility with prior versions, this PR does not change the file() fallback type, other than by appending the charset to it as well. app.static(), just as before, falls back to the HTTP default, which has no charset.
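The difference between the two fallbacks can be sketched as follows. This is a standalone illustration rather than Sanic's code: `resolve_content_type()` and the two constants are hypothetical names, and the file name is chosen so that `mimetypes` cannot recognise its extension.

```python
import mimetypes

# Fallback types as described in the comment above (constant names are hypothetical).
FILE_FALLBACK = "text/plain"  # what file() was hardcoded to use
STATIC_FALLBACK = "application/octet-stream"  # HTTP default, used by app.static()


def resolve_content_type(path: str, fallback: str) -> str:
    """Guess the type for ``path``, else use ``fallback``; add charset to text/*."""
    guessed, _encoding = mimetypes.guess_type(path)
    content_type = guessed or fallback
    if content_type.startswith("text/"):
        content_type += "; charset=UTF-8"
    return content_type


# An extension that mimetypes does not know, so the fallback is used.
name = "data.unknownext"
file_type = resolve_content_type(name, FILE_FALLBACK)
static_type = resolve_content_type(name, STATIC_FALLBACK)
print(file_type)    # text/plain; charset=UTF-8
print(static_type)  # application/octet-stream
```

The same unknown file therefore gets a charset only on the file() path, because only its fallback is a text type.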

…() interface as it is related. Removed duplicate tests, clarified naming.

Development

Successfully merging this pull request may close these issues.

Content type on static files lacks charset=UTF-8
