fix(lambda-edge): classify ZIP-based document subtypes as binary by rokasta12 · Pull Request #4900 · honojs/hono

rokasta12 · 2026-05-03T11:09:42Z

Problem

`isContentTypeBinary` in `src/adapter/lambda-edge/handler.ts` used the regex `application/(.*json|.*xml).*`, which matches any subtype that contains the substring `json` or `xml` anywhere. That includes the OOXML ZIP archives, whose subtypes embed `xml` in `openxmlformats`. The EPUB and ODT subtypes contain neither substring, so the quoted fragment does not match them; they are listed because the regression test covers them as well:

| Content-Type | Actual format | Currently classified |
| --- | --- | --- |
| `application/vnd.openxmlformats-officedocument.wordprocessingml.document` (.docx) | ZIP | non-binary ❌ |
| `application/vnd.openxmlformats-officedocument.spreadsheetml.sheet` (.xlsx) | ZIP | non-binary ❌ |
| `application/vnd.openxmlformats-officedocument.presentationml.presentation` (.pptx) | ZIP | non-binary ❌ |
| `application/epub+zip` (.epub) | ZIP | binary (no `json` or `xml` substring) |
| `application/vnd.oasis.opendocument.text` (.odt) | ZIP | binary (no `json` or `xml` substring) |

When `isContentTypeBinary` returns `false`, the response body is returned to CloudFront as text rather than base64-encoded, so anyone serving the affected document types from a Lambda@Edge handler delivers a corrupted file.
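The over-match and its consequence can be reproduced in isolation. The sketch below uses only the regex fragment quoted above (the full pattern on `main` may contain more alternatives), and `matchedAsText` is a hypothetical helper, not a Hono export. The second half assumes the non-binary path reads the body as UTF-8 text, which is one plausible way the bytes get damaged.

```typescript
// Only the fragment quoted in this PR; the full pattern on main may differ.
const oldFragment = /application\/(.*json|.*xml).*/

// Hypothetical helper: true when the old fragment would treat the type as text.
const matchedAsText = (contentType: string): boolean => oldFragment.test(contentType)

const docx = 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'
console.log(matchedAsText(docx)) // true: "openxmlformats" contains "xml"
console.log(matchedAsText('application/json')) // true, as intended
console.log(matchedAsText('application/zip')) // false

// Illustration of the damage: ZIP-like bytes pushed through a UTF-8 text round trip.
// 0xff and 0xfe are not valid UTF-8, so each becomes U+FFFD (3 bytes when re-encoded).
const zipBytes = new Uint8Array([0x50, 0x4b, 0x03, 0x04, 0xff, 0xfe])
const roundTripped = new TextEncoder().encode(new TextDecoder().decode(zipBytes))
console.log(zipBytes.length, roundTripped.length) // 6 10
```

Because the replacement character is substituted for every invalid byte, the original bytes cannot be recovered on the client side.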

Fix

Replace the regex with the boundary-aware pattern already used by the aws-lambda adapter (`src/adapter/aws-lambda/handler.ts:675`):

`^text\/(?:plain|html|css|javascript|csv)|(?:\/|\+)(?:json|xml)\s*(?:;|$)`

The pattern treats a content type as text when it contains either:

`text/(plain|html|css|javascript|csv)` at the start
`/json`, `/xml`, `+json`, or `+xml` followed by optional whitespace and then `;` or end-of-string

So `application/atom+xml`, `application/ld+json`, and `image/svg+xml` continue to be treated as text, while OOXML/EPUB/ODT subtypes are correctly classified as binary.
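As a quick check of those rules, here is a minimal sketch that applies the quoted pattern to a handful of content types. `isBinary` is a hypothetical stand-in for `isContentTypeBinary` that assumes "binary" simply means "not matched as text"; the real function may differ in detail.

```typescript
// Pattern quoted in the PR description; it matches content types treated as text.
const textPattern = /^text\/(?:plain|html|css|javascript|csv)|(?:\/|\+)(?:json|xml)\s*(?:;|$)/

// Hypothetical stand-in for isContentTypeBinary: binary means "not matched as text".
const isBinary = (contentType: string): boolean => !textPattern.test(contentType)

// [content type, expected isBinary result]
const cases: Array<[string, boolean]> = [
  ['text/html; charset=utf-8', false],
  ['application/json', false],
  ['application/ld+json', false],
  ['application/atom+xml', false],
  ['image/svg+xml', false],
  ['application/xml; charset=utf-8', false],
  ['application/vnd.openxmlformats-officedocument.wordprocessingml.document', true],
  ['application/epub+zip', true],
  ['application/vnd.oasis.opendocument.text', true],
]

cases.forEach(([contentType, expected]) => {
  const verdict = isBinary(contentType) === expected ? 'ok' : 'MISMATCH'
  console.log(`${verdict}  ${contentType}`)
})
```

The `(?:\/|\+)` prefix is what separates `+xml` as a structured-syntax suffix from `xml` appearing inside a longer token such as `openxmlformats`.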

Test

Added a regression test covering the OOXML triple, `application/epub+zip`, and `application/vnd.oasis.opendocument.text` alongside the existing assertions. The new test fails on `main` and passes with the regex fix.
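The before/after behaviour can be approximated outside the test suite. This is a standalone sketch, not the test added in this PR: both patterns are the fragments quoted in the description, and `failures` is a hypothetical helper. With the quoted old fragment alone, only the three OOXML types are misclassified.

```typescript
// Both patterns are the fragments quoted in this PR, not the full source on main.
const oldPattern = /application\/(.*json|.*xml).*/
const newPattern = /^text\/(?:plain|html|css|javascript|csv)|(?:\/|\+)(?:json|xml)\s*(?:;|$)/

// Content types the regression test expects to be classified as binary.
const mustBeBinary: string[] = [
  'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
  'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
  'application/vnd.openxmlformats-officedocument.presentationml.presentation',
  'application/epub+zip',
  'application/vnd.oasis.opendocument.text',
]

// Hypothetical helper: returns the types a pattern wrongly treats as text.
const failures = (textPattern: RegExp): string[] =>
  mustBeBinary.filter((contentType) => textPattern.test(contentType))

console.log(failures(oldPattern).length) // 3: the OOXML types
console.log(failures(newPattern).length) // 0
```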

Add tests
Run tests (`bun run test` passes)
`bun run format:fix && bun run lint:fix`
Add TSDoc/JSDoc (no API surface change)

The previous regex `application/(.*json|.*xml).*` matched any subtype containing the substring `json` or `xml` anywhere, including `application/vnd.openxmlformats-officedocument.*` (.docx/.xlsx/.pptx). These are ZIP archives. Treating them as text caused the body to be returned without base64 encoding, so CloudFront delivered corrupted files to clients. `application/epub+zip` and `application/vnd.oasis.opendocument.*` contain neither substring, so the quoted fragment did not match them; the regression test covers them as well. This change replaces the regex with the boundary-aware pattern already used by the aws-lambda adapter: it accepts `text/(plain|html|css|javascript|csv)` at the start, or `/json`, `/xml`, `+json`, `+xml` followed by `;` or end-of-string. This keeps `application/atom+xml`, `application/ld+json`, and `image/svg+xml` as text while classifying OOXML/EPUB/ODT as binary.

rokasta12 wants to merge 1 commit into honojs:main from rokasta12:fix/lambda-edge-binary-mime
