Skip to content

fix: write binary response content as raw bytes (issue #242)#243

Merged
okigan merged 3 commits into
masterfrom
fix/binary-response-242
Jun 13, 2026
Merged

fix: write binary response content as raw bytes (issue #242)#243
okigan merged 3 commits into
masterfrom
fix/binary-response-242

Conversation

@okigan

@okigan okigan commented Jun 13, 2026

Copy link
Copy Markdown
Owner

Summary

Binary responses (gzip, images, PDFs, etc.) were corrupted when written via -o or printed to stdout, because response.text was used which decodes non-UTF-8 bytes as U+FFFD (ef bf bd). This makes it impossible to download binary files like gzip archives.

Changes

  • stdout: print(response.text)sys.stdout.buffer.write(response.content) — writes raw bytes
  • -o file: Always open("wb") + write(response.content) — no longer tied to --data-binary
  • --data-binary only affects request body hashing, not response encoding (its original purpose)

Behavior

Matches curl exactly — always writes raw bytes, no text/binary toggle.

Related

Testing

  • New TestBinaryResponseOutput class with 2 unit tests covering gzip magic bytes and binary content preservation
  • All 16 unit tests pass
  • PEP8 linting clean

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d94cbba329

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread tests/unit_test.py Outdated
"""

@patch('awscurl.awscurl.__send_request')
@patch('builtins.open', new_callable=lambda: __builtins__['open'])

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Remove the broken open patch from the new test

In the CI matrix's normal Python 3.10+ module context, __builtins__ is a module rather than a dict, so this new_callable raises TypeError: 'module' object is not subscriptable as soon as the test's patch context is entered. That makes the newly added pytest case fail before it exercises the response handling; this patch is also unnecessary because the test never calls open.

Useful? React with 👍 / 👎.

Comment thread awscurl/awscurl.py

print(response.text)
# Write response body to stdout as raw bytes (matching curl behavior)
sys.stdout.buffer.write(response.content)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Flush text headers before writing raw body bytes

When --include is used with stdout redirected or piped, the header line is written through sys.stdout's text buffer but the body now bypasses it via sys.stdout.buffer; because the print(response.headers, ...) above is not flushed, Python can emit the raw body before the headers. This regresses the combined header/body output ordering that previously stayed within the text stream, so flush sys.stdout before writing response.content or write both parts through the same binary path.

Useful? React with 👍 / 👎.

Always write response.content as raw bytes to both stdout and -o file,
matching curl's behavior. Previously response.text was used which decoded
non-UTF-8 bytes as U+FFFD, corrupting binary files (gzip, images, PDFs).

The --data-binary flag was incorrectly controlling response encoding — it
only affects request body hashing, not response output.

Signed-off-by: igor <igor@earendil.works>
@okigan okigan force-pushed the fix/binary-response-242 branch from d94cbba to 1871ba2 Compare June 13, 2026 00:58
okigan added 2 commits June 12, 2026 20:04
Add # type: ignore[assignment] for pre-existing CaseInsensitiveDict
reassignment patterns that were flagged by mypy after version update.

Also fix type ignore for resp.headers = {} in test mocks.

These are pre-existing issues in the repo, not introduced by the
binary response fix.
… (PR #243)

- Remove unnecessary open patch from test_binary_response_content_preserved_in_output
  that crashes on Python 3.10+ (__builtins__ is a module, not a dict).
  The patch is unused since the test never calls open.

- Flush sys.stdout before writing response.content to stdout.buffer.
  This ensures text-mode output (--include headers) is written before
  the binary body, matching curl's ordering guarantees.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cannot GET binary file from REST API Gateway in any way output option errors with text response

1 participant