fix(deepagents): add default chunked upload/download to BaseSandbox #1402
Dr. Tristan Behrens (AI-Guru) wants to merge 4 commits into langchain-ai:main from
Conversation
BaseSandbox already provides default implementations of read(), write(), edit(), ls_info(), glob_info(), and grep_raw() via execute(). However, upload_files() and download_files() are abstract, forcing every sandbox backend to reimplement the same base64+heredoc pattern. More importantly, the common pattern of embedding the entire base64 payload in the command string hits the Linux kernel ARG_MAX limit (~128KB per argument) for files larger than ~100KB. This causes "argument list too long" errors when uploading binary files like PDFs to Docker containers with tmpfs mounts (where put_archive doesn't work).

This commit:

- Adds concrete upload_files() and download_files() to BaseSandbox
- Small files (<64KB) use a single execute() call (no change)
- Large files are split into 64KB base64 chunks, each written via a separate execute() call, then decoded in the sandbox
- Downloads also chunk large files to avoid execute() output truncation
- Methods are non-abstract so backends with native file transfer (e.g. SSH/SFTP, Daytona REST) can still override them
- Adds 6 unit tests covering small/large/error paths for both operations

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
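The chunking strategy in the commit message can be sketched in plain Python. This is a minimal illustration, not the PR's actual code — the names `b64_chunks`, `reassemble`, and `B64_CHUNK_SIZE` are hypothetical. The key detail is that the payload is base64-encoded once and the *encoded text* is sliced, so that concatenating the slices in order yields a single decodable base64 string (encoding each raw chunk independently would insert padding mid-stream).

```python
import base64

# ~64 KiB of raw bytes expands to ~87 KiB of base64 text; each slice below
# is roughly what one execute() call would append inside the sandbox.
B64_CHUNK_SIZE = 87_000  # hypothetical constant, safely under MAX_ARG_STRLEN


def b64_chunks(data: bytes, size: int = B64_CHUNK_SIZE) -> list[str]:
    """Encode once, then slice the base64 text so that joining the
    slices in order reproduces one decodable base64 string."""
    encoded = base64.b64encode(data).decode("ascii")
    return [encoded[i : i + size] for i in range(0, len(encoded), size)]


def reassemble(chunks: list[str]) -> bytes:
    """What the sandbox-side decode step effectively does once all
    chunks have been appended to the temp file."""
    return base64.b64decode("".join(chunks))
```

A 256 KiB payload round-trips through several chunks and decodes back to the original bytes.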
- SIM108: use ternary for upload/download strategy selection
- Q003: use single outer quotes to avoid escaping inner double quotes
- BLE001: catch (ValueError, binascii.Error) instead of blind Exception
- F401: remove unused FileDownloadResponse/FileUploadResponse imports

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
    return responses

    def _upload_single(self, file_path: str, b64: str) -> ExecuteResponse:
If you're interested in working on this feature, could you try to match the style of the other provided implementations (e.g., write_file etc.)?

Offload as much as you can into the template, which attempts to make it easier to see the boundary between Python and shell code (this makes it easier to verify that the implementation does not introduce security issues).

Make sure that everything that needs to be escaped is escaped properly so there's no additional attack surface.
Done! Refactored all upload/download methods to use module-level _COMMAND_TEMPLATE constants matching the existing _WRITE_COMMAND_TEMPLATE / _EDIT_COMMAND_TEMPLATE pattern.
Seven new templates added: _UPLOAD_COMMAND_TEMPLATE, _UPLOAD_CHUNK_COMMAND_TEMPLATE, _UPLOAD_DECODE_COMMAND_TEMPLATE, _REMOVE_COMMAND_TEMPLATE, _DOWNLOAD_SIZE_COMMAND_TEMPLATE, _DOWNLOAD_COMMAND_TEMPLATE, _DOWNLOAD_CHUNK_COMMAND_TEMPLATE.
All file paths are now base64-encoded before interpolation (safe charset [A-Za-z0-9+/=]), which also eliminates the shell injection risk from the previous string concatenation approach. Template format tests added for each.
    Args:
        file_path: Absolute path inside the sandbox.
        b64: Base64-encoded file content (must fit within ARG_MAX).
is ARG_MAX relevant if we're feeding through stdin?
Good question! ARG_MAX itself (typically ~2MB) is the total size limit for all arguments + environment passed to execve(). But the more relevant constraint here is MAX_ARG_STRLEN (typically PAGE_SIZE * 32 = 128KB on Linux), which limits any single argument string.
When we use bash -c 'script<<heredoc', the entire string — including the heredoc content — is passed as a single argument to execve(). So heredocs via bash -c don't bypass the limit the way a standalone heredoc in an interactive shell would (where bash itself reads from stdin).
With 64KB raw chunks → ~87KB base64, we stay safely under the 128KB MAX_ARG_STRLEN per-argument limit.
I've updated the comments in the code to reference MAX_ARG_STRLEN instead of ARG_MAX to be more precise about which kernel limit actually applies.
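The arithmetic can be checked directly (a `PAGE_SIZE` of 4096 is assumed, as on typical x86-64 Linux):

```python
import base64

PAGE_SIZE = 4096
MAX_ARG_STRLEN = 32 * PAGE_SIZE   # 131072 bytes: kernel per-argument cap
RAW_CHUNK = 64 * 1024             # 64 KiB of raw file bytes per chunk

# base64 expands every 3 raw bytes into 4 text bytes, rounded up
encoded_len = len(base64.b64encode(b"\x00" * RAW_CHUNK))
print(encoded_len, MAX_ARG_STRLEN)  # 87384 131072
assert encoded_len < MAX_ARG_STRLEN  # one chunk fits in a single argument
```

So a 64 KiB raw chunk becomes 87384 base64 characters, leaving comfortable headroom under the 131072-byte limit even after the surrounding template text is added.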
Replace string concatenation in upload/download methods with module-level _COMMAND_TEMPLATE constants, matching the existing pattern used by _WRITE_COMMAND_TEMPLATE and _EDIT_COMMAND_TEMPLATE. All file paths are now base64-encoded before interpolation, eliminating shell injection risk from path concatenation.

Seven new templates added:

- _UPLOAD_COMMAND_TEMPLATE (small file, heredoc stdin)
- _UPLOAD_CHUNK_COMMAND_TEMPLATE (append chunk to temp file)
- _UPLOAD_DECODE_COMMAND_TEMPLATE (decode assembled base64)
- _REMOVE_COMMAND_TEMPLATE (cleanup helper)
- _DOWNLOAD_SIZE_COMMAND_TEMPLATE (get file size)
- _DOWNLOAD_COMMAND_TEMPLATE (small file download)
- _DOWNLOAD_CHUNK_COMMAND_TEMPLATE (chunked download)

Also updates ARG_MAX comments to reference MAX_ARG_STRLEN (the 128KB per-argument limit on Linux), which is the actual constraint when heredocs are used inside bash -c.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- Adds concrete upload_files() and download_files() implementations to BaseSandbox, replacing the current @abstractmethod stubs
- Small files use a single execute() call (same as before)
- Large files are chunked to stay under the ARG_MAX limit (~128KB per argument)
- Large downloads are chunked to avoid execute() output truncation

Problem
When using sandbox backends that route file transfers through execute() (Docker with tmpfs, microsandbox), uploading binary files larger than ~100KB fails with an "argument list too long" error. This happens because the base64-encoded file content is embedded in the command string passed to exec_run(), which exceeds the kernel's ARG_MAX limit. Docker's put_archive API cannot be used as a workaround because tmpfs mounts are invisible to it.

The error is particularly problematic for autonomous workflows that generate binary artifacts (PDFs, images) inside sandboxed containers.
Solution
BaseSandbox already provides default implementations of read(), write(), edit(), ls_info(), glob_info(), and grep_raw() via execute(). This PR extends that pattern to upload_files() and download_files():

- Small files (<64KB) are uploaded in a single execute() call
- Large files are split into 64KB chunks, appended to a temp file via separate execute() calls, then the assembled base64 is decoded to the final path
- Large downloads are read in chunks via separate execute() calls, then reassembled on the host

All operations go through execute(), respecting the full middleware chain.

Test plan
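One way to exercise the large-file path in a unit test is sketched below. This is a hedged, in-process simulation — `upload_then_download`, the 87,000-character chunk size, and the 256 KiB payload are illustrative choices, not the PR's actual test fixtures, which run against a real sandbox backend.

```python
import base64

CHUNK = 87_000  # base64 characters per simulated execute() call


def upload_then_download(data: bytes) -> bytes:
    """Simulate the chunked round trip in-process: slice the base64 text
    as the upload would, then rejoin and decode as the download would."""
    encoded = base64.b64encode(data).decode("ascii")
    chunks = [encoded[i : i + CHUNK] for i in range(0, len(encoded), CHUNK)]
    assert len(chunks) > 1, "payload must be large enough to force chunking"
    return base64.b64decode("".join(chunks))


def test_large_binary_roundtrip():
    payload = bytes(range(256)) * 1024  # 256 KiB, well past the 64KB cutoff
    assert upload_then_download(payload) == payload
```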
🤖 Generated with Claude Code