For very large file transfers between services in the same Docker Compose environment, using REST to push the entire payload is often the least efficient option.
If both containers run on the same host (or same Docker network), sending multi-GB files over HTTP means:
- serializing/deserializing request bodies
- copying data through HTTP stacks
- buffering
- potential reverse proxy limits/timeouts
- retry complexity on partial transfers
A shared volume + lightweight REST/events is often significantly faster and simpler for bulk data exchange.
Option 1: Keep full file transfer over REST
Pros
- Simple mental model: one API call does everything.
- Strong request/response semantics.
- Easier authentication/authorization boundaries.
- Language/framework agnostic.
- Easy remote extension later (if services move to separate hosts).
- Can stream data (chunked, multipart, resumable uploads).
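As a rough illustration of the streaming point, a sender can read the file in fixed-size pieces instead of loading it into memory. The function name `iter_file_chunks` and the 1 MiB default are assumptions made for this sketch; Python's `http.client`, for example, accepts an iterable of bytes as a request body.

```python
from typing import Iterator


def iter_file_chunks(path: str, chunk_size: int = 1024 * 1024) -> Iterator[bytes]:
    """Yield the file's contents in pieces of at most chunk_size bytes."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes means end of file
                break
            yield chunk
```

Only one chunk is held in memory at a time, so memory use stays flat regardless of file size.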
Cons
- Extra copies of data in memory/kernel/network stack.
- HTTP overhead (headers, parsing, TLS if enabled).
- Large request buffering depending on framework/proxy.
- Timeouts:
  - client timeout
  - reverse proxy timeout
  - idle timeout
- Harder resumability after failure unless explicitly implemented.
- Can fill logs/metrics/tracing systems unexpectedly.
- Higher CPU usage.
- More disk I/O if temporary upload storage is used.
- Large files can break load balancers/proxies (e.g. Nginx body limits).
When REST is still fine
- Files under roughly 100–500 MB
- Infrequent transfers
- Need external compatibility
- Already using streaming APIs properly
Option 2: Shared directory/volume + REST for metadata/triggers
Pattern:
- Producer writes file to shared volume
- Producer calls REST:
  POST /process {"path": "/shared/job123/file.bin"}
- Consumer reads file directly
- Consumer responds or posts status updates
This is usually the best choice for your case.
Example Docker Compose:
services:
  producer:
    volumes:
      - shared-data:/shared
  consumer:
    volumes:
      - shared-data:/shared
volumes:
  shared-data:
Pros
Performance
- Usually fastest on single host
- No network transfer of actual file
- Zero/minimal serialization
- OS filesystem caching helps
- Lower CPU
Reliability
- File persists independently of service restarts
- Consumer can retry reading later
- Easier resume/recovery
Simplicity for huge files
- No multipart upload complexity
- No HTTP size limits
- No proxy issues
Decoupling
- Producer and consumer can operate asynchronously.
Cons
Shared-state complexity
Now you manage:
- file naming
- lifecycle
- cleanup
- retention
- versioning
Without discipline, shared folders become digital attics full of forgotten gigabytes.
Race conditions
Consumer may read file before producer is finished.
Need strategies:
- temp file then atomic rename
- lock files
- completion marker
Example (temp file, then atomic rename):
  file.tmp
  mv file.tmp file.done
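The completion-marker strategy from the list above could look roughly like this. The `.ready` suffix and both function names are hypothetical choices for this sketch, not an established convention.

```python
import os


def write_with_marker(path: str, data: bytes) -> None:
    """Write the payload, then create an empty '<path>.ready' marker file."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # push the payload to disk before signalling
    # The marker is created only after the payload is completely written.
    with open(path + ".ready", "w"):
        pass


def is_ready(path: str) -> bool:
    """Consumer-side check: read the file only once its marker exists."""
    return os.path.exists(path + ".ready")
```

The consumer ignores any file without a marker, so a half-written payload is never picked up.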
Security/isolation
Shared volume weakens service boundaries:
- accidental overwrite
- unauthorized reads
Need permissions/read-only mounts where possible.
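One related precaution, if the consumer accepts a path over REST, is to check that the path really resolves inside the shared mount before opening it. This is a minimal sketch; `resolve_shared_path` is a hypothetical helper, and `/shared` is assumed to be the mount point from the Compose example.

```python
import os

SHARED_ROOT = "/shared"  # assumption: the volume mount point


def resolve_shared_path(requested: str, root: str = SHARED_ROOT) -> str:
    """Return the real path if it lies under root, else raise ValueError."""
    real_root = os.path.realpath(root)
    # realpath collapses ".." segments and follows symlinks.
    real_path = os.path.realpath(requested)
    if os.path.commonpath([real_root, real_path]) != real_root:
        raise ValueError(f"path escapes shared root: {requested!r}")
    return real_path
```

This does not replace filesystem permissions or read-only mounts; it only guards against a request pointing outside the shared directory.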
Harder horizontal scaling
Works great on one host.
Gets harder when:
- multiple hosts
- Kubernetes
- cloud autoscaling
Then you need distributed/shared storage.
Cleanup required
Need janitor process/TTL cleanup.
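A janitor can be as small as a function run on a timer. The sketch below assumes one directory per job directly under the shared root and uses the directory's modification time as its age; `remove_stale_jobs` and its parameters are hypothetical names.

```python
import os
import shutil
import time
from typing import List, Optional


def remove_stale_jobs(root: str, max_age_seconds: float,
                      now: Optional[float] = None) -> List[str]:
    """Delete job directories under root older than the TTL; return their names."""
    now = time.time() if now is None else now
    removed = []
    # Snapshot the listing first so deletions do not disturb the iteration.
    for entry in list(os.scandir(root)):
        if not entry.is_dir(follow_symlinks=False):
            continue
        # Note: a directory's mtime changes when entries are added or removed,
        # not when an existing file inside it is modified.
        age = now - entry.stat(follow_symlinks=False).st_mtime
        if age > max_age_seconds:
            shutil.rmtree(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

In practice the TTL should comfortably exceed the longest expected processing time, so a job still in progress is not deleted.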
Recommended improvements if using shared directory
Use workflow like:
write -> fsync -> atomic rename -> notify
Detailed:
- write /shared/job42/output.part
- fsync + close
- rename to /shared/job42/output.dat
- REST call: "job42 ready"
Atomic rename avoids half-written reads.
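The write, fsync, rename part of this workflow might be sketched as follows. `publish_atomically` is a hypothetical name, and the notify step is left to the caller. The rename is only atomic when both paths are on the same filesystem, which holds for two paths inside one shared volume.

```python
import os
from typing import Iterable


def publish_atomically(part_path: str, final_path: str,
                       chunks: Iterable[bytes]) -> None:
    """Write chunks to part_path, flush to disk, then rename to final_path."""
    with open(part_path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())  # fsync needs the file to still be open
    # os.replace renames in one step; readers see the old state or the new one.
    os.replace(part_path, final_path)
```

Usage would be along the lines of `publish_atomically("/shared/job42/output.part", "/shared/job42/output.dat", chunks)`, followed by the "job42 ready" call.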
Add metadata:
{
  "job_id": "42",
  "path": "/shared/job42/output.dat",
  "checksum": "sha256:..."
}
Consumer validates checksum.
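Checksum validation on the consumer side could look like this sketch. It assumes the `checksum` field has the form `sha256:` followed by the hex digest, as in the metadata example; the function names are hypothetical.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash the file in chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()


def verify_metadata(meta: dict) -> bool:
    """Return True if the file at meta['path'] matches meta['checksum']."""
    return sha256_of_file(meta["path"]) == meta["checksum"]
```

Hashing reads the whole file once, so for multi-GB payloads it adds measurable time; whether that cost is worth it depends on how much the producer is trusted.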
Option 3: Shared volume + message queue (better than REST triggers)
Instead of REST triggers:
- write file to shared volume
- send message to queue
Tools:
- RabbitMQ
- Apache Kafka
- Redis streams/pubsub
Flow:
Producer -> shared file
Producer -> queue message
Consumer -> receives event -> reads file
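To make the flow concrete without committing to a broker, the sketch below uses Python's in-process `queue.Queue` purely as a stand-in for RabbitMQ, Kafka, or Redis. A real broker client has a different API, and the function names and message fields are assumptions.

```python
import json
import os
import queue


def produce(q: queue.Queue, shared_dir: str, job_id: str, payload: bytes) -> None:
    """Write the payload to the shared directory, then enqueue a small message."""
    job_dir = os.path.join(shared_dir, job_id)
    os.makedirs(job_dir, exist_ok=True)
    path = os.path.join(job_dir, "file.bin")
    with open(path, "wb") as f:
        f.write(payload)
    # Only the path travels through the queue, never the file contents.
    q.put(json.dumps({"job_id": job_id, "path": path}))


def consume_one(q: queue.Queue) -> bytes:
    """Take one message off the queue and read the file it points to."""
    message = json.loads(q.get())
    with open(message["path"], "rb") as f:
        data = f.read()
    q.task_done()
    return data
```

The message stays a few hundred bytes no matter how large the file is, which is what keeps the queue cheap.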
Pros
- asynchronous
- retries
- backpressure
- dead letter queues
- less coupling than REST
Cons
- extra infrastructure
- operational complexity
Best for many jobs/high throughput.
Decision matrix
| Approach | Speed | Complexity | Scalability | Reliability |
|---|---|---|---|---|
| REST full transfer | Low–Medium | Low | High | Medium |
| Shared volume + REST | Very High | Medium | Low–Medium | High |
| Shared volume + Queue | Very High | Medium–High | Medium | High |
My recommendation for your setup
Since you already have:
- the same Docker Compose environment
- huge files
- REST for commands
Best architecture is probably:
Shared Docker volume for file payloads
+
REST (or queue) for commands/status/events
So:
Service A:
  writes /shared/job123/input.dat
  POST /jobs/job123/start
Service B:
  reads /shared/job123/input.dat
  processes
  writes /shared/job123/result.dat
  POST /jobs/job123/done
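Service A's control-plane call might be built like this with the standard library. The base URL `http://service-b:8000` and the JSON body carrying the path are assumptions for illustration; the flow above only specifies the `POST /jobs/job123/start` endpoint.

```python
import json
import urllib.request


def build_start_request(base_url: str, job_id: str, path: str) -> urllib.request.Request:
    """Build (but do not send) the POST that tells Service B a job is ready."""
    body = json.dumps({"path": path}).encode("utf-8")  # small metadata only
    return urllib.request.Request(
        url=f"{base_url}/jobs/{job_id}/start",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_start_request("http://service-b:8000", "job123",
                          "/shared/job123/input.dat")
```

Sending it is then a matter of `urllib.request.urlopen(req, timeout=10)`. The request body stays tiny because the payload itself never leaves the shared volume.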
This gives:
- fastest transfer
- minimal code changes
- keeps control plane in REST
- data plane via filesystem
A nice split: REST for intentions, filesystem for bulk data.
That’s usually the sweet spot.