Stream POST request in order to handle large files #161
+104 −20
The bug
I ran into a problem last year: when I tried to create or synchronize a challenge containing a large file (e.g. a forensics challenge with a 15 GB disk image), the entire file was loaded into memory before the request even started.
This caused crashes, since my computer only has 16 GB of RAM.
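The memory blow-up can be reproduced without any network access, since `requests` encodes the multipart body while preparing the request. A minimal sketch (the URL is a placeholder and is never contacted):

```python
import io

import requests

# A small BytesIO stands in for the large file; a real 15 GB
# disk image would be opened with open(path, "rb") instead.
payload = io.BytesIO(b"x" * 1024)

# Preparing a multipart request with `files=` encodes the whole
# body up front: PreparedRequest.body is a single bytes object.
request = requests.Request(
    "POST",
    "http://example.invalid/upload",  # placeholder URL
    files={"file": ("disk.img", payload)},
)
prepared = request.prepare()

# The full file content now lives in memory, and Content-Length
# was computed from that in-memory body.
assert isinstance(prepared.body, bytes)
assert int(prepared.headers["Content-Length"]) >= 1024
```

With a 15 GB file, that `prepared.body` bytes object is what exhausts the RAM.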
The cause
Although the
requests
module supports body streaming when you pass a file pointer to thedata
parameter, it is not capable of streaming form-data.When the
requests
module prepares the headers, it tries to calculate theContent-Length
. As a result, the entire body will be stored in memory.The fix
One solution would be to switch to another HTTP client, capable of streaming form-data.
I chose to modify as little code as possible. I made the choice to delegate the body encoding to the
MultipartEncoder
from therequests-toolbelt
module. This requires a few modifications to theAPI
class, since theMultipartEncoder
takes parameters differently fromrequests
.As a result files must be sent with a filename hint: