TUS upload support (WIP) #65

@felix-schwarz

The SDK should provide support for uploading files via the TUS protocol.

Notable observations from reading the spec:

1. Background NSURLSession friendly

Requests / PATCH

The Server SHOULD always attempt to store as much of the received data as possible.

If the server stores as much of the received data as possible, it is easier for the SDK to meet the requirements of NSURLSession background queues and to avoid penalties:

  • it can send the entire file in one request
  • if the connection is interrupted:
    • the server (ideally) keeps the received data
    • the client uses a HEAD request to retrieve the offset to resume from
    • it sends the rest of the file from that offset in a single request
  • this way, consecutive requests in quick succession - something NSURLSession penalizes with long delays - can mostly be avoided
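
The flow above can be sketched with an in-memory stand-in for the server. This is only an illustration, not SDK code: `FakeTusServer`, `upload` and the `drop_after` parameter are hypothetical names, and the fake server is assumed to keep every byte it received before a dropped connection. The header names and the 204/409 status codes are those of the tus core protocol.

```python
class FakeTusServer:
    """In-memory stand-in for a tus server that keeps partial data (hypothetical)."""

    def __init__(self, upload_length):
        self.upload_length = upload_length
        self.data = bytearray()

    def head(self):
        # tus core protocol: HEAD responses report the current offset.
        return {"Upload-Offset": str(len(self.data)), "Tus-Resumable": "1.0.0"}

    def patch(self, headers, body, drop_after=None):
        # The offset in the request must match what the server has stored.
        if int(headers["Upload-Offset"]) != len(self.data):
            return 409  # Conflict
        received = body if drop_after is None else body[:drop_after]
        self.data += received  # keep whatever arrived before the drop
        return 204 if drop_after is None else None  # None: connection lost


def upload(server, content, interruptions=()):
    """Send the file in as few PATCH requests as possible; return the count."""
    drops = list(interruptions)
    requests = 0
    offset = 0
    while offset < len(content):
        headers = {
            "Tus-Resumable": "1.0.0",
            "Content-Type": "application/offset+octet-stream",
            "Upload-Offset": str(offset),
        }
        drop = drops.pop(0) if drops else None
        status = server.patch(headers, content[offset:], drop_after=drop)
        requests += 1
        if status == 204:
            break
        # Connection lost (or offset conflict): ask the server where to resume.
        offset = int(server.head()["Upload-Offset"])
    return requests
```

With one interruption, the sketch needs two PATCH requests and one HEAD request in between, regardless of the file size.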

2. Schedulable

Protocol Extensions / Creation

The Client and the Server SHOULD implement the upload creation extension.

It would be preferable if clients were able to directly start an upload with a single request:

  • the upload can be scheduled in a background NSURLSession as a single request if the Creation With Upload extension is implemented
  • the plain Creation extension would require an additional request to "register" the upload first:
    • requires the app to be active again before the actual upload can be scheduled
    • could be seen as a pattern that NSURLSession penalizes with long delays
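
To make the difference concrete, here is a small sketch of the requests each variant needs before any file data is on the wire. `plan_requests` and the dictionary layout are made up for illustration. The header names and the content type come from the tus specification.

```python
def plan_requests(file_size, creation_with_upload):
    """Return the requests needed to start uploading a file of file_size bytes."""
    create = {
        "method": "POST",
        "headers": {"Tus-Resumable": "1.0.0", "Upload-Length": str(file_size)},
        "body_bytes": 0,
    }
    if creation_with_upload:
        # Creation With Upload: the POST itself carries the file data.
        create["headers"]["Content-Type"] = "application/offset+octet-stream"
        create["body_bytes"] = file_size
        return [create]
    # Plain Creation: the POST only registers the upload; the data follows
    # in a PATCH sent to the URL from the POST response's Location header.
    patch = {
        "method": "PATCH",
        "headers": {
            "Tus-Resumable": "1.0.0",
            "Content-Type": "application/offset+octet-stream",
            "Upload-Offset": "0",
        },
        "body_bytes": file_size,
    }
    return [create, patch]
```

The second request of the plain variant depends on the response to the first one, which is why it cannot be scheduled ahead of time.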

3. Defined expiration

Protocol Extensions / Upload-Expires

The Upload-Expires response header indicates the time after which the unfinished upload expires. A Server MAY wish to remove incomplete uploads after a given period of time to prevent abandoned uploads from taking up extra storage. The Client SHOULD use this header to determine if an upload is still valid before attempting to resume the upload.

This helps in determining whether an upload should be continued or not, so that only uploads known to still exist are resumed. On the other hand, a HEAD request would be required anyway before resuming an upload, at which point an expired upload should also become apparent.

If the expiration date is to be supported and used, though, adding expiration support directly to the OCHTTP system should be considered, with requests being terminated with a new error code if they expire before being scheduled.
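
A minimal sketch of such an expiration check, assuming the header value uses the HTTP date format shown in the tus specification (e.g. `Wed, 25 Jun 2014 16:00:00 GMT`). `is_upload_expired` is a hypothetical helper, not part of OCHTTP, and treating a missing header as "not known to be expired" is a choice made for this example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_upload_expired(upload_expires, now=None):
    """Return True if the Upload-Expires value lies at or before `now`.

    A missing header is treated as "not known to be expired".
    """
    if upload_expires is None:
        return False
    expires = parsedate_to_datetime(upload_expires)
    if expires.tzinfo is None:
        # Assumption: a date without usable zone information is in UTC.
        expires = expires.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return expires <= now
```

A request queue could run this check right before scheduling a request and fail it with a dedicated error instead of sending it.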

4. Checksum troubles

The Checksum extension needs checksums to be provided on a per-request basis, calculated not over the entire file but over the body of the respective upload requests.

This is generally fine, but does not cover the scenario where an upload is interrupted and the server should use the already received bytes:

  • the checksum over the partial upload will differ:
    • will the server reject all data received with that request over this?
    • if the server accepts the data, how can overall file consistency be verified?
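
The mismatch can be illustrated in a few lines. The `Upload-Checksum` value format (algorithm name, a space, the Base64-encoded digest) is taken from the tus Checksum extension. `upload_checksum` and `verify` are hypothetical helpers for this sketch.

```python
import base64
import hashlib


def upload_checksum(body, algorithm="sha1"):
    """Build an Upload-Checksum value: '<algorithm> <base64 digest>'."""
    digest = hashlib.new(algorithm, body).digest()
    return f"{algorithm} {base64.b64encode(digest).decode('ascii')}"


def verify(received, header_value):
    """Check the bytes that actually arrived against the request's header."""
    algorithm, _, expected = header_value.partition(" ")
    return upload_checksum(received, algorithm) == f"{algorithm} {expected}"
```

If the request body was `b"hello world"` but only `b"hello"` arrived, `verify` returns False for the partial data even though every received byte is correct.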

Possible solutions:

1. Store checksums, check when upload is complete

The specification provides this hint:

Once the entire request has been received, the Server MUST verify the uploaded chunk against the provided checksum using the specified algorithm.

The solution therefore could be to store all checksums and only verify them against the respective parts once they have been received in full.

Drawback:

  • if an upload is resumed, the ranges to compute checksums on could overlap, generating unnecessary load.
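
A rough sketch of this bookkeeping, from the server's point of view. `DeferredChecksumStore` and its methods are invented for illustration, and SHA-1 hex digests stand in for whatever algorithm is negotiated.

```python
import hashlib


class DeferredChecksumStore:
    """Remembers per-request checksums and verifies each one only once
    the byte range it covers has been received in full."""

    def __init__(self):
        self.data = bytearray()
        self.pending = []  # (start, end, sha1 hex digest) per request

    def expect(self, start, length, sha1_hex):
        self.pending.append((start, start + length, sha1_hex))

    def receive(self, chunk):
        self.data += chunk

    def verify_complete_ranges(self):
        """Check every fully received range; return the ranges that failed."""
        failed, still_pending = [], []
        for start, end, expected in self.pending:
            if end > len(self.data):
                still_pending.append((start, end, expected))
                continue
            actual = hashlib.sha1(bytes(self.data[start:end])).hexdigest()
            if actual != expected:
                failed.append((start, end))
        self.pending = still_pending
        return failed
```

After an interrupted first request and a resumed second one, both the range of the first request (the whole file) and the range of the second (the remainder) get hashed, which is the overlap described in the drawback.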

2. Custom header with the full file checksum

A custom header with the full file checksum (e.g. OC-Full-Upload-Checksum) is passed with the Creation With Upload request when the upload is initiated. That would allow verification of the full file once the upload has completed.

Drawback:

  • a corrupted upload will only be detected once the entire file has been transferred

3. Custom header with the checksum over already transferred data

An additional, custom header with the checksum of the file up to the point the upload resumes from (e.g. OC-Transmitted-Upload-Checksum) would allow the server to check whether the data received before is consistent, and to cancel an upload that is already corrupted.

Drawback:

  • parts of the file could be checksummed multiple times, generating unnecessary load

Drawback mitigation:

  • this could be mitigated by the server returning a byte range in its HEAD response, for which the client then provides a checksum when resuming the upload, allowing the server to accept partial requests while still ensuring consistency
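
On the client side, the mitigation boils down to hashing only the byte range the server asks about. The sketch below assumes the range has already been obtained somehow; how the server would communicate it in the HEAD response is not defined anywhere yet. `range_checksum` is a hypothetical helper, and SHA-1 is used only as an example algorithm.

```python
import hashlib
import io


def range_checksum(stream, start, end, block_size=64 * 1024):
    """SHA-1 hex digest of bytes [start, end) of a seekable binary stream."""
    sha1 = hashlib.sha1()
    stream.seek(start)
    remaining = end - start
    while remaining > 0:
        block = stream.read(min(block_size, remaining))
        if not block:
            break  # stream is shorter than the requested range
        sha1.update(block)
        remaining -= len(block)
    return sha1.hexdigest()
```

The result would go into the custom header (e.g. OC-Transmitted-Upload-Checksum) of the resuming request. Reading in blocks keeps memory use flat for large files.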

Pragmatic and performant

A pragmatic and performance-oriented approach would likely be a combination of 1. and 3.

Related issues

Known issues

The current implementation in develop has the following known issues:

  • uploads may sometimes not resume and "hang" indefinitely if the app is force-terminated. To test resuming, log out and back into an account.
  • no progress is reported for uploads
