Summary
An upload equivalent of `dvc get-url`.
We currently use `get-url` as a cross-platform replacement for `wget`. However, together with `get-url`, `put-url` will turn DVC into a replacement for `rsync`/`rclone`.
Motivation
- we already have `get-url` so adding `put-url` seems natural for the same reasons
- `put-url` will be used by
  - CML internally to sync data
  - LDB internally to sync data
  - the rest of the world
- uses existing functionality of DVC so should be fairly quick to expose
- cross-platform multi-cloud replacement for `rsync`/`rclone`. What's not to love?
  - could even create a spin-off thin wrapper (or even abstract the functionality) in a separate Python package
Detailed Design
usage: dvc put-url [-h] [-q | -v] [-j <number>] url targets [targets ...]

Upload or copy files to URL.
Documentation: <https://man.dvc.org/put-url>

positional arguments:
  url                   Destination path to put data to.
                        See `dvc import-url -h` for full list of supported
                        URLs.
  targets               Files/directories to upload.

optional arguments:
  -h, --help            show this help message and exit
  -q, --quiet           Be quiet.
  -v, --verbose         Be verbose.
  -j <number>, --jobs <number>
                        Number of jobs to run simultaneously. The default
                        value is 4 * cpu_count(). For SSH remotes, the default
                        is 4.
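
To make the proposed argument order concrete, here is a minimal stand-alone `argparse` sketch that accepts the same usage line. It is not DVC's actual parser code: `build_parser` is a hypothetical helper, `s3://bucket/data` is a made-up destination, and the SSH-specific default of 4 jobs is not modelled.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Build a stand-alone parser mirroring the proposed `dvc put-url` usage."""
    parser = argparse.ArgumentParser(
        prog="dvc put-url",
        description="Upload or copy files to URL.",
    )
    # -q and -v are mutually exclusive, matching `[-q | -v]` in the usage line.
    verbosity = parser.add_mutually_exclusive_group()
    verbosity.add_argument("-q", "--quiet", action="store_true", help="Be quiet.")
    verbosity.add_argument("-v", "--verbose", action="store_true", help="Be verbose.")
    parser.add_argument(
        "-j",
        "--jobs",
        type=int,
        metavar="<number>",
        # Proposed default of 4 * cpu_count(); the SSH special case is omitted here.
        default=4 * (os.cpu_count() or 1),
        help="Number of jobs to run simultaneously.",
    )
    # Destination first, then one or more local targets.
    parser.add_argument("url", help="Destination path to put data to.")
    parser.add_argument("targets", nargs="+", help="Files/directories to upload.")
    return parser


# Hypothetical invocation: two targets uploaded to a made-up S3 destination.
args = build_parser().parse_args(["-j", "2", "s3://bucket/data", "a.csv", "models/"])
print(args.url, args.targets, args.jobs)
```

Putting `url` before `targets` lets `targets` take `nargs="+"` without ambiguity, which is one argument in favour of the proposed order.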
How We Teach This
- Name: `put-url` seems to be in line with the existing `get-url` (cf. HTTP `GET` & `PUT`)
- Idea presentation: continuation of existing DVC patterns
- Docs: simply add https://dvc.org/doc/command-reference/put-url largely based off the existing https://dvc.org/doc/command-reference/get-url
- Teaching: not required
Drawbacks
- can't think of any
Alternatives
- would have to re-implement per-cloud sync options for CML & other products
Unresolved Questions
- minor implementation details
  - CLI naming (`put-url`)?
  - CLI argument order (`url targets [targets...]`)?
  - Python API (`dvc.api.put_url()`)?
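
As a starting point for the Python API question, the sketch below shows one possible shape for a `put_url()` function that mirrors the CLI argument order. It is purely illustrative and not part of DVC: the signature is an assumption, the default of 4 jobs is arbitrary, and the body only handles a local destination directory using the standard library, whereas a real implementation would go through DVC's existing remote filesystem layer.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Iterable, List


def put_url(url: str, targets: Iterable[str], jobs: int = 4) -> List[Path]:
    """Hypothetical API shape: copy each target into the destination `url`.

    Only local directory destinations are handled in this sketch.
    """
    dest = Path(url)
    dest.mkdir(parents=True, exist_ok=True)

    def copy_one(target: str) -> Path:
        src = Path(target)
        out = dest / src.name
        if src.is_dir():
            # Copy directories recursively, merging into an existing directory.
            shutil.copytree(src, out, dirs_exist_ok=True)
        else:
            shutil.copy2(src, out)
        return out

    # Run up to `jobs` copies concurrently, like the proposed -j/--jobs option.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(copy_one, targets))
```

Keeping `url` as the first parameter matches the proposed CLI order, so `put_url(dest, [a, b])` reads the same way as `dvc put-url dest a b`.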
Please do assign me if happy with the proposal.
(`dvc get-url` + `put-url` = `dvc rsync` :))