Thinking about this some more, with the current dvc-bench architecture we can't actually separate push/pull for real clouds. The dataset has to be pushed to the real remote in order for it to be pulled in the first place, so separating them won't actually save us anything over the existing `test_sharing` workflow right now. We also have the overhead of needing to `dvc pull` the base dataset from the public bucket (using the default read-only HTTP remote and not S3) during the overall setup phase.
What we probably need to do is set up buckets containing the mnist dataset for each cloud type we want to benchmark, and then have specific tests that only do a single `pull` from the appropriate bucket, and a single `push` to a temp directory in the appropriate bucket. This would need to be separate from the existing `remote` and `dataset` fixtures in `dvc.testing`.
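
A rough sketch of what that could look like, to make the idea concrete. Everything named here is an assumption and not existing dvc-bench or `dvc.testing` code: the bucket URLs are placeholders, `workspace` stands for a hypothetical fixture that yields a DVC repo path already tracking the dataset, and `benchmark` is assumed to be the pytest-benchmark fixture.

```python
import posixpath
import subprocess
import uuid

# Placeholder URLs for buckets pre-populated with the mnist dataset,
# one per cloud type. These are NOT real buckets.
DATASET_REMOTES = {
    "s3": "s3://example-dvc-bench/mnist",
    "gs": "gs://example-dvc-bench/mnist",
    "azure": "azure://example-dvc-bench/mnist",
}


def pull_url(cloud):
    """URL of the pre-populated dataset for this cloud (pull source)."""
    return DATASET_REMOTES[cloud]


def push_url(cloud, run_id=None):
    """Unique temp prefix in the same bucket (push destination)."""
    run_id = run_id or uuid.uuid4().hex
    scheme, _, rest = DATASET_REMOTES[cloud].partition("://")
    bucket = rest.split("/", 1)[0]
    return f"{scheme}://{posixpath.join(bucket, 'tmp', run_id)}"


def dvc(*args, cwd):
    """Run a dvc CLI command in the given repo and fail loudly on error."""
    subprocess.run(["dvc", *args], cwd=cwd, check=True)


def bench_pull(benchmark, workspace, cloud):
    # Remote setup is untimed; only the single pull is measured.
    dvc("remote", "add", "-f", "bench-pull", pull_url(cloud), cwd=workspace)
    benchmark.pedantic(
        dvc, args=("pull", "-r", "bench-pull"), kwargs={"cwd": workspace}, rounds=1
    )


def bench_push(benchmark, workspace, cloud):
    # Assumes the data is already in the local cache, so no pull is timed here.
    dvc("remote", "add", "-f", "bench-push", push_url(cloud), cwd=workspace)
    benchmark.pedantic(
        dvc, args=("push", "-r", "bench-push"), kwargs={"cwd": workspace}, rounds=1
    )
```

The point of the split is that `bench_pull` never has to push first and `bench_push` never writes into the shared dataset prefix, so each test times exactly one operation. Cleaning up the temp prefix after a push is left out of the sketch.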
Originally posted by @pmrowla in #9108 (comment)