benchmarks: rework real remote tests + fixtures #9495

Open
Description

@pmrowla
          Thinking about this some more, with the current dvc-bench architecture we can't actually separate push/pull for real clouds. The dataset has to be pushed to the real remote in order for it to be pulled in the first place, so separating them won't actually save us anything over the existing `test_sharing` workflow right now. We also have the overhead of needing to `dvc pull` the base dataset from the public bucket (using the default read-only HTTP remote and not S3) during the overall setup phase.

What we probably need to do is set up buckets containing the mnist dataset for each cloud type we want to benchmark, and then have specific tests that each do only a single pull from the appropriate bucket and a single push to a temp directory in the appropriate bucket. This would need to be separate from the existing remote and dataset fixtures in dvc.testing.
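A minimal sketch of what those per-cloud tests might build on, assuming pre-populated per-cloud buckets. Everything here is illustrative: the bucket URLs, remote names, and helper functions (`MNIST_BUCKETS`, `pull_cmd`, `push_target`, `push_cmds`) are assumptions, not existing dvc-bench or dvc.testing API; only the `dvc pull`/`dvc push --remote` and `dvc remote add --local` commands are real:

```python
import uuid

# Hypothetical per-cloud buckets, each pre-populated with the mnist dataset.
# URLs and remote names are placeholders, not real dvc-bench configuration.
MNIST_BUCKETS = {
    "s3": "s3://dvc-bench-datasets/mnist",
    "gs": "gs://dvc-bench-datasets/mnist",
    "azure": "azure://dvc-bench-datasets/mnist",
}


def pull_cmd(cloud: str) -> list[str]:
    """Command for a single pull from the cloud-specific remote."""
    return ["dvc", "pull", "--remote", cloud]


def push_target(cloud: str) -> str:
    """Unique temp prefix inside the same bucket, so concurrent benchmark
    runs don't collide and cleanup is a single prefix delete."""
    return f"{MNIST_BUCKETS[cloud]}/tmp/{uuid.uuid4().hex}"


def push_cmds(cloud: str) -> list[list[str]]:
    """Configure a throwaway remote pointing at the temp prefix,
    then do a single push to it."""
    target = push_target(cloud)
    return [
        ["dvc", "remote", "add", "--local", f"{cloud}-tmp", target],
        ["dvc", "push", "--remote", f"{cloud}-tmp"],
    ]
```

Each benchmark would then time exactly one `subprocess.run()` of the pull or push command, instead of the combined flow that `test_sharing` measures today.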

Originally posted by @pmrowla in #9108 (comment)

Metadata

    Labels

    p3-nice-to-have (It should be done this or next sprint)
    testing (Related to the tests and the testing infrastructure)
