
Using dvc exp workflow together with MR #9102

Open
@aguschin


We'd like to understand what an MR workflow would look like given the dvc exp approach to experiment management. We need to make the dvc exp + MR experience feel natural to developers 🙌🏻 Here are my thoughts, let's discuss.

This is almost the same as branching out and running an experiment without DVC (although we could improve the UX by letting dvc exp manage type: model somehow).

So, suppose you have a separate "feature" branch with an experiment.

A. You may want to first assess metrics/reports, then decide whether to merge the branch, then register a SemVer version/promote to a stage.

  1. if that's a manual process, you should be OK using GTO and Studio MR for that
  2. if that's automated, you'd be setting up a CI job that (optionally) runs some checks and then registers/promotes
  3. as a special case of the previous one, if you're promoting each commit merged to main, then either:
    i. you're fine without any promotions/registrations (because the answer to "what's in prod?" will always be "what's in HEAD of main"),
    ii. or you do need to register/promote; if you have more than a single model in the repo, you could do that in CI (run CI on merge into main to check which type: model files were updated, and register a new version)
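
A rough sketch of the CI check from A.3.ii, just to make the idea concrete. Everything here is an assumption for illustration: the `artifacts` mapping (name -> type and path) is a hypothetical stand-in for however the repo declares type: model artifacts, and `changed_models` / `bump_patch` are made-up helper names, not GTO or DVC APIs. The list of changed files would come from the CI itself, for example from `git diff --name-only` between the merge commit and its parent.

```python
from typing import Dict, List


def changed_models(artifacts: Dict[str, Dict[str, str]], changed_files: List[str]) -> List[str]:
    """Return names of artifacts of type 'model' whose file is among the changed files."""
    changed = set(changed_files)
    return sorted(
        name
        for name, meta in artifacts.items()
        if meta.get("type") == "model" and meta.get("path") in changed
    )


def bump_patch(version: str) -> str:
    """Bump the patch part of a plain vMAJOR.MINOR.PATCH version, e.g. 'v1.0.3' -> 'v1.0.4'."""
    major, minor, patch = version.lstrip("v").split(".")
    return f"v{major}.{minor}.{int(patch) + 1}"


if __name__ == "__main__":
    # Hypothetical artifact declarations and a hypothetical diff of the merge commit.
    artifacts = {
        "churn": {"type": "model", "path": "models/churn.pkl"},
        "fraud": {"type": "model", "path": "models/fraud.pkl"},
        "train": {"type": "dataset", "path": "data/train.csv"},
    }
    diff = ["models/churn.pkl", "data/train.csv", "README.md"]
    for name in changed_models(artifacts, diff):
        # The actual registration step (e.g. via GTO) would go here.
        print(f"would register a new version of {name}")
```

Only models whose files actually changed get a new version, so a repo with several models doesn't bump all of them on every merge.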

B. You may want to deploy the model to preview env before merging (can do register/promote afterwards as well):

  1. this may happen without assigning a stage (you have CI running on each new commit to the feature branch that re-deploys the model and runs some tests, similar to what happens in dvc.org with feature branches). Then you probably don't need stage assignments, since HEAD of your feature branch is what's assigned to preview
  2. this may happen manually, if you don't have an automatic re-deploy on each commit but want to trigger it by hand. Then we may suggest registering a new version and promoting it. After the merge, the history will keep this and it won't be garbage collected, which I assume is fine. The tricky part is that right now you can't register the same model SemVer twice for different commits. That means that if you registered a few versions in your feature branch (v1.0.0, v1.0.1, v1.0.3), you can't register the same versions again once you merge the branch. The limitation seems reasonable to me, since the last commit in the feature branch may not have any specific version registered at all. Instead of treating this as a GTO limitation and allowing the same version to be registered on different commits, I'd suggest treating it as a Git limitation that needs to be taken into account. The suggested way to work here would be to register "candidate" versions instead (v1.0.0-rc0, v1.0.0-rc1, v1.0.0-rc2), which the SemVer standard supports as pre-release versions.
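
To illustrate the candidate-version idea from B.2, here is a small sketch of how the next "rc" version could be picked on a feature branch while keeping the final version free for main. The helper names (`next_candidate`, `can_register_final`) and the plain list of already registered version strings are hypothetical; this only shows the naming logic and does not check whether GTO accepts these strings as-is.

```python
import re
from typing import Iterable

# Matches candidate versions such as 'v1.0.0-rc2'.
RC_PATTERN = re.compile(r"^v\d+\.\d+\.\d+-rc(\d+)$")


def next_candidate(target: str, registered: Iterable[str]) -> str:
    """Return the next unused 'rcN' pre-release of `target` (e.g. 'v1.0.0')."""
    taken = []
    for version in registered:
        match = RC_PATTERN.match(version)
        if match and version.startswith(f"{target}-rc"):
            taken.append(int(match.group(1)))
    return f"{target}-rc{max(taken) + 1 if taken else 0}"


def can_register_final(target: str, registered: Iterable[str]) -> bool:
    """The final version is still available as long as nobody registered it already."""
    return target not in set(registered)


if __name__ == "__main__":
    # Versions registered so far on the feature branch (hypothetical).
    registered = ["v1.0.0-rc0", "v1.0.0-rc1"]
    print(next_candidate("v1.0.0", registered))
    print(can_register_final("v1.0.0", registered))
```

With this scheme the feature branch only ever consumes rcN names, so v1.0.0 itself can still be registered on the merge commit in main.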

cc @dberenbaum
