Skip to content

refactor: Refactor snapshot/migrate functionality to rename to "save" #1555

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 27 commits into
base: master
Choose a base branch
from

Conversation

sdelliot
Copy link
Contributor

This PR accomplishes the primary task of #1540 with the refactoring of vm snapshot and vm migrate. This includes the following:

Breaking Changes

  • Rename vm migrate -> vm save
  • Rename ns snapshot -> ns save
  • Rename vm configure migration -> vm configure state

Additional Changes

  • Refactoring the code to replace most references to migrate/snapshot. There are some exceptions including 1) where snapshot is used as in disk snapshot and 2) where there were only internal references but could conflict with "state" where state could be overloaded (e.g., vm state (BUILDING, RUNNING,...) or the saved VM memory).
  • Fixed multiple bugs where the image that is "restored" would not work properly if there was a backing image. It also did not account for extra disk parameters (e.g., "virtio,writeback"
  • Adding namespace queueing to the ns save output.

@activeshadow
Copy link
Contributor

@sdelliot is it possible to leave the old commands in place for some amount of time (perhaps marked as deprecated)? Renaming the commands will break some functionality in phenix, so if they can't be left in for a little bit then we might want to coordinate some activities.

Thoughts?

/cc @jacdavi @glattercj

@sdelliot
Copy link
Contributor Author

@sdelliot is it possible to leave the old commands in place for some amount of time (perhaps marked as deprecated)? Renaming the commands will break some functionality in phenix, so if they can't be left in for a little bit then we might want to coordinate some activities.

That totally makes sense to me. I adjusted it so that the old commands still exist but point to the new functionality. So there should no longer be API changes but the underlying functionality will still be modified. I tested it out and it looks like it works as expected.

Copy link
Contributor

@jacdavi jacdavi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't had a chance to run things yet, but did a quick read through.

One general question, compared to #1540 we're losing the functionality of saving just the disk of a VM. I think that's what we had agreed on previously, but I'm just wondering again if that feature was useful.

log.Info("Copied existing snapshot file from %s to %s", src, dst)
return nil
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused by this section, but may just be getting things mixed up.

If a user has disk B backed by disk A and calls "snapshot B C" my expectation would be a backing chain of C -> B -> A, but it seems like this will produce C(really A), B -> A which breaks the chain. Is there a reason we want that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this section was largely put in for two reasons.

  1. The chain of snapshots C -> B -> A was not working correctly. I haven't tested now that fix error when rebasing onto no backing image #1557 was merged, but that may have fixed the issue. It also could have been that one of the intermediate images, B wasn't in the "files" directory. I'll test with the latest changes and see what happens.
  2. The secondary reason is that having a chain of snapshots makes it more difficult for a user to actually capture (save off/compress/etc.) the entire chain. Suppose you have a chain B->A where A is the backing image and B is the running version. Then you snapshot that, we save off the state of B. Then when you start the experiment again, this method is called and creates a new snapshot C which has the chain C->B->A. With this code, rather than creating a new C we will take a copy of B. So that if you snapshot again the original case would end up with C->B->A , but this version will end up with B'->A where B' is C+B. Basically, instead of creating continually longer chains, you never end up with a chain larger than two.

@sdelliot
Copy link
Contributor Author

sdelliot commented Apr 3, 2025

One general question, compared to #1540 we're losing the functionality of saving just the disk of a VM. I think that's what we had agreed on previously, but I'm just wondering again if that feature was useful.

That's correct. How it was previously implemented, I believe the VM has to be paused to save off the disk, assuming that's correct, there is no loss of functionality as users can just ignore the memory state. If we don't pause a VM and instead used a live migration of just the disk then there would be some loss of functionality.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants