CI flake: migrating out of a VM doesn't always ensure its VMM resources are cleaned up #648

Example victim run (there have been others): https://github.com/oxidecomputer/propolis/pull/646/checks?check_run_id=21668964010

Every test in this run passed, but the source VM in the `migration_smoke` test leaked its VMM handle, which caused the overall run to fail.

The test logs show that Propolis returned a 404 when the test framework tried to stop this VM. This is expected from a migration source, since the VM controller is torn down once a migration succeeds:

```
phd-runner: [VM CLEANUP - EVENT] error stopping VM to move it to Destroyed
    error = Error Response: status: 404 Not Found; headers: {"content-type": "application/json", "x-request-id": "9ed33d45-46a7-4994-9d91-254dea28b142", "content-length": "84", "date": "Fri, 16 Feb 2024 20:24:02 GMT"}; value: Error { error_code: None, message: "Not Found", request_id: "9ed33d45-46a7-4994-9d91-254dea28b142" }
    file = phd-tests/framework/src/test_vm/mod.rs
    line = 917
    path = phd_tests::migrate::smoke_test
    target = phd_framework::test_vm
    vm = migration_smoke
    vm_id = c51ddb74-4366-4c94-8aa4-740ec0bc72b3
```
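To make the "expected 404" reasoning concrete, here is a minimal sketch of how a cleanup path could tell the benign 404 from a migration source apart from a real stop failure. Everything in it is hypothetical: `StopOutcome`, `classify_stop_response`, and the `migrated_out` flag are illustrative names, not part of the PHD framework or the Propolis API.

```rust
/// Hypothetical result of asking a Propolis server to stop a VM during
/// test cleanup. These names are illustrative, not the PHD framework's.
#[derive(Debug, PartialEq, Eq)]
enum StopOutcome {
    /// The server accepted the stop request.
    Stopped,
    /// The server returned 404 because the VM already migrated out and its
    /// controller was torn down.
    AlreadyTornDown,
    /// Any other response, carrying the HTTP status code.
    Failed(u16),
}

/// Classify the HTTP status of a stop request.
///
/// `migrated_out` is an assumed flag recording whether this VM was the
/// source of a successful migration.
fn classify_stop_response(status: u16, migrated_out: bool) -> StopOutcome {
    match (status, migrated_out) {
        (200..=299, _) => StopOutcome::Stopped,
        // A 404 is only benign when the VM is known to have migrated out.
        (404, true) => StopOutcome::AlreadyTornDown,
        (other, _) => StopOutcome::Failed(other),
    }
}
```

This only classifies the response. Even in the `AlreadyTornDown` case, the VMM handle still has to be released by some other path, which is the part that flakes here.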

This is not a production-impacting problem because in a real control plane, Propolis runs in a zone, and migrating out of a Propolis will (or should) direct the control plane to destroy the zone, which will clean up all remaining VMM resources. Still, this is an annoying flake and we should fix it.
