Fix volume plugin lazy init#28877
Conversation
056fba8 to
4795354
Compare
| // Create the libpod alive file (runtime already initialized). | ||
| init := podmanTest.Podman([]string{"volume", "ls", "-q"}) | ||
| init.WaitWithDefaultTimeout() | ||
| Expect(init).Should(ExitCleanly()) |
There was a problem hiding this comment.
that does nothing since the prior commands already did this
|
|
||
| alivePath := filepath.Join(podmanTest.TmpDir, "alive") | ||
| err = os.Remove(alivePath) | ||
| Expect(err).ToNot(HaveOccurred()) |
There was a problem hiding this comment.
That is somewhat dangerous and should need at least comments why this is done
| netName := "refresh-net-" + stringid.GenerateRandomID() | ||
| session := podmanTest.Podman([]string{"--log-level=error", "network", "create", netName}) | ||
| session.WaitWithDefaultTimeout() | ||
| Expect(session.ExitCode()).To(Equal(0)) | ||
| Expect(session.ErrorToString()).NotTo(ContainSubstring("some functionality may not be available")) | ||
| Expect(session.ErrorToString()).NotTo(ContainSubstring("uses volume plugin")) |
There was a problem hiding this comment.
negative checks are not good and cover very little. The correct thing is to check for an empty stderr wich ExitCleanly does by default
Also do not create a new network, this will get leaked on errors. you can run any command here to trigger the refresh.
| inspect := podmanTest.Podman([]string{"volume", "inspect", volName}) | ||
| inspect.WaitWithDefaultTimeout() | ||
| Expect(inspect).To(ExitWithErrorRegex(125, "cannot inspect")) | ||
| Expect(inspect.ErrorToString()).NotTo(ContainSubstring("some functionality may not be available")) |
| init := podmanTest.Podman([]string{"volume", "ls", "-q"}) | ||
| init.WaitWithDefaultTimeout() | ||
| Expect(init).Should(ExitCleanly()) | ||
|
|
||
| podmanTest.StopContainer(ctrName) | ||
|
|
||
| ls := podmanTest.Podman([]string{"--log-level=error", "volume", "ls", "-q"}) | ||
| ls.WaitWithDefaultTimeout() | ||
| Expect(ls).To(ExitWithErrorRegex(125, "cannot inspect")) | ||
| Expect(ls.ErrorToString()).NotTo(ContainSubstring("some functionality may not be available")) |
There was a problem hiding this comment.
same thing here, and just combine this with the test case above to save code and time
|
|
||
| vol1 := "refreshVol1-" + stringid.GenerateRandomID() | ||
| vol2 := "refreshVol2-" + stringid.GenerateRandomID() | ||
| create1 := podmanTest.Podman([]string{"volume", "create", "--driver", pluginName, vol1}) |
There was a problem hiding this comment.
not for all the commands that should exited with error all new test code should be using PodmanExitCleanly instead to reduce boiler plate
| v.volumePlugin = sync.OnceValues(func() (*plugin.VolumePlugin, error) { | ||
| return p, nil | ||
| }) | ||
| } |
There was a problem hiding this comment.
That seems incorrect and hard to use correctly, it is at least not how sync.OnceValues is supposed to be used.
You assign a new function pointer to a struct field, calling these in parallel is thus not safe and also more complicated to use.
The proper way should be to not have either of these fucntions. When retrieving the volume from the db the volumePlugin field should already be set with the sync.OnceValues in place.
And then all callers need to just call v.volumePlugin()
|
Also please squash the commits, we prefer test and code in one commit. |
4795354 to
5f14dc3
Compare
|
Hi @Luap99 Thanks for the review. Made some respective changes based on your comments and squashed commits. Would very much appreciate another check of yours |
|
Sorry for the delay, I have not done a deep look but this looks about right on a quick skim. You have a merge conflict though so please rebase the PR to latest main to resolve that conflict |
Add e2e regression tests for refresh and lazy plugin use paths. Signed-off-by: Yoonseo Han <yooncer00@gmail.com>
5f14dc3 to
914e518
Compare
|
[NON-BLOCKING] Packit jobs failed. @containers/packit-build please check. Everyone else, feel free to ignore. |
|
/packit test |
|
No worries. Rebased onto latest main and fixed any conflicts. Some build fails that doesn't look related to this PR appeared. Mind taking a look when you have a moment? Thanks |
What this PR does
After reboot,
Runtime.refresh()loads all volumes from the database. Previously,finalizeVolumeattempted to contact the volume plugin for every plugin-backed volume and logged errors when the socket was not ready. This lead to volume unrelated commands to leave error logs during refrsh.This defers plugin initialization until a volume operation actually needs the plugin. Volume create still validates the plugin eagerly.
Fixes #28110
Checklist
Ensure you have completed the following checklist for your pull request to be reviewed:
commits. (
git commit -s). (If needed, usegit commit -s --amend). The author email must matchthe sign-off email address. See CONTRIBUTING.md
for more information.
Fixes: #00000in commit message (if applicable)make validatepr(format/lint checks)Noneif no user-facing changes)Does this PR introduce a user-facing change?
How is the code tested
Each test added in the e2e section tests the followign aspects
unrelated command after refresh does not log plugin errors for plugin volumes): Creatres plugin backed volumes in DB then forces refresh by deleting alive file. Runs network create and check if network succeeds and no plugin error logs appearsvolume inspect with stopped plugin errors without finalize spam): Creates plugin backed volume, stops the plugin and runs volume inspect. Check if instpect fails with correct error and no finalize spam appearsvolume ls with stopped plugin fails on inspect without finalize spam): Creates plugin backend volume, stops the plugin and runs volume ls. Check if ls fails with correct error and no finalize spam appears