Implement Standalone SOCI Image Convert mode for SOCI index creation without containerd#1870
Implement Standalone SOCI Image Convert mode for SOCI index creation without containerd#1870prafgup wants to merge 1 commit into
Conversation
|
Hey SOCI Devs 👋 Also happy to followup with any changes (in this or a followup PR) 😄 |
|
|
||
| orasStore, err := oci.New(tmpDir) | ||
| if err != nil { | ||
| cleanupTempDir(tmpDir) |
There was a problem hiding this comment.
Can we defer this instead of doing it in every error branch? You probably want to cleanup on success as well?
There was a problem hiding this comment.
I had a look earlier into refactoring this and figured that it would require us having named returns and then checking error in the defer func.
Overall using named returns make things somewhat complex to read, thus I let it be like this where we would cleanup on specific errors.
We do always cleanup in success scenarios here upstream - https://github.com/prafgup/soci-snapshotter/blob/83db223b6161ecde510d4a437d432eebb81baf35/cmd/soci/commands/convert.go#L207 lemme check if we can remove this manual cleanup then in case of errors and just move imageInfo.Cleanup() to be before error.
There was a problem hiding this comment.
So in case of error upstream runStandaloneConvert wont have imageInfo to trigger a cleanup, thus I think its fine to keep it as is. Upstream always makes sure to cleanup in case DownloadImageFromRegistry is successful, and in case of failure in DownloadImageFromRegistry we would manually cleanup before returning a error.
There was a problem hiding this comment.
I'm not really a fan of this pattern since we have to be sure every single error out handles this error which is not very Go-like.
I've seen a couple of patterns that I think would flow a little nicer:
- Do initial setup for the tempdir, then have a separate function (e.g.
downloadImageFromRegistry) that returns an error. Call that func fromDownloadImageFromRegistryand on error call the cleanup func. This is probably the least error-prone way to do this. - Declare error var after setting up the dir, then defer a func that calls the cleanup func if
err != nil. I generally prefer this option less as if you declare an error within a nested function and forget to refer to the global err var you can inadvertently not call cleanup.
Signed-off-by: Praful Gupta <prafulgupta6@gmail.com>
| fmt.Printf("Standalone mode: downloading image %s\n", srcRef) | ||
| } | ||
|
|
||
| imageInfo, err := internal.DownloadImageFromRegistry(ctx, cmd, srcRef) |
There was a problem hiding this comment.
seems we are adding a operation which is not strictly done by convert. Soci doesnt have a pull command probably because it assumes ctr/ other cli are providing the image is there.
Overall seems like architecturally different to the purpose of convert. Any reason we cannot pull image to soci content store and do the convert and then push. Probably adds complexity vs being architecturally consistent.
There was a problem hiding this comment.
Right, if we were to think of this feature as of multiple steps (pull + convert + push) it would be a bit overly complex where -
- We would have to maintain a persistent local content store across command invocations, essentially creating a replacement for containerd store.
- Let a lot of abstractions from
StandaloneImageInfobe handled by user across commands. - This feature is usually going to be used by CI/CD jobs, where having a single command is much simpler for automation.
I believe having a bit of tradeoff of consistency for simplicity would be the better choice here, as we don't see much extension of standalone convert in future atleast architecturally.
On a side note we do currently have deprecated push and create command which does not support V2 Manifests - https://github.com/awslabs/soci-snapshotter/blob/main/docs/cli-usage.md#soci-create and seems like we focused more on the convert command for V2 Manifests, thus I believe its the right place for standalone convert too.
What do you think? 😄
There was a problem hiding this comment.
Yeah i agree with the cost but doing a push and pull in convert seems a bit hacky. Having a single command is ok as long as it is fundamentally doing a specific operation it was designed for. Still think having a pull/convert and push command here makes better architectural sense. For daemon less services we can use the soci content store.
@sondavidb what do you think?
There was a problem hiding this comment.
Hey @Shubhranshu153 @sondavidb :)
Requesting an update here regarding this thread for the path forward ^ (this is blocking some development on our end 😓)
One of the other possible option that I can think of apart from keeping this as a flag in convert instead, is that we can have another independent command something like standalone-convert that would do this flow of convert --standalone.
But I do believe that with proper documentation convert --standalone does seem to be the right place for this logic (of converting a image upstream) instead of expanding it across multiple pull/convert/push commands (where push has already been deprecated for V2 use).
There was a problem hiding this comment.
if we really dont want to architect it entirely one way can be to introduce to introduce a --pull and use the soci store but push i think still needs to be a separate cli command. And push is not deprecated rather its not extended to v2 migration. Having pull/convert/push in a single cli is inherently changing the cmd purpose.
There was a problem hiding this comment.
@sondavidb thanks for the review, I glanced over other commends and would address them later. But just want to confirm this main flow.
You want to pull with your CLI command of choice, convert it to a SOCI v2 image, then push the image, which will in turn push the SOCI index with it.
Just to confirm the wording, are you suggesting something like -
crane pull nginx:latest IMG_TAR_PATH
soci convert --standalone IMG_TAR_PATH IMG_OUTPUT_TAR_PATH
crane push IMG_OUTPUT_TAR_PATH my-reg/my-image:my-tag
I believe this makes sense as well. A user can pull the image as tar in some path and we would load that up and index it and store as output. Also in this case we won't have to do any registry authentications as we are neither pulling nor pushing.
Would you prefer this logic to be made explicit by user sending in --standalone flag in create command? Or should we auto detect that the path given (input and output) is from disk and if a tar file exist there proceed with conversion without containerd?
There was a problem hiding this comment.
@david for v2 don't we push the entire image with the index together. Here standalone means not using containerd rather doing everything using the soci cli
Also the delete option needs to be created but if soci store is used to pull convert and push images as we need to clean it up also. Or we create tmp store abstraction. (Still I think a delete is needed)
There was a problem hiding this comment.
I really like the idea of using some other cli to pull and push images if that's an option @prafgup
There was a problem hiding this comment.
Yeah, what you said would be my ideal scenario @prafgup. It is honestly probably the least intrusive solution 🙂
There was a problem hiding this comment.
Hey @sondavidb @Shubhranshu153 ,
I have created another PR here - #1881
Fundamentally its a bit different from this one thus thought of closing this PR and creating a new clean one.
Would appreciate a Review on the same 😄 (I'll update the docs in a followup once that is merged. Also happy to resolve any small nits and refactoring as followups there as well as Update the docs in followups)
| return nil | ||
| } | ||
|
|
||
| func parseBuilderOptions(cmd *cli.Command) ([]soci.BuilderOption, error) { |
There was a problem hiding this comment.
This is also used for create, we should probably do some deduping here. Happy to take that as a followup though.
|
|
||
| orasStore, err := oci.New(tmpDir) | ||
| if err != nil { | ||
| cleanupTempDir(tmpDir) |
There was a problem hiding this comment.
I'm not really a fan of this pattern since we have to be sure every single error out handles this error which is not very Go-like.
I've seen a couple of patterns that I think would flow a little nicer:
- Do initial setup for the tempdir, then have a separate function (e.g.
downloadImageFromRegistry) that returns an error. Call that func fromDownloadImageFromRegistryand on error call the cleanup func. This is probably the least error-prone way to do this. - Declare error var after setting up the dir, then defer a func that calls the cleanup func if
err != nil. I generally prefer this option less as if you declare an error within a nested function and forget to refer to the global err var you can inadvertently not call cleanup.
| return nil | ||
| } | ||
|
|
||
| func newRemoteRepo(cmd *cli.Command, refspec reference.Spec) (*remote.Repository, error) { |
There was a problem hiding this comment.
Is it possible to reuse anything from the fs package? (I haven't checked but I'm wary of re-duplicating logic, especially when it comes to registry authentication.)
| }, | ||
| } | ||
|
|
||
| if cmd.Bool("skip-verify") { |
There was a problem hiding this comment.
This seems inverted?
| if cmd.Bool("skip-verify") { | |
| if !cmd.Bool("skip-verify") { |
| ) | ||
|
|
||
| func TestStandaloneConvertBasic(t *testing.T) { | ||
| t.Parallel() |
There was a problem hiding this comment.
nit: Little torn on this but OK with it. Our integration tests run a separate container for each test (newShellWithRegistry I believe composes a container) so parallelizing too much might be a lot for the runner.
NEW PR - #1881
Issue #, if available:
Closes #1057
Description of changes:
This PR adds standalone mode to the
soci convertcommand, enabling SOCI index creation without requiring a running containerd daemon. This is particularly useful for CI/CD pipelines and environments where running containerd is impractical without privileged mode.Key Features
--userflag, Docker config, and environment variables--skip-verifyand--plain-httpflags--platformand--all-platformsflags--verboseflag for detailed progress informationUsage Example
Testing performed:
- Basic conversion with validation
- Specific platform selection (--platform)
- Authentication with --user flag
- Error handling (nonexistent images, missing arguments)
- Idempotency verification
lima sudo GO_TEST_FLAGS="-run TestStandaloneConvert" make integrationlima sudo make testIntegration tests -
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.