[wanda] refactor docker_cmd to be container_cmd #329

Draft
andrew-anyscale wants to merge 2 commits into main from andrew/revup/main/refactor-for-podman

Conversation

@andrew-anyscale
Contributor

Refactoring common interface into container_cmd so we can
use podman or whatever backends we like.

Subsequent PR will add podman_cmd once the interface looks
clean

Topic: refactor-for-podman

Signed-off-by: andrew <andrew@anyscale.com>

@andrew-anyscale
Contributor Author

Reviews in this chain:
#329 [wanda] refactor docker_cmd to be container_cmd
 └#330 [wanda] add podman backend

@andrew-anyscale
Contributor Author

andrew-anyscale commented Dec 19, 2025

#  head      base      diff  date            summary
0  cbee1681  77680285  diff  Dec 19 1:33 AM  9 files changed, 775 insertions(+), 226 deletions(-)
1  b521b0f0  77680285  diff  Dec 19 2:26 AM  2 files changed, 9 insertions(+), 68 deletions(-)

@gemini-code-assist
Contributor

Summary of Changes

Hello @andrew-anyscale, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant refactoring to abstract the underlying container runtime commands. By creating a generic ContainerCmd interface and a baseContainerCmd implementation, the system is now decoupled from a specific container technology like Docker. This foundational change paves the way for easily integrating alternative container runtimes, such as Podman, in future updates, enhancing flexibility and portability without requiring extensive modifications to the core image building logic.

Highlights

  • New ContainerCmd Interface: Introduced a new ContainerCmd interface in wanda/container_cmd.go to abstract common container operations like building, pulling, inspecting, and tagging images. This interface will allow for interchangeable container runtime backends (see the sketch after this list).
  • Base Container Command Implementation: Added a baseContainerCmd struct that provides common functionality for executing shell commands, pulling images, inspecting image information, and tagging images. This struct is embedded by specific container runtime implementations.
  • Docker Command Refactoring: The existing dockerCmd in wanda/docker_cmd.go has been refactored to embed the new baseContainerCmd and implement the ContainerCmd interface. This significantly reduces redundant code and centralizes common logic.
  • Forge Configuration Update: The ForgeConfig struct in wanda/forge_config.go now includes ContainerRuntime and ContainerBin fields, replacing the Docker-specific configuration. A newContainerCmd() method was added to ForgeConfig to instantiate the correct ContainerCmd implementation based on the configured runtime.
  • Updated Forge Logic: The Forge struct in wanda/forge.go has been updated to use the ContainerCmd interface for all container-related operations, ensuring compatibility with different runtimes. The newDockerCmd method was removed from Forge.
  • Comprehensive Testing for ContainerCmd: A new test file wanda/container_cmd_test.go was added, providing extensive tests for the ContainerCmd interface across different container runtimes (currently Docker and a placeholder for Podman). These tests cover image building, layer verification, environment variables, and label inspection.
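
To make the abstraction concrete, here is a minimal sketch of what such an interface could look like. This is illustrative only: the build signature is taken from the diff quoted later in this review, while pullImage, inspectImage, tagImage, and imageInfo are assumed names that may not match the actual code in wanda/container_cmd.go.

// Hypothetical sketch of the runtime-agnostic interface; names other than build are assumptions.
type ContainerCmd interface {
    // build runs a container image build for the given input.
    build(in *buildInput, core *buildInputCore, hints *buildInputHints) error

    // pullImage fetches an image from a remote registry.
    pullImage(ref string) error

    // inspectImage returns metadata for a local image, or (nil, nil) if the image is absent.
    inspectImage(ref string) (*imageInfo, error)

    // tagImage applies an additional tag to an existing local image.
    tagImage(src, dst string) error
}

A concrete runtime such as dockerCmd would embed baseContainerCmd for the shared behavior and override only the methods that need runtime-specific handling.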



@gemini-code-assist (bot) left a comment


Code Review

This pull request is a solid step towards abstracting the container runtime by introducing a ContainerCmd interface and a baseContainerCmd. This will make it easier to add support for other runtimes like Podman in the future. The changes are well-structured, and new tests have been added to cover the new interface.

My review includes a few suggestions to improve the new abstraction:

  • I've pointed out a significant code duplication in the build method between dockerCmd and baseContainerCmd that should be addressed to improve maintainability.
  • There's an inconsistency in how container command instances are created within the Build method, which could lead to state issues.
  • I also noticed some Docker-specific logic in the generic baseContainerCmd that should be moved to the dockerCmd implementation to keep the base class truly generic.

Addressing these points will make the abstraction cleaner and more robust.

}

// build overrides baseContainerCmd.build to add --progress=plain for BuildKit.
func (c *dockerCmd) build(in *buildInput, core *buildInputCore, hints *buildInputHints) error {


Severity: high

The build method in dockerCmd is almost a complete copy of the build method in baseContainerCmd. This creates significant code duplication. The main difference is adding the --progress=plain flag.

Consider refactoring to eliminate this duplication. One approach is to make baseContainerCmd.build more extensible. For instance, you could have dockerCmd.build prepare Docker-specific arguments and then call a common, unexported function from baseContainerCmd that contains the shared build logic.

This would make the code more maintainable and align better with the goal of this refactoring.
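
One possible shape for that split, as a rough sketch only (buildWithArgs is a hypothetical helper name, not code from this PR):

// Hypothetical sketch: dockerCmd contributes only Docker-specific flags and
// delegates the shared build logic to an unexported helper on baseContainerCmd.
func (c *dockerCmd) build(in *buildInput, core *buildInputCore, hints *buildInputHints) error {
    // --progress=plain is BuildKit-specific; other runtimes would pass their own extras (or none).
    extraArgs := []string{"--progress=plain"}
    return c.buildWithArgs(in, core, hints, extraArgs)
}

buildWithArgs would then hold the logic currently duplicated between the two build methods, appending any runtime-specific extraArgs to the generated command line.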

Comment on lines +84 to +90
if exitErr, ok := err.(*exec.ExitError); ok {
    // Docker returns 1
    code := exitErr.ExitCode()
    if code == 1 {
        return nil, nil
    }
}


Severity: medium

This error handling logic is specific to the Docker CLI, which returns exit code 1 when an image is not found. Placing this in baseContainerCmd leaks implementation details of one container runtime into the generic base class. Other runtimes like Podman might behave differently.

To keep baseContainerCmd generic, this Docker-specific logic should be moved to dockerCmd. One way to do this is for dockerCmd to override inspectImage, run the command itself, and handle the exit code. While this might introduce some duplication, it would correctly encapsulate the runtime-specific behavior.
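
A rough sketch of that encapsulation, with runInspect and parseImageInfo as hypothetical helper names rather than code from this PR:

// Hypothetical sketch: dockerCmd interprets Docker's "image not found" exit code itself,
// so baseContainerCmd no longer needs to know about it.
func (c *dockerCmd) inspectImage(ref string) (*imageInfo, error) {
    out, err := c.runInspect(ref) // shared helper that execs the inspect command
    if err != nil {
        if exitErr, ok := err.(*exec.ExitError); ok && exitErr.ExitCode() == 1 {
            // The Docker CLI exits with 1 when the image does not exist locally.
            return nil, nil
        }
        return nil, err
    }
    return parseImageInfo(out)
}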

Comment on lines +247 to +253
var newCmd ContainerCmd
if f.config.ContainerRuntime == RuntimeDocker {
    // Always use a new dockerCmd so that it can run in its own environment.
    newCmd = f.config.newContainerCmd()
} else {
    newCmd = f.containerCmd
}


Severity: medium

The logic for creating a ContainerCmd instance is inconsistent. For RuntimeDocker, a new command instance is created for each build, but for other runtimes, the existing f.containerCmd instance is reused.

The comment on line 249 suggests creating a new instance is desirable to ensure a clean environment. If that's the case, this should be done for all container runtimes to prevent potential state-related bugs (e.g., from a modified workDir).

I suggest always creating a new command instance for each build, regardless of the runtime.

	// Now we can build the image.
	// Always use a new containerCmd so that it can run in its own environment.
	newCmd := f.config.newContainerCmd()

@gitar-bot

gitar-bot commented Dec 19, 2025

Code Review ⚠️ Changes requested

Well-structured refactoring to abstract container runtime, but newContainerCmd() ignores the runtime setting and always creates Docker.

⚠️ Bug: newContainerCmd ignores ContainerRuntime; always creates Docker

📄 wanda/forge_config.go:38

The ForgeConfig.newContainerCmd() method (in forge_config.go) always creates a DockerCmd regardless of the ContainerRuntime setting. It ignores RuntimePodman entirely, making the Podman runtime support incomplete.

Current code:

func (c *ForgeConfig) newContainerCmd() ContainerCmd {
    return NewDockerCmd(&DockerCmdConfig{
        Bin:             c.ContainerBin,
        UseLegacyEngine: runtime.GOOS == "windows",
    })
}

Expected behavior: Should check c.ContainerRuntime and create the appropriate command:

func (c *ForgeConfig) newContainerCmd() ContainerCmd {
    switch c.ContainerRuntime {
    case RuntimePodman:
        return NewPodmanCmd(&PodmanCmdConfig{Bin: c.ContainerBin})
    default:
        return NewDockerCmd(&DockerCmdConfig{
            Bin:             c.ContainerBin,
            UseLegacyEngine: runtime.GOOS == "windows",
        })
    }
}

This is the main purpose of the refactoring, but it's incomplete. Users specifying RuntimePodman will silently get Docker instead.

More details 💡 2 suggestions
💡 Code Quality: Hardcoded podmanAvailable() always returns false

📄 wanda/container_cmd_test.go:24

The test file container_cmd_test.go has a hardcoded podmanAvailable() function that always returns false:

func podmanAvailable() bool {
    return false
}

While the dockerAvailable() function properly checks for the binary:

func dockerAvailable() bool {
    _, err := exec.LookPath("docker")
    return err == nil
}

This looks like intentional placeholder code (perhaps Podman support is WIP), but it would be better to either:

  1. Add a proper check similar to dockerAvailable(): _, err := exec.LookPath("podman"); return err == nil
  2. Add a comment explaining why it's disabled (e.g., "// TODO: Enable when Podman support is complete")
💡 Code Quality: Inconsistent new instance behavior between Docker and other runtimes

📄 wanda/forge.go:247

In forge.go, the build process creates a new container command only for Docker but reuses the existing one for other runtimes:

var newCmd ContainerCmd
if f.config.ContainerRuntime == RuntimeDocker {
    // Always use a new dockerCmd so that it can run in its own environment.
    newCmd = f.config.newContainerCmd()
} else {
    newCmd = f.containerCmd
}

The comment says Docker needs a new instance "so that it can run in its own environment", but the same logic is not applied to other runtimes. This inconsistency could lead to subtle issues if Podman also benefits from isolated instances, or it may indicate the Docker path has unnecessary complexity.

Consider either:

  1. Always creating a new instance for all runtimes (if the isolation is genuinely needed)
  2. Documenting why Docker specifically needs this but others don't
  3. Refactoring to always reuse if the new instance isn't actually necessary

What Works Well

  • Clean interface abstraction (ContainerCmd) that properly separates concerns and enables future runtime support
  • Good use of composition with baseContainerCmd to share common functionality
  • Comprehensive test coverage with parameterized tests that iterate over container runtimes
  • Proper sorting of maps for deterministic behavior in build arguments and FROM image handling
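
On the deterministic-ordering point: Go map iteration order is randomized, so the usual idiom is to sort the keys before iterating. Roughly (illustrative only, using assumed variable names rather than the PR's exact code):

// Illustrative sketch: emit --build-arg flags in sorted key order so the
// generated build command line is deterministic across runs.
keys := make([]string, 0, len(buildArgs))
for k := range buildArgs {
    keys = append(keys, k)
}
sort.Strings(keys)
for _, k := range keys {
    args = append(args, "--build-arg", k+"="+buildArgs[k])
}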

Recommendations

  • The key issue is that ForgeConfig.newContainerCmd() always returns a Docker command regardless of the ContainerRuntime setting - this needs to be fixed for Podman support to work
  • Consider implementing proper Podman availability check in tests rather than hardcoding return false
  • Clarify the reasoning for creating new instances only for Docker vs. reusing for other runtimes in the build process


@aslonnie (Collaborator) left a comment


Before we commit to this, could you show a proof of concept that it can build ray wheels and images without docker, both on buildkite and within a ray image?

The docker daemon is supported out of the box on the buildkite AMI, but podman is not.

One concern is that only some more recent versions of podman support the multi-line HEREDOC command syntax, which we use in ray wheels and image building. I am also not sure about multi-stage build support.

@andrew-anyscale
Contributor Author

> Before we commit to this, could you show a proof of concept that it can build ray wheels and images without docker, both on buildkite and within a ray image?

Totally agree! The goal of this PR was to refactor to make room for other potential backends, be that podman or another appropriate fit should podman not be it.

> One concern is that only some more recent versions of podman support the multi-line HEREDOC command syntax, which we use in ray wheels and image building. I am also not sure about multi-stage build support.

I'm dogfooding the podman implementation now, and just started running into this with the HEREDOC command syntax. I'm planning to check out other backends as well to see if any fit better.

andrew-anyscale marked this pull request as draft on December 22, 2025 at 16:30