-
Notifications
You must be signed in to change notification settings - Fork 8
feat: Allow using custom containers with pre-built dependencies #63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: Allow using custom containers with pre-built dependencies #63
Conversation
Switch to using UV instead of poetry/conda
Switch to using flit instead of setuptools
Rewrite tests to actually use mocks instead of still trying to create a real client
Add more logging to executor and actually implement containers
Add pre-commit to check common problems
📝 WalkthroughWalkthroughTransition CI/release from Poetry to uv, add pre-commit and .gitattributes, introduce a COS micromamba example, extend executor for container-aware execution with new settings and APIs, add focused unit tests, and remove mocked/true API test suites. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant S as Snakemake
participant E as GoogleBatchExecutor
participant W as CommandWriter
participant B as BatchServiceClient
S->>E: submit(job)
E->>E: is_container_job(job)?
alt Container job
E->>E: format_job_exec(job)
E->>W: get_writer(container-aware)
E->>B: create_job(spec with entrypoint/split commands)
else Non-container job
E->>W: get_writer(default)
E->>B: create_job(spec with setup+exec)
end
loop Poll status
E->>B: get_job(status)
B-->>E: status (RUNNING/SUCCEEDED/FAILED)
end
alt SUCCEEDED
E->>E: save logs
E-->>S: job complete
else FAILED
E->>E: save logs (error)
E-->>S: raise WorkflowError
end
note right of E: Key changes: container detection, entrypoint/split handling, new settings, and enhanced logging
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
snakemake_executor_plugin_googlebatch/executor.py (1)
585-593: Incorrect Timestamp field; events may be missedTimestamp uses seconds and nanos, not nanosecond. Compare a monotonic nanoseconds value to avoid missing events across seconds.
- self.logger.info(f"Job {jobid} has state {response.status.state.name}") - for event in response.status.status_events: - if not last_seen or event.event_time.nanosecond > last_seen: - self.logger.info(f"{event.type_}: {event.description}") - last_seen = event.event_time.nanosecond - - # Update last seen for next time (TODO not sure this is sticking) - j.aux["last_seen"] = last_seen + self.logger.info(f"Job {jobid} has state {response.status.state.name}") + for event in response.status.status_events: + evt = event.event_time + event_ts_ns = getattr(evt, "seconds", 0) * 1_000_000_000 + getattr(evt, "nanos", 0) + if last_seen is None or event_ts_ns > last_seen: + self.logger.info(f"{event.type_}: {event.description}") + last_seen = event_ts_ns + + # Update last seen for next time + j.aux["last_seen"] = last_seen
🧹 Nitpick comments (18)
.pre-commit-config.yaml (2)
8-16: Streamline mypy hook deps
- Drop numpy from additional_dependencies unless you import it solely for typing; numpy ships type hints. This will speed up hook environment creation.
- If needed, prefer pinning stubs explicitly (e.g., types-numpy) over pulling full numpy wheels into the hook env.
1-7: Consider adding basic hygiene hooksAdding end-of-file-fixer and trailing-whitespace helps keep diffs clean repo-wide.
Example snippet to append:
- repo: https://github.com/pre-commit/pre-commit-hooks rev: v5.0.0 hooks: - id: end-of-file-fixer - id: trailing-whitespaceAlso applies to: 27-30
.github/workflows/release-please.yml (1)
34-36: Optional: Drop dependency sync for publishing
uv build/uv publishdon't require project deps installed. You can skipuv sync --all-extrashere to speed up releases..github/workflows/ci_mocked_api.yaml (1)
35-40: Optional: Enforce coverage threshold or upload report
- Add
uv run coverage report -m --fail-under=<min>to fail on low coverage.- Or upload
.coverage/HTML as an artifact for inspection.example/hello-world-cos/Dockerfile (3)
15-17: Remove duplicate package and install bash only if needed
- git appears twice; drop the duplication.
- If keeping the entrypoint change to exec without bash, no action needed. If you keep
bashusage, addbashto apt packages.- && apt-get -qq install --no-install-recommends -y curl git ca-certificates git python3-pip build-essential gnupg + && apt-get -qq install --no-install-recommends -y curl git ca-certificates python3-pip build-essential gnupg +# If your entrypoint needs bash, add it: +# && apt-get -qq install --no-install-recommends -y bash
39-40: Ensure entrypoint is executableIf the executable bit isn’t set in git, make it explicit to avoid surprises.
COPY docker-entrypoint.sh / +RUN chmod +x /docker-entrypoint.sh ENTRYPOINT ["/docker-entrypoint.sh"]
35-36: Optional: Collapse environment creation to reduce layersUnless you rely on caching downloads across builds, a single create step is simpler.
-RUN micromamba env create --download-only --yes -f ./environment.yml -RUN micromamba env create --yes -f ./environment.yml +RUN micromamba env create --yes -f ./environment.ymlsnakemake_executor_plugin_googlebatch/snippet.py (1)
154-160: Avoid re-creating Jinja2 Environment per callConstructing a new Environment/PackageLoader on every template load is wasteful. Cache a module- or class-level Environment and reuse it.
- tmpEnv = jinja2.Environment( - loader=jinja2.PackageLoader( - "snakemake_executor_plugin_googlebatch", "snippets" - ) - ) - return tmpEnv.get_template(self.spec[name]) + if not hasattr(self, "_tmpl_env"): + self._tmpl_env = jinja2.Environment( + loader=jinja2.PackageLoader( + "snakemake_executor_plugin_googlebatch", "snippets" + ) + ) + return self._tmpl_env.get_template(self.spec[name])snakemake_executor_plugin_googlebatch/executor.py (5)
251-256: Avoid hard-coded container Snakefile pathPrefer workflow.workdir to build the container path.
- snakefile_path = "./Snakefile" - if self.is_container_job(job): - snakefile_path = "/tmp/workdir/Snakefile" + snakefile_path = "./Snakefile" + if self.is_container_job(job): + snakefile_path = os.path.join(self.workflow.workdir, "Snakefile")
559-561: Use workflow.workdir for container Snakefile pathMirror the writer’s path for consistency and configurability.
- if job and self.is_container_job(job): - return "/tmp/workdir/Snakefile" + if job and self.is_container_job(job): + return os.path.join(self.workflow.workdir, "Snakefile")
668-680: Consider timeout when waiting for deletionoperation.result() can block indefinitely. Pass a timeout and handle exceptions.
- response = operation.result() + response = operation.result(timeout=300)
309-314: Log the actual Snakemake commandYou print a header but not the command string. Include it for debugging.
- self.logger.info("\n🐍️ Snakemake Command:") + self.logger.info("\n🐍️ Snakemake Command:\n%s", run_command)
195-199: Derive container bind mount and workdir from workflow.workdir
Runnable.Container.volumes supports Docker-style bind mounts (e.g./host/path:/container/path), so your explicit"/tmp/workdir:/tmp/workdir"is valid. Replace the hardcoded"/tmp/workdir"in bothcontainer.volumesand the--workdiroption withworkflow.workdirto avoid magic paths.snakemake_executor_plugin_googlebatch/__init__.py (2)
53-61: Help text polish: typo and clarityMinor fix to “envtrypoint” and improve wording.
- "help": "If set to True, do not install snakemake or any dependencies. Google batch runs will drop into the envtrypoint and run commands assuming that snakemake and all dependencies are available and set up on the path.", + "help": "If True, do not install Snakemake or dependencies. Google Batch runs will use the container entrypoint and assume Snakemake and all dependencies are already on PATH.",
190-196: Help text typo“bath-centos” -> “batch-centos”.
- "help": "Boot disk image (e.g., batch-debian, bath-centos)", + "help": "Boot disk image (e.g., batch-debian, batch-centos)",tests/test_executor.py (3)
170-191: Add assertion for container Snakefile pathTest should ensure the container path is used in the command when container_dependencies_installed is True.
def test_format_job_exec_with_container_dependencies_installed(self, executor, job): # Case 1: Container job and dependencies installed - should use custom exec executor.workflow.executor_settings.container_dependencies_installed = True job.resources = {"googlebatch_image_family": "batch-cos-stable"} - with patch.object(RemoteExecutor, "format_job_exec") as mock_super_format: - executor.format_job_exec(job) - mock_super_format.assert_not_called() + with patch.object(RemoteExecutor, "format_job_exec") as mock_super_format: + cmd = executor.format_job_exec(job) + mock_super_format.assert_not_called() + assert os.path.join(executor.workflow.workdir, "Snakefile") in cmd
87-96: Unused fixture argumentexecutor_settings isn’t used in this fixture; drop it to silence ARG001.
-@pytest.fixture -def executor(workflow, executor_settings): +@pytest.fixture +def executor(workflow):
203-206: Explicitly assert no-container pathGood coverage; consider also asserting that format_job_exec falls back to super() when not a container job.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
pyproject.tomlis excluded by!pyproject.tomluv.lockis excluded by!**/*.lock
📒 Files selected for processing (16)
.gitattributes(1 hunks).github/workflows/ci_mocked_api.yaml(2 hunks).github/workflows/ci_true_api.yml(0 hunks).github/workflows/release-please.yml(1 hunks).pre-commit-config.yaml(1 hunks)example/hello-world-cos/Dockerfile(1 hunks)example/hello-world-cos/README.md(1 hunks)example/hello-world-cos/docker-entrypoint.sh(1 hunks)setup.cfg(0 hunks)snakemake_executor_plugin_googlebatch/__init__.py(4 hunks)snakemake_executor_plugin_googlebatch/executor.py(16 hunks)snakemake_executor_plugin_googlebatch/snippet.py(2 hunks)tests/__init__.py(2 hunks)tests/test_executor.py(1 hunks)tests/tests_mocked_api.py(0 hunks)tests/tests_true_api.py(0 hunks)
💤 Files with no reviewable changes (4)
- .github/workflows/ci_true_api.yml
- tests/tests_true_api.py
- setup.cfg
- tests/tests_mocked_api.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
snakemake_executor_plugin_googlebatch/__init__.pytests/test_executor.pysnakemake_executor_plugin_googlebatch/executor.pytests/__init__.pysnakemake_executor_plugin_googlebatch/snippet.py
🧬 Code graph analysis (4)
example/hello-world-cos/docker-entrypoint.sh (1)
snakemake_executor_plugin_googlebatch/command.py (1)
run(125-143)
tests/test_executor.py (2)
snakemake_executor_plugin_googlebatch/__init__.py (1)
ExecutorSettings(22-276)snakemake_executor_plugin_googlebatch/executor.py (17)
get_param(32-42)get_task_resources(44-57)get_labels(59-73)add_storage(81-98)generate_jobid(100-105)is_container_job(219-222)fix_job_name(264-269)project_parent(394-400)format_job_exec(107-158)get_command_writer(238-262)get_container(160-217)get_accelerators(515-544)get_boot_disk(493-513)get_network_policy(465-481)get_service_account(483-491)check_active_jobs(565-614)cancel_jobs(668-682)
snakemake_executor_plugin_googlebatch/executor.py (2)
tests/test_executor.py (2)
job(71-83)workflow(39-67)snakemake_executor_plugin_googlebatch/command.py (5)
get_writer(207-217)setup(145-149)setup(179-186)setup(194-195)setup(203-204)
tests/__init__.py (1)
snakemake_executor_plugin_googlebatch/__init__.py (1)
ExecutorSettings(22-276)
🪛 actionlint (1.7.8)
.github/workflows/release-please.yml
41-41: shellcheck reported issue in this script: SC2086:info:2:31: Double quote to prevent globbing and word splitting
(shellcheck)
41-41: shellcheck reported issue in this script: SC2086:info:2:57: Double quote to prevent globbing and word splitting
(shellcheck)
🪛 markdownlint-cli2 (0.18.1)
example/hello-world-cos/README.md
59-59: Link text should be descriptive
(MD059, descriptive-link-text)
🪛 Ruff (0.14.0)
tests/test_executor.py
41-41: Probable insecure usage of temporary file or directory: "/tmp/workdir"
(S108)
42-42: Probable insecure usage of temporary file or directory: "/tmp/workdir"
(S108)
87-87: Unused function argument: executor_settings
(ARG001)
285-285: Unused method argument: mock_get_writer
(ARG002)
292-292: Probable insecure usage of temporary file or directory: "/tmp/test.log"
(S108)
317-317: Unused method argument: mock_get_writer
(ARG002)
323-323: Probable insecure usage of temporary file or directory: "/tmp/test.log"
(S108)
387-387: Unused method argument: args
(ARG002)
388-388: Unused method argument: kwargs
(ARG002)
snakemake_executor_plugin_googlebatch/executor.py
30-30: Avoid specifying long messages outside the exception class
(TRY003)
195-195: Probable insecure usage of temporary file or directory: "/tmp/workdir:/tmp/workdir"
(S108)
253-253: Probable insecure usage of temporary file or directory: "/tmp/workdir/Snakefile"
(S108)
560-560: Probable insecure usage of temporary file or directory: "/tmp/workdir/Snakefile"
(S108)
🔇 Additional comments (3)
.gitattributes (1)
1-1: LGTMDisabling diffs for uv.lock will keep PRs clean during uv migrations.
snakemake_executor_plugin_googlebatch/snippet.py (1)
72-72: LGTM: set-comprehension for spec parsingSimpler and faster than list->set conversion. No behavior change.
tests/__init__.py (1)
34-36: Faster polling for tests looks goodReducing seconds_between_status_checks improves test throughput.
- **`.github/workflows/release-please.yml`**: Quoted `$PYPI_USERNAME` and `$PYPI_PASSWORD` in the `uv publish` command. This ensures that credentials with spaces or special characters are correctly passed as single arguments, improving the robustness of the CI release process.
- **`example/hello-world-cos/Dockerfile`**:
- Added `RUN` commands to create `/mnt/share`, set `mambauser` as owner, and grant group write permissions.
- Declared `/mnt/share` as a `VOLUME`.
- This explicitly configures a shared volume within the Docker image, enabling applications to use `/mnt/share` for data sharing, which is critical for environments like Google Batch that mount shared storage. This prevents runtime errors due to missing directories or permission issues.
- **`example/hello-world-cos/README.md`**:
- Improved the link text for OS images documentation.
- Ensured the file ends with a newline.
- These changes enhance documentation readability and adhere to common text file formatting best practices.
- In `snakemake_executor_plugin_googlebatch/executor.py`, the `is_container_job` method now explicitly converts the `image_family` parameter to an empty string if `self.get_param(job, "image_family")` returns `None`. - This change prevents a `TypeError` that would occur if `image_family` was `None` when attempting to use the `in` operator (`"batch-cos" in family`). - The effect is that the `is_container_job` method will now safely evaluate to `False` when no `image_family` is specified or it is `None`, rather than raising an error.
…LI args - `snakemake_executor_plugin_googlebatch/executor.py`: Modified the call to `self.get_snakefile()` within `GoogleBatchExecutor.run_job` to pass the `job` object as an argument. - This change ensures that the `get_snakefile` method receives the necessary `job` context to correctly determine and format the `--snakefile` CLI argument for the Snakemake command executed on Google Batch, addressing a previously missing argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (2)
example/hello-world-cos/README.md (1)
57-58: Use descriptive link text (MD059 still failing).Replace the generic “this link” anchor text with something that describes the destination (e.g., “Batch OS environment overview”) to satisfy MD059 and improve accessibility.
snakemake_executor_plugin_googlebatch/executor.py (1)
127-135: Critical: State mutation leak in format_job_exec.The mutation of
self.common_settings.auto_deploy_default_storage_providerat line 132 is never restored, causing the setting to remain disabled for all subsequent jobs. This breaks the auto-deploy functionality for non-container jobs that follow container jobs.Apply the previously suggested fix to wrap the mutation in a try/finally block:
- general_args = self.workflow.spawned_job_args_factory.general_args( - executor_common_settings=self.common_settings - ) - - # Disable installation - self.common_settings.auto_deploy_default_storage_provider = False - precommand = self.workflow.spawned_job_args_factory.precommand( - executor_common_settings=self.common_settings - ) + # Temporarily disable auto-deploy to avoid installation in container jobs + prev_auto = self.common_settings.auto_deploy_default_storage_provider + self.common_settings.auto_deploy_default_storage_provider = False + try: + general_args = self.workflow.spawned_job_args_factory.general_args( + executor_common_settings=self.common_settings + ) + precommand = self.workflow.spawned_job_args_factory.precommand( + executor_common_settings=self.common_settings + ) + finally: + self.common_settings.auto_deploy_default_storage_provider = prev_auto
🧹 Nitpick comments (2)
snakemake_executor_plugin_googlebatch/executor.py (2)
195-195: Consider making the workdir path configurable.The hardcoded
/tmp/workdirpath is flagged by static analysis (S108) and could be a security concern if the container environment varies. Consider making this path configurable through executor settings to improve flexibility and security posture.Example configuration:
+ container_workdir = self.get_param(job, "container_workdir") or "/tmp/workdir" - container.volumes = ["/tmp/workdir:/tmp/workdir"] + container.volumes = [f"{container_workdir}:{container_workdir}"]
287-314: Consider refactoring for clarity (optional).The logic flow for container usage could be more explicit. Consider extracting the container setup into a helper method to improve readability and testability.
Example refactor:
def _setup_task_runnable(self, job, writer): """Setup the task runnable, either as container or script.""" use_container = self.get_container(job, self.get_param(job, "entrypoint")) if use_container: return self._setup_container_runnable(use_container, writer) else: return self._setup_script_runnable(writer) def _setup_container_runnable(self, container, writer): """Setup container-based runnable.""" self.logger.info(f"container: {container}") runnable = batch_v1.Runnable() runnable.container = container snakefile_text = writer.write_snakefile() return runnable, snakefile_text def _setup_script_runnable(self, writer): """Setup script-based runnable.""" run_command = writer.run() self.logger.info("\n🐍️ Snakemake Command:") runnable = batch_v1.Runnable() runnable.script = batch_v1.Runnable.Script() runnable.script.text = run_command snakefile_text = writer.write_snakefile() return runnable, snakefile_text
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
.github/workflows/release-please.yml(1 hunks)example/hello-world-cos/Dockerfile(1 hunks)example/hello-world-cos/README.md(1 hunks)snakemake_executor_plugin_googlebatch/executor.py(16 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- example/hello-world-cos/Dockerfile
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
snakemake_executor_plugin_googlebatch/executor.py
🧬 Code graph analysis (1)
snakemake_executor_plugin_googlebatch/executor.py (2)
tests/test_executor.py (2)
job(71-83)workflow(39-67)snakemake_executor_plugin_googlebatch/command.py (5)
get_writer(207-217)setup(145-149)setup(179-186)setup(194-195)setup(203-204)
🪛 LanguageTool
example/hello-world-cos/README.md
[grammar] ~4-~4: There might be a mistake here.
Context: ...stom image. Here is how to see images in the project: ```bash gcloud compute ima...
(QB_NEW_EN)
[grammar] ~54-~54: There might be a mistake here.
Context: ...h) for an example of using micromamba to build an image with all dependencies ins...
(QB_NEW_EN)
🪛 Ruff (0.14.0)
snakemake_executor_plugin_googlebatch/executor.py
30-30: Avoid specifying long messages outside the exception class
(TRY003)
195-195: Probable insecure usage of temporary file or directory: "/tmp/workdir:/tmp/workdir"
(S108)
253-253: Probable insecure usage of temporary file or directory: "/tmp/workdir/Snakefile"
(S108)
560-560: Probable insecure usage of temporary file or directory: "/tmp/workdir/Snakefile"
(S108)
🔇 Additional comments (5)
snakemake_executor_plugin_googlebatch/executor.py (5)
2-2: LGTM! Import additions support new functionality.The
shleximport enables secure command splitting for container execution, and the updated Google API imports align with the container-aware execution changes.Also applies to: 6-7, 14-14, 16-17
28-28: LGTM! Enhanced logging improves observability.The additional debug and info logs for task resources, job labels, and storage operations improve troubleshooting without impacting performance.
Also applies to: 56-56, 72-72, 87-98
219-223: LGTM! None guard implemented correctly.The
or ""coercion on line 221 properly handles the case whenimage_familyis None, preventing TypeError and safely returning False. This addresses the previous review comment.
565-565: LGTM! Type hints improve code clarity.The addition of type hints for
check_active_jobsandcancel_jobsparameters improves code documentation and enables better IDE support and type checking.Also applies to: 668-668
665-665: LGTM! Improved exception formatting.Using
!sformatting for exceptions provides cleaner string representation and is more explicit than default formatting.
|
Giving my +1 to this, believe this should be merged to main ASAP--this is a critical fix to incorporate a custom container, otherwise the plugin is limited to vanilla containers and is hard to work with. Thanks @mboulton-fathom! |
The use case I had for this was that I wanted to be able to run a job in google batch, with a container that already had the dependencies installed, because it might take 10+ minutes to install those dependencies and I didn't want to pay for 10 minutes * however many jobs I was running. There is quite a few places in the executor currently where it assumes that it's always going to install dependencies, which is why there's quite a few changes. I've added an example for how I was running it and it worked for me, I can't guarantee it will work for all custom containers though...?
I've added more logging as well. For example, I was getting a bit confused that if you don't have it configured correctly then the executor will currently default to silently not running a container if you don't have the correct settings. I've made it explicitly log
"Not using a container for this job."I cleaned up the CI pipelines, added some more tests, and changed it to use uv because it was a lot quicker as well.
I can see that this is quite a big PR so if you want me to split it up or something I can do that
Summary by CodeRabbit
New Features
Documentation
Tests
Chores