Parent epic: #11216
Follow-up to: #11218 / #11226
Main idea
PR #11226 routed every per-op Docker() instantiation through a shared long-lived client except in DockerKernel, because plumbing the shared client through would change DockerKernel.__init__'s contract. The remaining throwaway connections are:
src/ai/backend/agent/docker/kernel.py:121 — get_logs()
src/ai/backend/agent/docker/kernel.py:224 — commit() (bare Docker(), not even closing_async — also worth fixing for correctness)
src/ai/backend/agent/docker/kernel.py:327 — download_file()
src/ai/backend/agent/docker/kernel.py:356 — download_single()
src/ai/backend/agent/docker/kernel.py:486 — module-level prepare_krunner_env_impl() (separate case — not a DockerKernel method)
Design
- Add
docker: Docker as a keyword-only argument to DockerKernel.__init__, store on self.
- Update construction sites in
DockerAgent (agent.py) to pass self.docker.
- Update the four
DockerKernel methods above to use self.docker instead of async with closing_async(Docker()) / bare Docker().
prepare_krunner_env_impl is a module-level function — evaluate the caller: if the caller has a shared Docker client handy, accept it as a parameter; otherwise leave it as-is and note why.
- Handle pickling:
DockerKernel has __getstate__/__setstate__ for RPC marshalling — the docker field must NOT be pickled. Exclude it explicitly in __getstate__ and re-inject on restore (the agent can re-attach self.docker when materializing a kernel from persisted state).
Alternative ideas
- Instead of a constructor arg, expose a setter (
kernel.attach_docker(self.docker)) called from DockerAgent. Slightly less explicit, but avoids touching __init__'s signature. Lower preference — constructor injection is clearer.
- Inject via a context var. Rejected — implicit and harder to reason about.
Anything else?
monitor_docker_events() (in agent.py) must keep its own closing_async(Docker()) — it reconnects on stream drops. Do NOT touch.
PersistentServiceContainer in docker/utils.py is bootstrap-time only; not part of this issue.
Parent epic: #11216
Follow-up to: #11218 / #11226
Main idea
PR #11226 routed every per-op
Docker()instantiation through a shared long-lived client except inDockerKernel, because plumbing the shared client through would changeDockerKernel.__init__'s contract. The remaining throwaway connections are:src/ai/backend/agent/docker/kernel.py:121—get_logs()src/ai/backend/agent/docker/kernel.py:224—commit()(bareDocker(), not evenclosing_async— also worth fixing for correctness)src/ai/backend/agent/docker/kernel.py:327—download_file()src/ai/backend/agent/docker/kernel.py:356—download_single()src/ai/backend/agent/docker/kernel.py:486— module-levelprepare_krunner_env_impl()(separate case — not aDockerKernelmethod)Design
docker: Dockeras a keyword-only argument toDockerKernel.__init__, store onself.DockerAgent(agent.py) to passself.docker.DockerKernelmethods above to useself.dockerinstead ofasync with closing_async(Docker())/ bareDocker().prepare_krunner_env_implis a module-level function — evaluate the caller: if the caller has a sharedDockerclient handy, accept it as a parameter; otherwise leave it as-is and note why.DockerKernelhas__getstate__/__setstate__for RPC marshalling — thedockerfield must NOT be pickled. Exclude it explicitly in__getstate__and re-inject on restore (the agent can re-attachself.dockerwhen materializing a kernel from persisted state).Alternative ideas
kernel.attach_docker(self.docker)) called fromDockerAgent. Slightly less explicit, but avoids touching__init__'s signature. Lower preference — constructor injection is clearer.Anything else?
monitor_docker_events()(inagent.py) must keep its ownclosing_async(Docker())— it reconnects on stream drops. Do NOT touch.PersistentServiceContainerindocker/utils.pyis bootstrap-time only; not part of this issue.