Feat/add hermes template by AanningZoeng · Pull Request #9 · yottalabsai/container

AanningZoeng · 2026-04-10T15:30:12Z

Summary

Type of Change

New template
Update to existing template (version bump, new packages, etc.)
Bug fix
Shared infrastructure change (container-template/)
Documentation only

Checklist

Dockerfile builds successfully (docker buildx bake <target> --no-cache)
Container starts and all services are reachable
README.md updated to reflect any interface or version changes
No secrets, tokens, or internal hostnames committed
docker-bake.hcl target name and image tag are correct

Test Notes

I uploaded it to my personal repo on dockerhub and launched a pod to start its service . Everything went well.

Copilot

Pull request overview

This PR introduces a new Hermes Agent official template (vLLM OpenAI-compatible server + hermes-agent CLI) and also updates the existing PyTorch official template’s runtime/entrypoint and image tagging.

Changes:

Added a new official-templates/hermes template with Dockerfile, bake target, and usage documentation.
Updated official-templates/pytorch to change Jupyter defaults/config and switch to a new start script (start1.sh), plus adjusted bake tagging.
Adjusted security-related aspects in PyTorch (removing the ubuntu user) and added additional runtime directory permission setup.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 13 comments.

Show a summary per file

File	Description
official-templates/pytorch/Dockerfile	Changes entrypoint/start script, Jupyter config defaults, user handling, and permissions.
official-templates/pytorch/docker-bake.hcl	Updates tag strategy to use `TAG_SUFFIX` and bumps the suffix.
official-templates/hermes/Dockerfile	New image definition: installs PyTorch nightly + vLLM + hermes-agent; adds SSH/Jupyter and startup scripts.
official-templates/hermes/docker-bake.hcl	New build target/tags; introduces variables for versioning/model selection.
official-templates/hermes/README.md	New template documentation for deployment, env vars, endpoints, and troubleshooting.

Comments suppressed due to low confidence (1)

official-templates/pytorch/docker-bake.hcl:6

PR description marks this as “New template” only, but this PR also changes the existing official-templates/pytorch image (Dockerfile + bake tags/entrypoint). Please update the PR description/type-of-change checklist to reflect the pytorch modifications, or split the pytorch changes into a separate PR to keep review/release scope clear.

variable "PUBLISHER"  { default = "yottalabsai" }
variable "TAG_SUFFIX" { default = "2026031701" }

group "default" {
  targets = ["pytorch290"]
}

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-10T15:42:38Z

+COPY --from=scripts start1.sh /start1.sh
+RUN chmod 755 /start1.sh && \
+    sed -i 's/\r$//' /start1.sh


COPY --from=scripts start1.sh /start1.sh will fail because the referenced build context (../../container-template) only contains start.sh (no start1.sh). This makes the image non-buildable with the current bake contexts. Either add start1.sh to container-template/ (and ensure it’s included in the scripts context) or switch back to copying start.sh (and update it to not assume an ubuntu user).

Copilot · 2026-04-10T15:42:39Z

 # ===============================
 WORKDIR /
-RUN mkdir -p /workspace && chmod 777 /workspace
+RUN mkdir -p /workspace && chmod 777 /workspace /root


chmod 777 /root makes the root home directory world-writable, which is a significant security risk and can also cause OpenSSH StrictModes checks to reject key-based logins. Limit permissions to the specific writable paths needed (e.g., /workspace, /root/.jupyter, etc.) and keep /root non-writable by group/others.

Suggested change

RUN mkdir -p /workspace && chmod 777 /workspace /root

RUN mkdir -p /workspace && chmod 777 /workspace && chmod 700 /root

Copilot · 2026-04-10T15:42:39Z

@@ -22,13 +22,13 @@ ENV DEBIAN_FRONTEND=noninteractive \
    SHELL=/bin/bash \
    PATH=/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/bin:$PATH \


The image no longer installs Miniconda, but PATH still prepends /opt/conda/bin. This makes the environment misleading and can mask issues if a different python/tools are expected from conda. Consider removing /opt/conda/bin from PATH (or reintroduce conda if it’s still required).

Suggested change

PATH=/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/bin:$PATH \

PATH=/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/bin:$PATH \

Copilot · 2026-04-10T15:42:39Z

+RUN mkdir -p /var/log/supervisor /usr/local/bin && \
+    chmod 777 /var/log/supervisor /workspace /var/run /var/lib/nginx && \
+    mkdir -p /run/sshd && \


chmod 777 on /var/run and /var/lib/nginx is overly permissive (world-writable runtime/state directories). Since the container runs as root, this shouldn’t be necessary and increases the blast radius if any process is compromised. Prefer the minimum permissions needed (e.g., only chown/chmod the specific pid/log dirs that must be writable).

Suggested change

RUN mkdir -p /var/log/supervisor /usr/local/bin && \

chmod 777 /var/log/supervisor /workspace /var/run /var/lib/nginx && \

mkdir -p /run/sshd && \

RUN mkdir -p /var/log/supervisor /var/lib/nginx /workspace /usr/local/bin /run/sshd && \

chmod 755 /var/log/supervisor /var/lib/nginx && \

chmod 777 /workspace && \

Copilot · 2026-04-10T15:42:39Z


  args = {
-    BASE_IMAGE     = "nvidia/cuda:12.8.1-cudnn-devel-ubuntu22.04"
+    BASE_IMAGE     = "nvidia/cuda:12.8.1-cudnn-devel-ubuntu22.04"        


There’s trailing whitespace after the BASE_IMAGE value, which can cause noisy diffs and lint failures in some setups. Please remove the extra spaces at the end of the line.

Suggested change

BASE_IMAGE = "nvidia/cuda:12.8.1-cudnn-devel-ubuntu22.04"

BASE_IMAGE = "nvidia/cuda:12.8.1-cudnn-devel-ubuntu22.04"

Copilot · 2026-04-10T15:42:40Z

+    printenv | grep -E '^YOTTA_|^PATH=|^_=' | awk -F = '{ print "export " $1 "=\"" $2 "\"" }' >> /etc/rp_environment
+    echo 'export PATH=/usr/local/nvidia/bin:/usr/local/cuda-12.8/bin:~/.local/bin:$PATH' >> /etc/rp_environment
+    echo 'source /etc/rp_environment' >> ~/.bashrc


export_env_vars appends (>>) to /etc/rp_environment and ~/.bashrc every container start, so restarts will continually duplicate exports/source lines. Make this idempotent by overwriting /etc/rp_environment (use >), and only adding the source line to ~/.bashrc if it isn’t already present.

Suggested change

printenv | grep -E '^YOTTA_|^PATH=|^_=' | awk -F = '{ print "export " $1 "=\"" $2 "\"" }' >> /etc/rp_environment

echo 'export PATH=/usr/local/nvidia/bin:/usr/local/cuda-12.8/bin:~/.local/bin:$PATH' >> /etc/rp_environment

echo 'source /etc/rp_environment' >> ~/.bashrc

printenv | grep -E '^YOTTA_|^PATH=|^_=' | awk -F = '{ print "export " $1 "=\"" $2 "\"" }' > /etc/rp_environment

echo 'export PATH=/usr/local/nvidia/bin:/usr/local/cuda-12.8/bin:~/.local/bin:$PATH' >> /etc/rp_environment

grep -qxF 'source /etc/rp_environment' ~/.bashrc 2>/dev/null || echo 'source /etc/rp_environment' >> ~/.bashrc

Copilot · 2026-04-10T15:42:41Z

+RUN sed -i "s/#PasswordAuthentication yes/PasswordAuthentication yes/" /etc/ssh/sshd_config && \
+    sed -i "s/PasswordAuthentication no/PasswordAuthentication yes/" /etc/ssh/sshd_config && \
+    sed -i "s/#PermitRootLogin prohibit-password/PermitRootLogin yes/" /etc/ssh/sshd_config


The SSH hardening here enables PasswordAuthentication yes and PermitRootLogin yes. Since the README describes SSH access via PUBLIC_KEY, enabling password auth/root login is unnecessary and increases exposure (brute force / misconfiguration risk). Prefer key-only auth (PasswordAuthentication no) and a stricter PermitRootLogin setting (e.g., prohibit-password), unless there’s a documented requirement for password login.

Suggested change

RUN sed -i "s/#PasswordAuthentication yes/PasswordAuthentication yes/" /etc/ssh/sshd_config && \

sed -i "s/PasswordAuthentication no/PasswordAuthentication yes/" /etc/ssh/sshd_config && \

sed -i "s/#PermitRootLogin prohibit-password/PermitRootLogin yes/" /etc/ssh/sshd_config

RUN sed -i "s/#PasswordAuthentication yes/PasswordAuthentication no/" /etc/ssh/sshd_config && \

sed -i "s/PasswordAuthentication yes/PasswordAuthentication no/" /etc/ssh/sshd_config && \

sed -i "s/#PermitRootLogin prohibit-password/PermitRootLogin prohibit-password/" /etc/ssh/sshd_config

Copilot · 2026-04-10T15:42:41Z

+RUN set -eux; \
+    git clone --depth 1 https://github.com/NousResearch/hermes-agent.git /tmp/hermes-agent; \
+    cd /tmp/hermes-agent; \
+    pip install --no-cache-dir . ; \
+    mkdir -p "$HERMES_HOME"; \
+    chmod 777 "$HERMES_HOME"; \
+    rm -rf /tmp/hermes-agent


hermes-agent is installed from a shallow clone of the repo’s default branch, which makes builds non-reproducible and can break unexpectedly if upstream changes. Consider pinning to a specific tag/commit (or a published PyPI version if available) and using that in the build so images are repeatable.

Copilot · 2026-04-10T15:42:41Z

+RUN mkdir -p /root/.jupyter && printf '%s\n' \
+    'c.ServerApp.token = "yotta"' \
+    'c.ServerApp.password = ""' \
+    'c.ServerApp.allow_remote_access = True' \
+    'c.ServerApp.allow_origin = "*"' \
+    'c.NotebookApp.token = "yotta"' \
+    'c.NotebookApp.password = ""' \
+    'c.NotebookApp.allow_remote_access = True' \


Jupyter is configured for remote access with a fixed, well-known token ("yotta") and an empty password. If this container can be reached from untrusted networks, this is easy to guess and increases risk of unauthorized access. Consider generating a random token when JUPYTER_PASSWORD isn’t explicitly set (or disabling Jupyter by default) and avoid allow_origin="*" unless required.

Copilot · 2026-04-10T15:42:41Z

+# Configure JupyterLab: auto-login with token (no password prompt)
+# ===============================
+RUN mkdir -p /root/.jupyter && printf '%s\n' \
+    'c.ServerApp.token = "yotta"' \
+    'c.ServerApp.password = ""' \
+    'c.ServerApp.allow_remote_access = True' \
+    'c.ServerApp.allow_origin = "*"' \
+    'c.NotebookApp.token = "yotta"' \


Jupyter is configured for remote access with a fixed, well-known token ("yotta") and an empty password. If the container is reachable outside a trusted boundary, this makes unauthorized access much easier. Consider requiring an explicit JUPYTER_PASSWORD/token at runtime (no default) or generating a random token on first start, and avoid allow_origin="*" unless it’s required.

dadaism

The hermes template is a solid foundation — the README is thorough and the vLLM startup/health-check logic is well done.

Main concerns are:

the VLLM_MAX_MODEL_LEN inconsistency between Dockerfile and README
world-readable secrets (chmod 777)
unpinned dependencies (hermes-agent, vllm)
the unused VLLM_VERSION variable
the hadolint bypass.

The pytorch changes should be split into their own PR.

dadaism · 2026-04-15T08:44:55Z

+    CMAKE_CUDA_ARCHITECTURES="120" \
+    PYTORCH_ALLOC_CONF="expandable_segments:True" \
+    PYTHON_VERSION="3.11" \
+    JUPYTER_PASSWORD=yotta \


Is it possible to generate random secrets and don't use hardcoded one?

dadaism · 2026-04-15T08:45:55Z

+    CUDA_HOME=/usr/local/cuda \
+    PATH=/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/bin:$PATH \
+    LD_LIBRARY_PATH=/usr/local/nvidia/lib64:$LD_LIBRARY_PATH \
+    TORCH_CUDA_ARCH_LIST="12.0" \


Only for Blackwell? Is it possible to support "8.0;9.0;12.0" as well

dadaism · 2026-04-15T08:46:56Z

+    cd /tmp/hermes-agent; \
+    pip install --no-cache-dir . ; \
+    mkdir -p "$HERMES_HOME"; \
+    chmod 777 "$HERMES_HOME"; \


700 is sufficient

dadaism · 2026-04-15T08:47:32Z

+    VLLM_LOG="/workspace/vllm.log" \
+    OPENAI_BASE_URL=http://localhost:8000/v1
+
+RUN mkdir -p /workspace "$HF_HOME" && chmod -R 777 /workspace "$HF_HOME"


Can we use 700?

dadaism · 2026-04-15T08:48:11Z

+ENV HERMES_HOME=/root/.hermes
+
+RUN set -eux; \
+    git clone --depth 1 https://github.com/NousResearch/hermes-agent.git /tmp/hermes-agent; \


Pin to a specific commit or release tag?

dadaism · 2026-04-15T08:51:18Z

+# vLLM ecosystem (latest stable — same pattern as OpenClaw)
+# ==============================
+RUN python -m pip install --no-cache-dir \
+      vllm transformers sentencepiece "huggingface_hub[cli]" hf_transfer einops \


vllm==${VLLM_VERSION}
lock the version?

dadaism · 2026-04-15T08:53:34Z

    && rm -f /tmp/Python.tgz \
    && cd /tmp/python-src \
-    && ./configure --enable-optimizations --with-ensurepip=install \
+    && ./configure --with-ensurepip=install \


Please add --enable-optimizations back

AanningZoeng added 3 commits March 17, 2026 02:43

update

e2b5627

Merge branch 'main' of github.com:yottalabsai/container

632016c

feat: add hermes template (vLLM + hermes-agent, Blackwell sm_120)

29c3acc

AanningZoeng requested review from Copilot and dadaism April 10, 2026 15:30

Copilot started reviewing on behalf of AanningZoeng April 10, 2026 15:31 View session

Copilot AI reviewed Apr 10, 2026

View reviewed changes

ci: exclude hermes Dockerfile from hadolint (uses heredocs)

ed31eaf

dadaism requested changes Apr 15, 2026

View reviewed changes

This was referenced Apr 15, 2026

Feat/update skyrl template #11

Open

Feat/update unsloth template #12

Open

	RUN mkdir -p /workspace && chmod 777 /workspace /root
	RUN mkdir -p /workspace && chmod 777 /workspace && chmod 700 /root

		@@ -22,13 +22,13 @@ ENV DEBIAN_FRONTEND=noninteractive \
		SHELL=/bin/bash \
		PATH=/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/bin:$PATH \

	PATH=/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/bin:$PATH \
	PATH=/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/bin:$PATH \

	BASE_IMAGE = "nvidia/cuda:12.8.1-cudnn-devel-ubuntu22.04"
	BASE_IMAGE = "nvidia/cuda:12.8.1-cudnn-devel-ubuntu22.04"

Conversation

AanningZoeng commented Apr 10, 2026

Summary

Type of Change

Checklist

Test Notes

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

dadaism left a comment

Choose a reason for hiding this comment

Uh oh!

dadaism Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

dadaism Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

dadaism Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

dadaism Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

dadaism Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

dadaism Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

dadaism Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants