Development: Optimize Docker build times and image sizes #448

Open
wasnertobias wants to merge 2 commits into main from chore/optimize-docker-builds

@wasnertobias wasnertobias commented Mar 3, 2026

Optimize Docker Build Times and Image Sizes

Summary

This PR optimizes Docker builds across the edutelligence monorepo by switching to slim base images, adding multi-stage builds, improving .dockerignore coverage, and migrating iris to uv.

Changes

1. Slim Base Images for Athena (12 Dockerfiles)

  • Switched all Athena Dockerfiles from python:3.11 (~900MB) to python:3.11-slim (~150MB)
  • Added build-essential, gcc, and libpq-dev in builder stages for C extension compilation (psycopg2)
  • Added libpq5 runtime dependency in production stages

2. Multi-Stage Builds (14 Dockerfiles)

  • Athena modules (10 Dockerfiles): Builder stage installs deps with Poetry into an in-project venv; runtime stage copies only the .venv — no Poetry, no compilers, no dev tools in production
  • Nebula FAQ & Transcriber: Added multi-stage builds to separate build-time from runtime dependencies
  • Iris: Added multi-stage build with uv
  • module_text_cofee: Extended from 2-stage (protobuf + runtime) to 3-stage (protobuf + builder + runtime)
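The builder/runtime split described in sections 1 and 2 can be sketched roughly as follows. This is a minimal illustration and not the PR's actual Dockerfile: the Poetry version, the `/code` working directory, and the `module_example` entry point are assumptions, and the `COPY --from=athena` dependency chain used by the real Athena modules is left out.

```dockerfile
# Stage 1: builder. Compilers and headers live only here.
FROM python:3.11-slim AS builder

# Build dependencies for C extensions such as psycopg2
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential libpq-dev \
    && rm -rf /var/lib/apt/lists/*

# Placeholder version; pin whatever the repository standardizes on
RUN pip install --no-cache-dir "poetry==1.8.3"

WORKDIR /code
# Copy only dependency metadata first so this layer stays cached
COPY pyproject.toml poetry.lock ./
RUN poetry config virtualenvs.in-project true \
    && poetry install --only main --no-root --no-interaction --no-ansi

# Stage 2: runtime. No Poetry, no compilers.
FROM python:3.11-slim

# Shared library that psycopg2 links against at runtime
RUN apt-get update && apt-get install -y --no-install-recommends libpq5 \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /code
# The venv keeps the same absolute path in both stages, so its scripts stay valid
COPY --from=builder /code/.venv /code/.venv
ENV PATH="/code/.venv/bin:$PATH"

COPY . ./
CMD ["python", "-m", "module_example"]
```

Keeping the virtual environment at the same path and on the same base image in both stages matters, because the console scripts inside a venv embed the absolute interpreter path.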

3. Poetry → uv Migration (iris/Dockerfile)

  • Replaced Poetry with uv (10-100x faster dependency resolution)
  • Follows the same pattern as logos/Dockerfile which already uses uv successfully
  • Uses --mount=type=cache for persistent dependency cache across builds
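A uv-based build with a BuildKit cache mount might look like the sketch below. It assumes a project with `pyproject.toml` and `uv.lock` at the context root and a hypothetical `app` module; the real iris/Dockerfile builds from the monorepo root and also handles the memiris path dependency, which is omitted here.

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.12-slim AS builder

# uv ships as a static binary in its official image; pin a version tag in practice
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv

WORKDIR /app
# Copy files out of the cache instead of hard-linking across filesystems
ENV UV_LINK_MODE=copy

# Dependency metadata only, so the install layer is reused when source changes
COPY pyproject.toml uv.lock ./

# The cache mount persists uv's download cache across builds
# without adding it to any image layer
RUN --mount=type=cache,target=/root/.cache/uv \
    uv sync --frozen --no-dev --no-install-project

FROM python:3.12-slim
WORKDIR /app
COPY --from=builder /app/.venv /app/.venv
ENV PATH="/app/.venv/bin:$PATH"

# Hypothetical source layout and entry point
COPY src/ ./src/
CMD ["python", "-m", "app"]
```

The `--mount=type=cache` flag requires BuildKit, which is the default builder in current Docker releases.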

4. Comprehensive .dockerignore Files (15 files)

  • Updated 8 existing minimal .dockerignore files (were just 4 lines) to comprehensive 27-line templates
  • Created 4 new .dockerignore files for Athena modules that had none
  • Created 3 BuildKit per-Dockerfile .dockerignore files (Dockerfile.dockerignore) for services using the monorepo root as build context (iris, nebula faq, nebula transcriber)
  • No root-level .dockerignore — avoids breaking sibling services that share the same build context

5. Runtime Improvements

  • Switched from shell-form CMD poetry run python -m module_* to exec-form CMD ["python", "-m", "module_name"] for proper signal handling
  • Removed Poetry from all production images (runs directly from venv)
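The difference between the two CMD forms is sketched below, with `module_example` standing in for any module name. In shell form Docker wraps the command in `/bin/sh -c`, so the shell becomes PID 1 and a SIGTERM from `docker stop` may never reach the Python process; in exec form the Python process itself is PID 1.

```dockerfile
FROM python:3.11-slim
WORKDIR /code
COPY . ./

# Shell form (before): runs as `/bin/sh -c "poetry run python -m module_example"`
# CMD poetry run python -m module_example

# Exec form (after): python is started directly and receives signals itself
CMD ["python", "-m", "module_example"]
```

Exec form does no shell processing, so variable expansion and `&&` chains are not available in the JSON array.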

What Was NOT Changed (by design)

  • logos/Dockerfile, logos/logos-ui/Dockerfile, logos/logos-landing/Dockerfile — already optimized
  • atlas/AtlasMl/Dockerfile — already uses multi-stage + slim
  • Docker Compose files, CI/CD workflows, application code
  • Athena COPY --from=athena / COPY --from=llm_core cross-image dependency chain

Testing

All 15 modified Dockerfiles were built and verified locally:

  • athena/athena/ → athena base image (slim)
  • athena/llm_core/ → llm_core base image (slim, depends on athena)
  • athena/assessment_module_manager/ → multi-stage (depends on athena)
  • athena/log_viewer/ → multi-stage
  • athena/modules/programming/module_example/ → multi-stage (depends on athena)
  • athena/modules/programming/module_programming_apted/ → multi-stage
  • athena/modules/programming/module_programming_llm/ → multi-stage (depends on athena + llm_core)
  • athena/modules/programming/module_programming_themisml/ → multi-stage (Poetry 1.6.1, PyTorch)
  • athena/modules/programming/module_programming_winnowing/ → multi-stage
  • athena/modules/text/module_text_cofee/ → 3-stage (protobuf + builder + runtime)
  • athena/modules/text/module_text_llm/ → multi-stage
  • athena/modules/modeling/module_modeling_llm/ → multi-stage
  • iris/Dockerfile → multi-stage with uv (context: repo root)
  • nebula/docker/faq/Dockerfile → multi-stage (context: repo root)
  • nebula/docker/transcriber/Dockerfile → multi-stage (context: repo root)

Expected Impact

| Metric | Before | After | Improvement |
| --- | --- | --- | --- |
| Athena module image size | ~1GB+ | ~289MB | ~70% smaller |
| Iris dep install time | Poetry (~60s) | uv (~6s) | ~10x faster |
| Build context size (iris) | Full monorepo | Only iris/ + memiris/ | ~95% smaller |
| Production attack surface | gcc, Poetry, pip, dev tools | Runtime only | Significantly reduced |

Summary by CodeRabbit

Release Notes

  • Chores
    • Optimized Docker builds across services for faster deployments and smaller runtime images
    • Improved build efficiency through enhanced build context configuration

- Switch all Athena Dockerfiles from python:3.11 (~900MB) to python:3.11-slim (~150MB)
- Add multi-stage builds to all Athena module Dockerfiles (12 images):
  builder stage installs deps with Poetry; runtime stage copies only the venv
- Add multi-stage builds to nebula FAQ and transcriber Dockerfiles
- Switch iris/Dockerfile from Poetry to uv (10-100x faster dep resolution)
  with multi-stage build pattern matching logos/Dockerfile
- Add comprehensive .dockerignore files to all Athena modules (new or updated)
- Add BuildKit per-Dockerfile .dockerignore for iris and nebula services
  (using Dockerfile.dockerignore pattern since context is monorepo root)
- Remove Poetry, compilers, and dev tools from all production runtime images
- Use exec-form CMD (JSON array) instead of shell-form for proper signal handling
- Preserve COPY --from=athena and COPY --from=llm_core cross-image dependency chain

Expected impact:
- ~600MB+ reduction per Athena module image (18 images)
- Faster builds via uv, better layer caching, smaller build contexts
- Improved security: no gcc/build-essential/Poetry in production images
Copilot AI review requested due to automatic review settings March 3, 2026 16:42
@wasnertobias wasnertobias requested review from a team as code owners March 3, 2026 16:42
coderabbitai bot commented Mar 3, 2026

Walkthrough

The pull request refactors Docker builds across the monorepo to use multi-stage builds, moving dependency installation to a builder stage. Additionally, .dockerignore files are created or expanded to exclude development artifacts, caches, and build outputs.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Assessment Module Manager | athena/assessment_module_manager/.dockerignore, athena/assessment_module_manager/Dockerfile | .dockerignore expanded to exclude development/build artifacts. Dockerfile refactored to multi-stage build with builder stage installing dependencies into in-project venv, runtime stage using slim base and copying pre-built venv; CMD changed from poetry to direct Python module invocation. |
| Athena Core | athena/athena/.dockerignore, athena/athena/Dockerfile | .dockerignore expanded with comprehensive artifact exclusions. Dockerfile updated to slim base image and adds build dependency installation step. |
| LLM Core | athena/llm_core/.dockerignore, athena/llm_core/Dockerfile | New .dockerignore file introduced with standard artifact patterns. Dockerfile updated to slim base and adds system dependency installation for C extensions. |
| Log Viewer | athena/log_viewer/.dockerignore, athena/log_viewer/Dockerfile | .dockerignore expanded with development artifact patterns. Dockerfile refactored to multi-stage build with builder creating in-project venv, runtime stage using pre-built dependencies; CMD updated to direct Python invocation. |
| Module Modeling LLM | athena/modules/modeling/module_modeling_llm/.dockerignore, athena/modules/modeling/module_modeling_llm/Dockerfile | New .dockerignore file added. Dockerfile refactored to multi-stage build with builder stage installing dependencies into virtual environment; runtime copies venv and uses direct Python module execution. |
| Module Example | athena/modules/programming/module_example/.dockerignore, athena/modules/programming/module_example/Dockerfile | .dockerignore expanded significantly. Dockerfile converted to multi-stage build with system dependencies in builder, in-project venv creation, and direct module invocation in runtime stage. |
| Module Programming Apted | athena/modules/programming/module_programming_apted/.dockerignore, athena/modules/programming/module_programming_apted/Dockerfile | .dockerignore expanded with comprehensive patterns. Dockerfile refactored to multi-stage with builder-based dependency installation and in-project venv usage in runtime. |
| Module Programming LLM | athena/modules/programming/module_programming_llm/.dockerignore, athena/modules/programming/module_programming_llm/Dockerfile | New .dockerignore file added. Dockerfile refactored to multi-stage build with in-project venv creation in builder and direct Python module execution in runtime. |
| Module Programming ThemisML | athena/modules/programming/module_programming_themisml/.dockerignore, athena/modules/programming/module_programming_themisml/Dockerfile | .dockerignore expanded with comprehensive artifact and directory patterns. Dockerfile converted to multi-stage with Poetry configured for in-project venv in builder, runtime uses copied venv. |
| Module Programming Winnowing | athena/modules/programming/module_programming_winnowing/.dockerignore, athena/modules/programming/module_programming_winnowing/Dockerfile | .dockerignore expanded with development artifact patterns. Dockerfile refactored to multi-stage with builder creating in-project venv, runtime uses copied venv and direct module invocation. |
| Module Text Cofee | athena/modules/text/module_text_cofee/.dockerignore, athena/modules/text/module_text_cofee/Dockerfile | .dockerignore expanded and module-specific patterns added. Dockerfile refactored to multi-stage (protobuf builder, dependency builder, runtime) with in-project venv creation and management across stages. |
| Module Text LLM | athena/modules/text/module_text_llm/.dockerignore, athena/modules/text/module_text_llm/Dockerfile | New .dockerignore file added. Dockerfile converted to multi-stage with explicit Poetry setup in builder creating in-project venv; runtime uses copied venv and direct module execution. |
| Iris | iris/Dockerfile.dockerignore, iris/Dockerfile | New .dockerignore file introduced with monorepo-aware exclusion patterns. Dockerfile refactored to two-stage build using uv for dependency resolution instead of Poetry; builder creates venv, runtime executes direct uvicorn command. |
| Nebula FAQ | nebula/docker/faq/Dockerfile.dockerignore, nebula/docker/faq/Dockerfile | New .dockerignore file with monorepo and artifact patterns. Dockerfile converted to two-stage build with dependency installation in builder, venv copied to runtime, direct uvicorn invocation replaces poetry-based command. |
| Nebula Transcriber | nebula/docker/transcriber/Dockerfile.dockerignore, nebula/docker/transcriber/Dockerfile | New .dockerignore file introduced. Dockerfile refactored to multi-stage with builder creating venv and installing dependencies, runtime copies venv and runs uvicorn directly; system dependencies optimized with --no-install-recommends. |

Poem

🐰 Docker builds now split in two—
Builder stage crafts what we need anew,
Runtime stage, slim and lean,
The fastest, cleanest images we've seen!
Virtual envs pre-built with care,
No Poetry at runtime—less to bear! 🎉

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Title check | ✅ Passed | The title accurately summarizes the main change: optimizing Docker build times and image sizes through the widespread use of multi-stage builds, slim base images, and improved .dockerignore patterns. |

Copilot AI left a comment


Pull request overview

Optimizes Docker build performance and runtime image sizes across the monorepo by introducing multi-stage builds, slimmer base images, improved ignore rules, and switching Iris to uv-based dependency installation.

Changes:

  • Converted multiple Python services/modules to multi-stage Docker builds with slim runtime stages and exec-form CMD.
  • Added service-specific .dockerignore / Dockerfile.dockerignore files to shrink build contexts.
  • Migrated iris Docker build from Poetry to uv with BuildKit cache mounts.

Reviewed changes

Copilot reviewed 30 out of 30 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| nebula/docker/transcriber/Dockerfile.dockerignore | Limits monorepo-root build context to transcriber-relevant paths |
| nebula/docker/transcriber/Dockerfile | Multi-stage build; runs uvicorn from venv without Poetry at runtime |
| nebula/docker/faq/Dockerfile.dockerignore | Limits monorepo-root build context to faq-relevant paths |
| nebula/docker/faq/Dockerfile | Multi-stage build; runs uvicorn from venv without Poetry at runtime |
| iris/Dockerfile.dockerignore | Limits monorepo-root build context to Iris-relevant paths |
| iris/Dockerfile | Multi-stage build using uv to speed dependency installation |
| athena/modules/text/module_text_llm/Dockerfile | Multi-stage build; slim runtime; venv-only copy |
| athena/modules/text/module_text_llm/.dockerignore | Broader ignores to reduce build context size |
| athena/modules/text/module_text_cofee/Dockerfile | Expanded to 3 stages (protobuf + deps builder + runtime) |
| athena/modules/text/module_text_cofee/.dockerignore | Broader ignores; keeps protobuf dir out of context |
| athena/modules/programming/module_programming_winnowing/Dockerfile | Multi-stage build; slim runtime; venv-only copy |
| athena/modules/programming/module_programming_winnowing/.dockerignore | Broader ignores to reduce build context size |
| athena/modules/programming/module_programming_themisml/Dockerfile | Multi-stage build; slim runtime; venv-only copy |
| athena/modules/programming/module_programming_themisml/.dockerignore | Broader ignores to reduce build context size |
| athena/modules/programming/module_programming_llm/Dockerfile | Multi-stage build; slim runtime; venv-only copy |
| athena/modules/programming/module_programming_llm/.dockerignore | Broader ignores to reduce build context size |
| athena/modules/programming/module_programming_apted/Dockerfile | Multi-stage build; slim runtime; venv-only copy |
| athena/modules/programming/module_programming_apted/.dockerignore | Broader ignores to reduce build context size |
| athena/modules/programming/module_example/Dockerfile | Multi-stage build; slim runtime; venv-only copy |
| athena/modules/programming/module_example/.dockerignore | Broader ignores to reduce build context size |
| athena/modules/modeling/module_modeling_llm/Dockerfile | Multi-stage build; slim runtime; venv-only copy |
| athena/modules/modeling/module_modeling_llm/.dockerignore | Broader ignores to reduce build context size |
| athena/log_viewer/Dockerfile | Multi-stage build; slim runtime; venv-only copy |
| athena/log_viewer/.dockerignore | Broader ignores to reduce build context size |
| athena/llm_core/Dockerfile | Switches llm_core base to slim and adds build deps for C extensions |
| athena/llm_core/.dockerignore | Adds comprehensive ignores to reduce build context size |
| athena/athena/Dockerfile | Switches athena base to slim and adds build deps for C extensions |
| athena/athena/.dockerignore | Adds comprehensive ignores to reduce build context size |
| athena/assessment_module_manager/Dockerfile | Multi-stage build; slim runtime; venv-only copy |
| athena/assessment_module_manager/.dockerignore | Broader ignores to reduce build context size |


Comment on lines +48 to +51
# Copy only the virtual environment from the builder
COPY --from=builder /code/.venv /code/.venv
# Copy the generated protobuf file
COPY --from=builder /code/module_text_cofee/protobuf/ /code/module_text_cofee/protobuf/
Copilot AI Mar 3, 2026

cofee.proto is generated in the cofee_protobuf stage, but the runtime stage copies protobuf artifacts from builder. As shown, builder doesn't appear to be the stage producing those protobuf outputs, so this copy is likely to be missing/empty or fail depending on how the rest of the Dockerfile is structured. Copy the generated protobuf outputs from the cofee_protobuf stage (or ensure the builder stage explicitly imports them from cofee_protobuf before the final runtime copy).

Suggested change
-# Copy only the virtual environment from the builder
-COPY --from=builder /code/.venv /code/.venv
-# Copy the generated protobuf file
-COPY --from=builder /code/module_text_cofee/protobuf/ /code/module_text_cofee/protobuf/
+# Ensure protobuf destination directory exists
+RUN mkdir -p /code/module_text_cofee/protobuf
+# Copy only the virtual environment from the builder
+COPY --from=builder /code/.venv /code/.venv
+# Copy the generated protobuf file directly from the cofee_protobuf stage
+COPY --from=cofee_protobuf /proto/cofee_pb2.py /code/module_text_cofee/protobuf/

Copilot uses AI. Check for mistakes.
Comment on lines +21 to 25
# Runtime system dependencies (ffmpeg, tesseract, OpenCV libs)
RUN apt-get update && apt-get install -y --no-install-recommends \
ffmpeg \
tesseract-ocr \
libgl1 \
Copilot AI Mar 3, 2026

The runtime apt-get install layer doesn’t remove /var/lib/apt/lists/*, which increases final image size and can reduce build cache efficiency. Add cleanup to the same RUN instruction (similar to the Athena Dockerfiles in this PR).
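A sketch of the cleanup suggested here, assuming a `python:3.12-slim` base (the actual base image is not shown in the excerpt) and only the three packages visible in it:

```dockerfile
FROM python:3.12-slim

# Install and clean up in one RUN so the package index never lands in a layer.
# A separate `RUN rm -rf ...` afterwards would not shrink the image, because
# the files would already be stored in the previous layer.
RUN apt-get update && apt-get install -y --no-install-recommends \
        ffmpeg \
        tesseract-ocr \
        libgl1 \
    && rm -rf /var/lib/apt/lists/*
```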

Comment on lines +10 to +14
RUN pip install --no-cache-dir poetry \
&& poetry config virtualenvs.create true \
&& poetry config virtualenvs.in-project true \
&& poetry install --only main --no-root \
&& pip uninstall -y poetry
Copilot AI Mar 3, 2026

pip install ... poetry is unpinned, which makes builds less reproducible and can introduce unexpected breakages when Poetry releases. Pin Poetry to a specific version (preferably aligned with the repo’s other Dockerfiles, which already pin Poetry in several places).
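One way to pin Poetry is through a build argument, as in the sketch below. The version number is a placeholder and not the pin used elsewhere in the repository; the base image and `/app` layout are likewise assumptions.

```dockerfile
FROM python:3.12-slim AS builder
WORKDIR /app
COPY pyproject.toml poetry.lock ./

# Placeholder version; align it with the pin used by the other Dockerfiles
ARG POETRY_VERSION=1.8.3

RUN pip install --no-cache-dir "poetry==${POETRY_VERSION}" \
    && poetry config virtualenvs.create true \
    && poetry config virtualenvs.in-project true \
    && poetry install --only main --no-root \
    && pip uninstall -y poetry
```

Using `ARG` keeps the pin in one visible place and lets CI override it with `--build-arg POETRY_VERSION=...` when testing an upgrade.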

Comment on lines +12 to +16
RUN pip install --no-cache-dir poetry \
&& poetry config virtualenvs.create true \
&& poetry config virtualenvs.in-project true \
&& poetry install --only main --no-root \
&& pip uninstall -y poetry
Copilot AI Mar 3, 2026

Same as FAQ: Poetry is installed without a version pin, reducing reproducibility of the container build. Pin Poetry to a specific version to avoid surprise upgrades.

# Copy dependency metadata first (for layer caching)
COPY iris/pyproject.toml iris/poetry.lock ./

# Copy memiris (path dependency required for resolution)
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

COPY memiris ../memiris places the dependency at /memiris (outside WORKDIR /app), which is surprising and makes the image layout harder to reason about. Prefer copying into a path under /app (e.g., /app/memiris) and updating the path dependency accordingly, or otherwise document why /memiris is intentional.

Suggested change
-# Copy memiris (path dependency required for resolution)
+# Copy memiris (path dependency required for resolution).
+# Note: pyproject.toml expects memiris at ../memiris relative to /app,
+# so we intentionally place it at /memiris (outside WORKDIR /app).


# Stage 2: Runtime — slim image without compilers or dev tools
FROM python:3.11-slim
LABEL org.opencontainers.image.source=https://github.com/pal03377/Athena-New
Copilot AI Mar 3, 2026

The org.opencontainers.image.source label differs from the other Athena images in this PR (which point to https://github.com/ls1intum/Athena). If this module belongs to the main repo, update the label to the canonical source URL to keep image provenance consistent.

Suggested change
-LABEL org.opencontainers.image.source=https://github.com/pal03377/Athena-New
+LABEL org.opencontainers.image.source=https://github.com/ls1intum/Athena


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
iris/Dockerfile (1)

30-48: ⚠️ Potential issue | 🟠 Major

Run the runtime stage as a non-root user.

The runtime stage (lines 30-48) has no USER directive, so the container runs as root. This weakens runtime isolation and contradicts Docker security best practices.

🛡️ Minimal non-root runtime patch
 FROM python:3.12.3-slim
 
 ENV PYTHONUNBUFFERED=1
 
 WORKDIR /app
+RUN addgroup --system app && adduser --system --ingroup app app
 
 # Copy only the virtual environment from the builder
-COPY --from=builder /app/.venv /app/.venv
+COPY --from=builder --chown=app:app /app/.venv /app/.venv
 ENV VIRTUAL_ENV=/app/.venv PATH="/app/.venv/bin:$PATH"
 
 # Copy the content of the local src directory to the working directory
-COPY iris/src/iris ./iris
+COPY --chown=app:app iris/src/iris ./iris
 
 # Copy logging configuration
-COPY iris/log_conf.yml ./log_conf.yml
+COPY --chown=app:app iris/log_conf.yml ./log_conf.yml
+USER app
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@iris/Dockerfile` around lines 30 - 48, The Dockerfile runtime stage runs as
root; create a non-root user and switch to it: add steps to create a user/group
(e.g., appuser), chown the WORKDIR and the copied VIRTUAL_ENV (/app and
/app/.venv) to that user, and add a USER directive before the CMD so uvicorn
(iris.main:app) runs unprivileged; reference the existing WORKDIR /app, COPY
--from=builder /app/.venv, ENV VIRTUAL_ENV, and CMD ["uvicorn", "iris.main:app",
...] when applying the changes.
🧹 Nitpick comments (10)
athena/modules/programming/module_programming_winnowing/Dockerfile (1)

22-24: Consider excluding dev dependencies to further reduce image size.

Since this is a production runtime image, dev dependencies (test frameworks, linters, type checkers, etc.) are unnecessary. Adding --only main will exclude them from the venv that gets copied to the runtime stage.

♻️ Proposed fix
 RUN poetry config virtualenvs.create true \
     && poetry config virtualenvs.in-project true \
-    && poetry install --no-interaction --no-ansi
+    && poetry install --only main --no-interaction --no-ansi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@athena/modules/programming/module_programming_winnowing/Dockerfile` around
lines 22 - 24, Update the Dockerfile RUN step that calls poetry install (the RUN
line containing "poetry install --no-interaction --no-ansi") to exclude
development dependencies by adding the appropriate flag (e.g., --only main) so
the created virtualenv contains only production deps; edit the RUN command that
configures poetry and runs "poetry install" to append the flag and rebuild the
image to reduce runtime image size.
athena/modules/programming/module_programming_themisml/Dockerfile (1)

9-11: Minor: gcc is redundant when build-essential is installed.

The build-essential meta-package already includes gcc, so listing it separately is unnecessary.

🔧 Suggested fix
 RUN apt-get update && apt-get install -y --no-install-recommends \
-    build-essential gcc libpq-dev \
+    build-essential libpq-dev \
     && rm -rf /var/lib/apt/lists/*
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@athena/modules/programming/module_programming_themisml/Dockerfile` around
lines 9 - 11, In the Dockerfile RUN line that installs packages (the command
containing "apt-get update && apt-get install -y --no-install-recommends \
build-essential gcc libpq-dev \ && rm -rf /var/lib/apt/lists/*"), remove the
redundant "gcc" package from the install list because it is provided by
"build-essential"; leave "build-essential" and "libpq-dev" and keep the existing
cleanup (rm -rf) unchanged.
nebula/docker/faq/Dockerfile (1)

1-35: Well-structured multi-stage build.

The builder/runtime separation is clean: dependencies are installed in an in-project venv, poetry is removed after use, and only the venv is carried forward to the slim runtime image.

Consider adding a non-root user for defense-in-depth.

Running as root inside the container increases the blast radius if the application is compromised. Adding a dedicated user is a recommended hardening measure, though it may already be handled by your orchestration layer (e.g., Kubernetes securityContext).

🔒 Optional: Add non-root user
 # Copy only the virtual environment from the builder
 COPY --from=builder /app/.venv /app/.venv
 ENV PATH="/app/.venv/bin:$PATH"

 # Copy source code and config
 COPY nebula/src/nebula ./nebula

+# Run as non-root user
+RUN useradd --create-home appuser
+USER appuser
+
 # Set PYTHONPATH for relative imports
 ENV PYTHONPATH=/app
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nebula/docker/faq/Dockerfile` around lines 1 - 35, The Dockerfile currently
runs the image as root; add a non-root user and ensure /app is owned by that
user before switching to it to harden runtime. In the Dockerfile
(builder/runtime stages), create a dedicated user/group (e.g.,
appuser/appgroup), chown the application files and the copied virtualenv at /app
to that uid/gid, and add a USER instruction so the final runtime executes
uvicorn as that non-root user instead of root.
nebula/docker/transcriber/Dockerfile (1)

1-46: Clean multi-stage build with appropriate runtime dependencies.

The separation is well done: build tools stay in the builder stage while only ffmpeg, tesseract-ocr, and OpenCV runtime libraries are installed in the final image. The exec-form CMD is correct.

Same recommendation as the FAQ service: consider adding a non-root user for defense-in-depth if not already handled by your orchestration layer.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nebula/docker/transcriber/Dockerfile` around lines 1 - 46, Add a non-root
runtime user to the final image to improve container security: create a
user/group (e.g., "nebula"), chown /app and any runtime dirs to that user after
copying the venv and source (refer to the COPY --from=builder /app/.venv
/app/.venv, COPY nebula/src/nebula ./nebula and WORKDIR /app lines), then switch
to that user with USER before the EXPOSE/CMD lines so uvicorn
(nebula.transcript.app:app) runs unprivileged; ensure PATH and PYTHONPATH remain
correct for the in-project venv.
athena/modules/text/module_text_cofee/Dockerfile (2)

31-34: Consider adding --no-root to poetry install.

At this point in the build, only pyproject.toml and poetry.lock are present—the module source code hasn't been copied yet. If the project is defined as an installable package, poetry install will attempt to install the root package, which could fail or produce unexpected behavior.

Adding --no-root ensures only dependencies are installed in the builder stage, which is the intended caching pattern.

♻️ Proposed fix
 RUN poetry config virtualenvs.create true \
     && poetry config virtualenvs.in-project true \
-    && poetry install --no-interaction --no-ansi
+    && poetry install --no-interaction --no-ansi --no-root
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@athena/modules/text/module_text_cofee/Dockerfile` around lines 31 - 34, The
RUN line that runs Poetry should install only dependencies in the build stage;
update the command that currently calls "poetry install --no-interaction
--no-ansi" to include the --no-root flag so Poetry does not try to install the
project itself (i.e., change the RUN block containing the poetry config lines to
run "poetry install --no-root --no-interaction --no-ansi").

48-55: Reorder COPY to ensure generated protobuf takes precedence.

Currently, COPY . ./ (line 55) runs after copying the generated protobuf (line 51). If cofee_pb2.py exists in the build context (accidentally committed or left from local generation), it would overwrite the freshly-generated version from Stage 1.

Swapping the order ensures the generated file always wins.

♻️ Proposed fix
 WORKDIR /code

-# Copy only the virtual environment from the builder
-COPY --from=builder /code/.venv /code/.venv
-# Copy the generated protobuf file
-COPY --from=builder /code/module_text_cofee/protobuf/ /code/module_text_cofee/protobuf/
-ENV PATH="/code/.venv/bin:$PATH"
-
 # Project files
 COPY . ./
+
+# Copy only the virtual environment from the builder
+COPY --from=builder /code/.venv /code/.venv
+# Copy the generated protobuf file (after project files to ensure it takes precedence)
+COPY --from=builder /code/module_text_cofee/protobuf/ /code/module_text_cofee/protobuf/
+ENV PATH="/code/.venv/bin:$PATH"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@athena/modules/text/module_text_cofee/Dockerfile` around lines 48 - 55, The
Dockerfile currently copies the full project (COPY . ./) after copying the
generated protobufs, which allows any cofee_pb2.py accidentally present in the
build context to overwrite the generated files; to fix, swap the COPY order so
project files are copied first (COPY . ./) and then copy the generated protobuf
directory from the builder stage (COPY --from=builder
/code/module_text_cofee/protobuf/ /code/module_text_cofee/protobuf/) ensuring
the generated protobufs (e.g., cofee_pb2.py) in the builder take precedence over
files in the build context.
athena/modules/text/module_text_llm/Dockerfile (1)

37-44: Consider dropping root privileges in the runtime stage.

No USER is set, so the container runs as root by default. Adding a non-root user would further reduce runtime risk.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@athena/modules/text/module_text_llm/Dockerfile` around lines 37 - 44, The
Dockerfile currently leaves the runtime stage running as root; add a non-root
user and switch to it in the runtime stage: create a dedicated user/group (e.g.,
"app" or "athena"), ensure ownership of /code and the copied .venv is changed to
that user (chown /code and /code/.venv), and add a USER instruction near the end
to run as that non-root user; keep WORKDIR /code and ENV
PATH="/code/.venv/bin:$PATH" but ensure the PATH entry is accessible to the
non-root user.
athena/modules/programming/module_programming_apted/Dockerfile (2)

38-43: Make venv copy last to prevent accidental overwrite from context.

Safer ordering is to copy project files first, then copy /code/.venv from builder.

🛡️ Proposed change
 WORKDIR /code

-# Copy only the virtual environment from the builder
-COPY --from=builder /code/.venv /code/.venv
-ENV PATH="/code/.venv/bin:$PATH"
-
 # Project files
 COPY . ./
+
+# Copy only the virtual environment from the builder
+COPY --from=builder /code/.venv /code/.venv
+ENV PATH="/code/.venv/bin:$PATH"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@athena/modules/programming/module_programming_apted/Dockerfile` around lines
38 - 43, The Dockerfile currently copies the built virtualenv before the project
context, which risks the project COPY overwriting parts of /code/.venv; change
the order so the project files are copied first (the COPY . ./ line) and then
copy the built venv from the builder using the COPY --from=builder /code/.venv
/code/.venv line (leave ENV PATH="/code/.venv/bin:$PATH" as-is) so the builder
venv always wins and cannot be accidentally overwritten by the build context.

21-24: Prefer explicit runtime-only Poetry install scope.

Using explicit flags avoids installing the root package and dev dependencies, making the intent clear for production images. The project has a 'dev' dependency group (not optional), so the current command installs unnecessary development packages in production.

♻️ Proposed change
 RUN poetry config virtualenvs.create true \
     && poetry config virtualenvs.in-project true \
-    && poetry install --no-interaction --no-ansi
+    && poetry install --only main --no-root --no-interaction --no-ansi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@athena/modules/programming/module_programming_apted/Dockerfile` around lines
21 - 24, The poetry install in the Dockerfile currently installs dev deps and
the root package; update the RUN step that invokes "poetry install" to perform a
runtime-only installation by using Poetry's explicit scope flags (e.g., replace
the current "poetry install --no-interaction --no-ansi" with a runtime-only
invocation such as "poetry install --no-interaction --no-ansi --only main" for
Poetry >=1.2, or "poetry install --no-interaction --no-ansi --no-dev --no-root"
for older Poetry versions) so the image does not install dev dependencies or the
project package at build time.
iris/Dockerfile.dockerignore (1)

1-24: Add explicit secret/environment ignore patterns.

The list is thorough, but lines 1-24 should also explicitly ignore env and secret files so they cannot leak into the build context.

🔒 Suggested hardening
 .mypy_cache/
+.env
+.env.*
+*.pem
+*.key
+secrets/
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@iris/Dockerfile.dockerignore` around lines 1 - 24, The dockerignore currently
excludes many project dirs but lacks explicit patterns for environment and
secret files; update the iris Dockerfile.dockerignore to add explicit ignores
such as .env, *.env, .env.local, .env.*, *.secret, secrets/, secret*/,
secrets/**/*.key, credentials/, credentials/**/*.json, .aws/,
.docker/config.json, .gnupg/, and any local config files (e.g., .envrc) so
environment/secret artifacts are never sent in build context; place these new
patterns alongside the existing excludes in the same file (look for the existing
list of top-level ignores shown in the diff) and ensure patterns cover both
files and directories as given.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@athena/modules/modeling/module_modeling_llm/Dockerfile`:
- Around line 23-25: Poetry in the builder stage is installing dev/test groups
and the in-project .venv is being copied into the runtime image; change the
install invocation in the Dockerfile (the RUN that currently calls "poetry
install --no-interaction --no-ansi") to limit to runtime deps by adding "--only
main --no-root" so only main dependencies are installed into the builder venv
before copying to the runtime stage.

In `@athena/modules/programming/module_example/Dockerfile`:
- Around line 19-24: The runtime image is missing the athena source that the
editable install (develop = true in pyproject.toml) points to; add a step in the
runtime stage to copy the built athena source from the builder image into the
runtime image (i.e., run a COPY --from=builder /athena /athena in the Dockerfile
runtime stage) so the in-project .venv editable installation can resolve /athena
at container runtime.

In `@athena/modules/programming/module_programming_llm/Dockerfile`:
- Around line 19-25: The Dockerfile currently installs path dependencies as
editable (develop = true) which creates symlinks in the in-project .venv that
point to /athena and /llm_core present only in the builder stage; at runtime
those symlinks will be broken—fix by either updating the path dependencies in
pyproject.toml to use develop = false so poetry bakes the packages into the
.venv, or modify this Dockerfile to COPY --from=athena /code /athena and COPY
--from=llm_core /code /llm_core into the final runtime stage (ensure the same
/athena and /llm_core directories exist alongside the .venv), and keep
references to the Dockerfile COPY lines and pyproject.toml path dependency
entries when making the change.

In `@athena/modules/programming/module_programming_themisml/Dockerfile`:
- Around line 40-46: The runtime Dockerfile stage currently copies only the
built virtualenv (.venv) and project files, but editable installs created in the
builder produce symlinks inside .venv that point to /athena; to avoid
ModuleNotFoundError add a step to copy the athena source tree from the builder
into the runtime image (i.e., add COPY --from=builder /athena /athena) so the
symlink targets exist at runtime; place this COPY near the existing COPY
--from=builder /code/.venv /code/.venv (before or after) so PATH and CMD
["python", "-m", "module_programming_themisml"] continue to work.

In `@athena/modules/text/module_text_cofee/Dockerfile`:
- Around line 29-30: The pyproject.toml declares athena = { path =
"../../../athena", develop = true } which causes an editable install that
creates symlinks to /athena only present in the builder stage; fix by either
removing develop = true so the package is installed as a non-editable wheel, or
add a runtime-stage copy of the built /athena from the builder (e.g., add COPY
--from=builder /athena /athena in the Dockerfile runtime stage) so the symlink
targets exist at runtime; update pyproject.toml or the Dockerfile accordingly
and rebuild.

In `@athena/modules/text/module_text_llm/Dockerfile`:
- Around line 23-25: The Dockerfile's poetry install command in the build stage
(the RUN line that calls "poetry install --no-interaction --no-ansi") installs
dev/test groups and leaves editable path deps broken in the runtime .venv;
change that invocation to include the flags "--only main --no-root" so Poetry
installs only the main group and avoids creating a project editable install,
ensuring the runtime environment does not contain missing editable path
references (adjust the RUN line that configures poetry virtualenvs and runs
poetry install).

In `@iris/Dockerfile`:
- Around line 18-27: Replace the non-reproducible "uv pip install ." invocation
so the build respects the copied poetry.lock: either export Poetry's locked deps
and sync them (use "poetry export" to produce a requirements lockfile and then
run "uv pip sync" against that lockfile) or migrate to uv's lockfile workflow by
generating and committing an "uv.lock" and using "uv sync --locked"; update the
Dockerfile step that currently runs "uv pip install ." to one of these
locked-install approaches and ensure the --mount cache for /root/.cache/uv
remains in place.

---

Outside diff comments:
In `@iris/Dockerfile`:
- Around line 30-48: The Dockerfile runtime stage runs as root; create a
non-root user and switch to it: add steps to create a user/group (e.g.,
appuser), chown the WORKDIR and the copied VIRTUAL_ENV (/app and /app/.venv) to
that user, and add a USER directive before the CMD so uvicorn (iris.main:app)
runs unprivileged; reference the existing WORKDIR /app, COPY --from=builder
/app/.venv, ENV VIRTUAL_ENV, and CMD ["uvicorn", "iris.main:app", ...] when
applying the changes.

---

Nitpick comments:
In `@athena/modules/programming/module_programming_apted/Dockerfile`:
- Around line 38-43: The Dockerfile currently copies the built virtualenv before
the project context, which risks the project COPY overwriting parts of
/code/.venv; change the order so the project files are copied first (the COPY .
./ line) and then copy the built venv from the builder using the COPY
--from=builder /code/.venv /code/.venv line (leave ENV
PATH="/code/.venv/bin:$PATH" as-is) so the builder venv always wins and cannot
be accidentally overwritten by the build context.
- Around line 21-24: The poetry install in the Dockerfile currently installs dev
deps and the root package; update the RUN step that invokes "poetry install" to
perform a runtime-only installation by using Poetry's explicit scope flags
(e.g., replace the current "poetry install --no-interaction --no-ansi" with a
runtime-only invocation such as "poetry install --no-interaction --no-ansi
--only main" for Poetry >=1.2, or "poetry install --no-interaction --no-ansi
--no-dev --no-root" for older Poetry versions) so the image does not install dev
dependencies or the project package at build time.

In `@athena/modules/programming/module_programming_themisml/Dockerfile`:
- Around line 9-11: In the Dockerfile RUN line that installs packages (the
command containing "apt-get update && apt-get install -y --no-install-recommends
\ build-essential gcc libpq-dev \ && rm -rf /var/lib/apt/lists/*"), remove the
redundant "gcc" package from the install list because it is provided by
"build-essential"; leave "build-essential" and "libpq-dev" and keep the existing
cleanup (rm -rf) unchanged.

In `@athena/modules/programming/module_programming_winnowing/Dockerfile`:
- Around line 22-24: Update the Dockerfile RUN step that calls poetry install
(the RUN line containing "poetry install --no-interaction --no-ansi") to exclude
development dependencies by adding the appropriate flag (e.g., --only main) so
the created virtualenv contains only production deps; edit the RUN command that
configures poetry and runs "poetry install" to append the flag and rebuild the
image to reduce runtime image size.

In `@athena/modules/text/module_text_cofee/Dockerfile`:
- Around line 31-34: The RUN line that runs Poetry should install only
dependencies in the build stage; update the command that currently calls "poetry
install --no-interaction --no-ansi" to include the --no-root flag so Poetry does
not try to install the project itself (i.e., change the RUN block containing the
poetry config lines to run "poetry install --no-root --no-interaction
--no-ansi").
- Around line 48-55: The Dockerfile currently copies the full project (COPY .
./) after copying the generated protobufs, which allows any cofee_pb2.py
accidentally present in the build context to overwrite the generated files; to
fix, swap the COPY order so project files are copied first (COPY . ./) and then
copy the generated protobuf directory from the builder stage (COPY
--from=builder /code/module_text_cofee/protobuf/
/code/module_text_cofee/protobuf/) ensuring the generated protobufs (e.g.,
cofee_pb2.py) in the builder take precedence over files in the build context.

In `@athena/modules/text/module_text_llm/Dockerfile`:
- Around line 37-44: The Dockerfile currently leaves the runtime stage running
as root; add a non-root user and switch to it in the runtime stage: create a
dedicated user/group (e.g., "app" or "athena"), ensure ownership of /code and
the copied .venv is changed to that user (chown /code and /code/.venv), and add
a USER instruction near the end to run as that non-root user; keep WORKDIR /code
and ENV PATH="/code/.venv/bin:$PATH" but ensure the PATH entry is accessible to
the non-root user.

In `@iris/Dockerfile.dockerignore`:
- Around line 1-24: The dockerignore currently excludes many project dirs but
lacks explicit patterns for environment and secret files; update the iris
Dockerfile.dockerignore to add explicit ignores such as .env, *.env, .env.local,
.env.*, *.secret, secrets/, secret*/, secrets/**/*.key, credentials/,
credentials/**/*.json, .aws/, .docker/config.json, .gnupg/, and any local config
files (e.g., .envrc) so environment/secret artifacts are never sent in build
context; place these new patterns alongside the existing excludes in the same
file (look for the existing list of top-level ignores shown in the diff) and
ensure patterns cover both files and directories as given.

In `@nebula/docker/faq/Dockerfile`:
- Around line 1-35: The Dockerfile currently runs the image as root; add a
non-root user and ensure /app is owned by that user before switching to it to
harden runtime. In the Dockerfile (builder/runtime stages), create a dedicated
user/group (e.g., appuser/appgroup), chown the application files and the copied
virtualenv at /app to that uid/gid, and add a USER instruction so the final
runtime executes uvicorn as that non-root user instead of root.

In `@nebula/docker/transcriber/Dockerfile`:
- Around line 1-46: Add a non-root runtime user to the final image to improve
container security: create a user/group (e.g., "nebula"), chown /app and any
runtime dirs to that user after copying the venv and source (refer to the COPY
--from=builder /app/.venv /app/.venv, COPY nebula/src/nebula ./nebula and
WORKDIR /app lines), then switch to that user with USER before the EXPOSE/CMD
lines so uvicorn (nebula.transcript.app:app) runs unprivileged; ensure PATH and
PYTHONPATH remain correct for the in-project venv.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c8d4052 and 18e3a90.

📒 Files selected for processing (30)
  • athena/assessment_module_manager/.dockerignore
  • athena/assessment_module_manager/Dockerfile
  • athena/athena/.dockerignore
  • athena/athena/Dockerfile
  • athena/llm_core/.dockerignore
  • athena/llm_core/Dockerfile
  • athena/log_viewer/.dockerignore
  • athena/log_viewer/Dockerfile
  • athena/modules/modeling/module_modeling_llm/.dockerignore
  • athena/modules/modeling/module_modeling_llm/Dockerfile
  • athena/modules/programming/module_example/.dockerignore
  • athena/modules/programming/module_example/Dockerfile
  • athena/modules/programming/module_programming_apted/.dockerignore
  • athena/modules/programming/module_programming_apted/Dockerfile
  • athena/modules/programming/module_programming_llm/.dockerignore
  • athena/modules/programming/module_programming_llm/Dockerfile
  • athena/modules/programming/module_programming_themisml/.dockerignore
  • athena/modules/programming/module_programming_themisml/Dockerfile
  • athena/modules/programming/module_programming_winnowing/.dockerignore
  • athena/modules/programming/module_programming_winnowing/Dockerfile
  • athena/modules/text/module_text_cofee/.dockerignore
  • athena/modules/text/module_text_cofee/Dockerfile
  • athena/modules/text/module_text_llm/.dockerignore
  • athena/modules/text/module_text_llm/Dockerfile
  • iris/Dockerfile
  • iris/Dockerfile.dockerignore
  • nebula/docker/faq/Dockerfile
  • nebula/docker/faq/Dockerfile.dockerignore
  • nebula/docker/transcriber/Dockerfile
  • nebula/docker/transcriber/Dockerfile.dockerignore

Comment on lines 23 to 25
RUN poetry config virtualenvs.create true \
&& poetry config virtualenvs.in-project true \
&& poetry install --no-interaction --no-ansi

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

dockerfile="athena/modules/modeling/module_modeling_llm/Dockerfile"
pyproject="athena/modules/modeling/module_modeling_llm/pyproject.toml"

echo "== Poetry install command in Dockerfile =="
rg -n 'poetry install' "$dockerfile"

echo
echo "== Dependency groups declared in pyproject.toml =="
rg -n '^\[tool\.poetry\.dev-dependencies\]|^\[tool\.poetry\.group\..*\.dependencies\]' "$pyproject" || true

echo
echo "== Group optionality metadata =="
rg -n '^\[tool\.poetry\.group\..*\]$|^optional\s*=' "$pyproject" || true

Repository: ls1intum/edutelligence

Length of output: 408


🏁 Script executed:

cat -n athena/modules/modeling/module_modeling_llm/Dockerfile | head -50

Repository: ls1intum/edutelligence

Length of output: 1711


Limit Poetry install to runtime deps only in builder venv.

At line 25, poetry install pulls dev and test dependency groups (confirmed in pyproject.toml) along with main, and the entire .venv is copied to the runtime stage at line 40. For this optimization PR, apply --only main --no-root to exclude non-runtime dependencies and keep the runtime image minimal.

Proposed diff
 RUN poetry config virtualenvs.create true \
     && poetry config virtualenvs.in-project true \
-    && poetry install --no-interaction --no-ansi
+    && poetry install --only main --no-root --no-interaction --no-ansi
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-RUN poetry config virtualenvs.create true \
-    && poetry config virtualenvs.in-project true \
-    && poetry install --no-interaction --no-ansi
+RUN poetry config virtualenvs.create true \
+    && poetry config virtualenvs.in-project true \
+    && poetry install --only main --no-root --no-interaction --no-ansi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@athena/modules/modeling/module_modeling_llm/Dockerfile` around lines 23 - 25,
Poetry in the builder stage is installing dev/test groups and the in-project
.venv is being copied into the runtime image; change the install invocation in
the Dockerfile (the RUN that currently calls "poetry install --no-interaction
--no-ansi") to limit to runtime deps by adding "--only main --no-root" so only
main dependencies are installed into the builder venv before copying to the
runtime stage.

Comment on lines 19 to 24
# athena module (from the Dockerfile in the athena folder)
COPY --from=athena /code /athena
-# install dependencies
# install dependencies into an in-project venv
RUN poetry config virtualenvs.create true \
    && poetry config virtualenvs.in-project true \
    && poetry install --no-interaction --no-ansi

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

find . -path "*/module_example/pyproject.toml" -type f -exec cat {} \;

Repository: ls1intum/edutelligence

Length of output: 899


Copy /athena source to the runtime stage.

The pyproject.toml installs athena with develop = true, which creates an editable installation that references the source files at /athena. The builder stage copies /athena from the athena image and uses it to install dependencies, but the runtime stage only copies the .venv and omits the athena source code. At runtime, the editable installation will fail because it points to a non-existent /athena directory.

Add COPY --from=builder /athena /athena before or after line 39 in the runtime stage to ensure the editable dependency can be resolved.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@athena/modules/programming/module_example/Dockerfile` around lines 19 - 24,
The runtime image is missing the athena source that the editable install
(develop = true in pyproject.toml) points to; add a step in the runtime stage to
copy the built athena source from the builder image into the runtime image
(i.e., run a COPY --from=builder /athena /athena in the Dockerfile runtime
stage) so the in-project .venv editable installation can resolve /athena at
container runtime.
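
A minimal sketch of the runtime stage with that fix applied. It assumes the `python:3.11-slim` base image used elsewhere in this PR and a stage named `builder`; the `CMD` is assumed by analogy with the other modules, and unrelated runtime steps are left out.

```dockerfile
# Runtime stage (sketch): keep the editable path dependency resolvable.
FROM python:3.11-slim
WORKDIR /code

# An editable install refers to its source tree by absolute path,
# so /athena must exist at the same location as in the builder stage.
COPY --from=builder /athena /athena

# Copy only the virtual environment from the builder
COPY --from=builder /code/.venv /code/.venv
ENV PATH="/code/.venv/bin:$PATH"

# Project files
COPY . ./

CMD ["python", "-m", "module_example"]
```

The alternative is to install the dependency non-editably, which keeps the runtime image free of the extra source tree but requires a change to pyproject.toml.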

Comment on lines 19 to 25
# athena module (from the Dockerfile in the athena folder)
COPY --from=athena /code /athena
COPY --from=llm_core /code /llm_core
-# install dependencies
# install dependencies into an in-project venv
RUN poetry config virtualenvs.create true \
    && poetry config virtualenvs.in-project true \
    && poetry install --no-interaction --no-ansi

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find and read the pyproject.toml for this module
fd -t f 'pyproject.toml' athena/modules/programming/module_programming_llm

# Then read its contents
if [ -f "athena/modules/programming/module_programming_llm/pyproject.toml" ]; then
  echo "=== pyproject.toml contents ==="
  cat "athena/modules/programming/module_programming_llm/pyproject.toml"
fi

Repository: ls1intum/edutelligence

Length of output: 1219


🏁 Script executed:

#!/bin/bash
# Check for other Dockerfiles using similar multi-stage build patterns with COPY --from
fd 'Dockerfile' athena/modules --type f | while read -r dockerfile; do
  if grep -q 'COPY --from=' "$dockerfile" 2>/dev/null; then
    echo "=== File: $dockerfile ==="
    head -30 "$dockerfile"
    echo ""
  fi
done

Repository: ls1intum/edutelligence

Length of output: 7957


🏁 Script executed:

#!/bin/bash
# Search for editable path dependencies pattern
rg -A 5 -B 5 'path.*=.*{.*path' athena/modules/programming/module_programming_llm 2>/dev/null || echo "No editable syntax found with = { path"

Repository: ls1intum/edutelligence

Length of output: 104


🏁 Script executed:

#!/bin/bash
# Read the complete Dockerfile for module_programming_llm
cat "athena/modules/programming/module_programming_llm/Dockerfile"

Repository: ls1intum/edutelligence

Length of output: 1381


Remove develop = true from path dependencies or copy source directories to the runtime stage.

The pyproject.toml configures path dependencies with develop = true, which creates editable installs: the .venv records where the sources live (typically through .pth files) instead of containing the package code. The builder stage copies /athena and /llm_core, but the runtime stage copies only the .venv and project files, not the source directories. At runtime those references point at missing paths, so importing these packages fails.

Either:

  1. Change path dependencies to develop = false in pyproject.toml (installs packages into .venv directly), or
  2. Copy /athena and /llm_core to the runtime stage as well.
Current Dockerfile snippet (lines 19-25)
# athena module (from the Dockerfile in the athena folder)
COPY --from=athena /code /athena
COPY --from=llm_core /code /llm_core
# install dependencies into an in-project venv
RUN poetry config virtualenvs.create true \
    && poetry config virtualenvs.in-project true \
    && poetry install --no-interaction --no-ansi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@athena/modules/programming/module_programming_llm/Dockerfile` around lines 19
- 25, The Dockerfile currently installs path dependencies as editable (develop =
true) which creates symlinks in the in-project .venv that point to /athena and
/llm_core present only in the builder stage; at runtime those symlinks will be
broken—fix by either updating the path dependencies in pyproject.toml to use
develop = false so poetry bakes the packages into the .venv, or modify this
Dockerfile to COPY --from=athena /code /athena and COPY --from=llm_core /code
/llm_core into the final runtime stage (ensure the same /athena and /llm_core
directories exist alongside the .venv), and keep references to the Dockerfile
COPY lines and pyproject.toml path dependency entries when making the change.
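
Option 2 could look roughly like the following runtime stage. This is a sketch under assumptions: `athena` and `llm_core` are the same named sources the builder stage already copies from, the base image is `python:3.11-slim`, the `CMD` mirrors the other modules, and the smoke test assumes the import names are `athena` and `llm_core`.

```dockerfile
# Runtime stage (sketch): ship both editable path dependencies with the venv.
FROM python:3.11-slim
WORKDIR /code

# Same sources and same absolute paths as in the builder stage.
COPY --from=athena /code /athena
COPY --from=llm_core /code /llm_core

# Copy only the virtual environment from the builder
COPY --from=builder /code/.venv /code/.venv
ENV PATH="/code/.venv/bin:$PATH"

# Project files
COPY . ./

# Optional build-time smoke test: fail the build if either import is broken.
# Assumes importing these packages needs no runtime configuration.
RUN python -c "import athena, llm_core"

CMD ["python", "-m", "module_programming_llm"]
```

The smoke test turns a broken path reference into a build failure rather than a startup failure in production; drop it if importing the packages has side effects.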

Comment on lines +40 to +46
COPY --from=builder /code/.venv /code/.venv
ENV PATH="/code/.venv/bin:$PATH"

# Project files
COPY . ./

-# poetry scripts don't work here
-CMD poetry run python -m module_*
CMD ["python", "-m", "module_programming_themisml"]

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Check if module_programming_themisml imports athena at runtime
# and whether athena is listed as a path dependency in pyproject.toml

echo "=== Checking pyproject.toml for athena dependency ==="
cat athena/modules/programming/module_programming_themisml/pyproject.toml 2>/dev/null | grep -A5 -i athena || echo "pyproject.toml not found or no athena reference"

echo ""
echo "=== Checking Python imports of athena in module source ==="
rg -n "from athena|import athena" athena/modules/programming/module_programming_themisml/ --type py || echo "No athena imports found"

Repository: ls1intum/edutelligence

Length of output: 3364


The runtime stage must also copy the /athena directory from the builder.

The module declares athena as a path dependency with develop = true (editable install) in pyproject.toml. Multiple source files import from athena at runtime (__main__.py, feedback_suggestions.py, etc.). An editable install created in the builder stage does not copy the package into the .venv; it records a reference to the source tree at /athena (typically through a .pth file). Copying only the .venv to the runtime stage leaves that reference pointing at a missing directory, causing ModuleNotFoundError when the module starts.

Add COPY --from=builder /athena /athena before or after line 40 to copy the athena package to the runtime stage.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@athena/modules/programming/module_programming_themisml/Dockerfile` around
lines 40 - 46, The runtime Dockerfile stage currently copies only the built
virtualenv (.venv) and project files, but editable installs created in the
builder produce symlinks inside .venv that point to /athena; to avoid
ModuleNotFoundError add a step to copy the athena source tree from the builder
into the runtime image (i.e., add COPY --from=builder /athena /athena) so the
symlink targets exist at runtime; place this COPY near the existing COPY
--from=builder /code/.venv /code/.venv (before or after) so PATH and CMD
["python", "-m", "module_programming_themisml"] continue to work.

Comment on lines 29 to 30
# athena module (from the Dockerfile in the athena folder)
COPY --from=athena /code /athena

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, find the Dockerfile and examine its full content
cd athena/modules/text/module_text_cofee
if [ -f "Dockerfile" ]; then
  echo "=== Full Dockerfile ===" 
  cat -n Dockerfile
else
  echo "Dockerfile not found in expected location"
fi

Repository: ls1intum/edutelligence

Length of output: 2167


🏁 Script executed:

#!/bin/bash
# Search for pyproject.toml in the athena module directory
find athena/modules/text/module_text_cofee -name 'pyproject.toml' -type f

Repository: ls1intum/edutelligence

Length of output: 118


🏁 Script executed:

#!/bin/bash
# Check for pyproject.toml files in parent directories as well
find athena -name 'pyproject.toml' -type f | head -20

Repository: ls1intum/edutelligence

Length of output: 782


🏁 Script executed:

#!/bin/bash
# Look for poetry.lock files
find athena -name 'poetry.lock' -type f | head -20

Repository: ls1intum/edutelligence

Length of output: 737


🏁 Script executed:

#!/bin/bash
cat -n athena/modules/text/module_text_cofee/pyproject.toml

Repository: ls1intum/edutelligence

Length of output: 1247


🏁 Script executed:

#!/bin/bash
# Check for athena dependency in poetry.lock (first 100 lines or search for athena)
grep -n 'athena\|develop = true\|path = ' athena/modules/text/module_text_cofee/poetry.lock | head -30

Repository: ls1intum/edutelligence

Length of output: 129


The athena dependency is installed in editable mode and will cause runtime failures.

Line 11 of pyproject.toml declares athena = { path = "../../../athena", develop = true }. With this editable installation the .venv only references the source tree at /athena (typically through a .pth file) rather than containing the package, and /athena exists only in the builder stage. The runtime stage copies only .venv and project files (lines 49, 55) but not the /athena directory, so the reference is broken at runtime.

Resolve this by either:

  1. Removing develop = true to install as a non-editable wheel, or
  2. Adding COPY --from=builder /athena /athena in the runtime stage.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@athena/modules/text/module_text_cofee/Dockerfile` around lines 29 - 30, The
pyproject.toml declares athena = { path = "../../../athena", develop = true }
which causes an editable install that creates symlinks to /athena only present
in the builder stage; fix by either removing develop = true so the package is
installed as a non-editable wheel, or add a runtime-stage copy of the built
/athena from the builder (e.g., add COPY --from=builder /athena /athena in the
Dockerfile runtime stage) so the symlink targets exist at runtime; update
pyproject.toml or the Dockerfile accordingly and rebuild.

Comment on lines 23 to 25
RUN poetry config virtualenvs.create true \
&& poetry config virtualenvs.in-project true \
&& poetry install --no-interaction --no-ansi

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

dockerfile="athena/modules/text/module_text_llm/Dockerfile"
pyproject="athena/modules/text/module_text_llm/pyproject.toml"
lockfile="athena/modules/text/module_text_llm/poetry.lock"

echo "== Poetry install command =="
rg -n 'poetry install' "$dockerfile"

echo
echo "== Declared dependency groups (look for dev/test/etc.) =="
rg -n '^\[tool\.poetry\.(dev-dependencies|group\.[^.]+\.dependencies)\]' "$pyproject" || echo "(no extra groups found)"

echo
echo "== Path deps + editable flags in pyproject =="
rg -n 'path\s*=|develop\s*=' "$pyproject" || echo "(no path deps or editable flags found)"

echo
echo "== Directory/editable deps in lockfile =="
rg -n 'type = "directory"|develop = true|url = "/athena"|url = "/llm_core"' "$lockfile" || echo "(no directory/editable deps found)"

Repository: ls1intum/edutelligence

Length of output: 631


🏁 Script executed:

cat -n athena/modules/text/module_text_llm/Dockerfile | sed -n '20,50p'

Repository: ls1intum/edutelligence

Length of output: 1056


Add --only main --no-root flags to poetry install command.

Line 25 runs poetry install without constraints, which installs dev/test groups by default. The pyproject.toml declares editable path dependencies (athena and llm_core with develop = true) that the build stage copies but the runtime stage (line 40) does not. This leaves the runtime .venv with broken references.

Required Dockerfile fix
 RUN poetry config virtualenvs.create true \
     && poetry config virtualenvs.in-project true \
-    && poetry install --no-interaction --no-ansi
+    && poetry install --only main --no-root --no-interaction --no-ansi
📝 Committable suggestion

Suggested change
RUN poetry config virtualenvs.create true \
&& poetry config virtualenvs.in-project true \
&& poetry install --no-interaction --no-ansi
RUN poetry config virtualenvs.create true \
&& poetry config virtualenvs.in-project true \
&& poetry install --only main --no-root --no-interaction --no-ansi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@athena/modules/text/module_text_llm/Dockerfile` around lines 23 - 25, The
Dockerfile's poetry install command in the build stage (the RUN line that calls
"poetry install --no-interaction --no-ansi") installs dev/test groups and leaves
editable path deps broken in the runtime .venv; change that invocation to
include the flags "--only main --no-root" so Poetry installs only the main group
and avoids creating a project editable install, ensuring the runtime environment
does not contain missing editable path references (adjust the RUN line that
configures poetry virtualenvs and runs poetry install).
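
A check along these lines could help confirm the fix across the Athena Dockerfiles. This is only a sketch: `check_poetry_flags` is a hypothetical helper, not part of the PR, and it assumes the flags sit on the same physical line as `poetry install`, as they do in the suggested change.

```shell
#!/bin/bash
# Hypothetical helper (not part of the PR): succeed only if every
# `poetry install` line in the given Dockerfile passes both
# `--only main` and `--no-root`.
check_poetry_flags() {
    local dockerfile="$1"
    local line found=0

    # The `|| [ -n "$line" ]` part handles a final line without a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            *"poetry install"*)
                found=1
                # Both flags must appear on the same line as the install command.
                case "$line" in *"--only main"*) ;; *) return 1 ;; esac
                case "$line" in *"--no-root"*) ;; *) return 1 ;; esac
                ;;
        esac
    done < "$dockerfile"

    # A Dockerfile without any `poetry install` line is treated as a failure.
    [ "$found" -eq 1 ]
}
```

For example, `check_poetry_flags athena/modules/text/module_text_llm/Dockerfile || echo "flags missing"` would flag the Dockerfile as it stands before the suggested change.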

Comment on lines 18 to +27
COPY iris/pyproject.toml iris/poetry.lock ./

# Copy memiris (path dependency required for resolution)
COPY memiris ../memiris

# Install poetry
RUN pip install poetry
# Create a stub package so uv can resolve the project without full source
RUN mkdir -p src/iris && echo '__version__ = "3.0.0"' > src/iris/__init__.py && touch README.MD

# Install all main dependencies (cached unless pyproject.toml or memiris changes)
RUN --mount=type=cache,target=/root/.cache/uv uv pip install .

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cd iris && cat -n Dockerfile

Repository: ls1intum/edutelligence

Length of output: 2004


🌐 Web query:

uv pip install poetry.lock dependency resolution

💡 Result:

uv pip install does not use poetry.lock as an input lockfile for dependency resolution. In uv’s “pip interface”, uv pip install -r … supports requirements.txt and pyproject.toml inputs (plus other pip-style specifiers), but poetry.lock isn’t a supported -r file format. [1]

So you have three practical options:

  1. Use Poetry to get the exact poetry.lock resolution.
    Use Poetry's install/sync workflow (e.g., poetry sync) if the lockfile is the source of truth. (Heroku, for example, treats a poetry.lock file as a signal to use Poetry.) [2]

  2. Migrate to uv's lockfile and then install deterministically with uv.
    In a Poetry project, uv can read the Poetry-style pyproject.toml; run:

    • uv sync → creates/updates uv.lock (uv's lockfile) [3]

    Then in CI/production:

    • uv sync --locked → installs exactly what's in uv.lock (no re-resolution drift). [2]

  3. If you insist on uv pip …, accept that it will resolve (not “replay” poetry.lock).
    uv's resolver will compute a solution based on the requirements you give it, and it will only “prefer” already-locked versions if you're using a uv lockfile / requirements output that uv understands. [4]

Sources:
[1] uv docs: installing packages from requirements.txt / pyproject.toml via uv pip install -r https://docs.astral.sh/uv/pip/packages/
[2] Heroku Dev Center: behavior with poetry.lock (Poetry) vs uv.lock (uv, uv sync --locked) https://devcenter2.assets.heroku.com/articles/python-behavior-in-heroku
[3] uv docs: uv project workflow overview (projects, uv sync, lockfile usage) https://docs.astral.sh/uv/
[4] uv docs: resolution, lockfile preference behavior https://docs.astral.sh/uv/concepts/resolution/


🌐 Web query:

uv tool how to enforce locked dependencies from poetry.lock

💡 Result:

uv can’t “sync from” poetry.lock directly (its project workflow enforces uv.lock via uv sync). To enforce Poetry’s locked versions with uv, use one of these approaches:

Option A (keep poetry.lock as source of truth): export → uv pip sync

  1. Export a pinned requirements file from poetry.lock:

    poetry export -f requirements.txt -o requirements.lock.txt
    # optionally: --without-hashes

    (That export command is provided by Poetry's export feature / plugin.) [1]

  2. Enforce it in the environment with uv (installs/removes to match the file):

    uv pip sync requirements.lock.txt

    (uv pip sync is uv's “make the env match the lock/requirements file” command.) [2]

Option B (recommended long-term): migrate to uv.lock and enforce with --locked

  1. Generate / maintain uv.lock, then enforce it in CI/dev:

    uv sync --locked

--locked makes uv fail if uv.lock isn’t up to date with pyproject.toml (instead of silently re-locking). [3]

Related: uv export can export from uv.lock (e.g., to requirements.txt) when you need pip-compatible output. [4]


Sources
[1] Poetry export (poetry-plugin-export) https://github.com/python-poetry/poetry-plugin-export (github.com)
[2] uv pip sync example (uv docs) https://docs.astral.sh/uv/ (docs.astral.sh)
[3] uv locking/syncing, --locked semantics https://docs.astral.sh/uv/concepts/projects/sync/ (docs.astral.sh)
[4] uv export (from uv.lock) https://docs.astral.sh/uv/concepts/projects/export/ (docs.astral.sh)


Replace uv pip install . with a locked dependency installation.

Line 27 uses uv pip install ., which ignores the poetry.lock file copied at line 18. The uv pip interface does not consume poetry.lock; it only reads pyproject.toml and requirements.txt. This bypasses the lockfile entirely and makes builds non-reproducible.

Choose one approach:

  1. Export and sync (keep Poetry's lockfile as source of truth; this requires Poetry and its export plugin to be available in the builder stage):

    RUN poetry export -f requirements.txt -o requirements.lock.txt
    RUN --mount=type=cache,target=/root/.cache/uv uv pip sync requirements.lock.txt
  2. Migrate to uv's lockfile (recommended for uv-based workflow):

    RUN --mount=type=cache,target=/root/.cache/uv uv sync --locked

    (First generate uv.lock locally via uv sync, then commit it to the repo.)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@iris/Dockerfile` around lines 18 - 27, Replace the non-reproducible "uv pip
install ." invocation so the build respects the copied poetry.lock: either
export Poetry's locked deps and sync them (use "poetry export" to produce a
requirements lockfile and then run "uv pip sync" against that lockfile) or
migrate to uv's lockfile workflow by generating and committing an "uv.lock" and
using "uv sync --locked"; update the Dockerfile step that currently runs "uv pip
install ." to one of these locked-install approaches and ensure the --mount
cache for /root/.cache/uv remains in place.
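
The choice between the two approaches can be sketched as a small shell helper that only prints the command to run. `locked_install_cmd` is a hypothetical name, not something in the repository, and the commands are the ones quoted in this comment; the export route additionally assumes Poetry and its export plugin are installed where the command runs.

```shell
#!/bin/bash
# Hypothetical helper (not part of the PR): print the locked install
# command that matches the lockfile found in a project directory.
locked_install_cmd() {
    local project_dir="$1"

    if [ -f "$project_dir/uv.lock" ]; then
        # uv's own lockfile: fail rather than re-lock if it is out of date.
        echo "uv sync --locked"
    elif [ -f "$project_dir/poetry.lock" ]; then
        # Poetry's lockfile: export pinned requirements, then sync with uv.
        echo "poetry export -f requirements.txt -o requirements.lock.txt && uv pip sync requirements.lock.txt"
    else
        echo "no lockfile found in $project_dir" >&2
        return 1
    fi
}
```

Preferring uv.lock when both files exist is a design assumption here; it mirrors the comment's recommendation to migrate to uv's lockfile for a uv-based workflow.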

@github-actions

github-actions bot commented Mar 3, 2026

Athena Test Results Summary

| | Tests | Passed ✅ | Skipped | Failed |
|---|---|---|---|---|
| Athena Test Report | 10 ran | 10 passed | 0 skipped | 0 failed |

Failing Tests Summary

| Test | Result |
|---|---|
| No test annotations available | |

@github-actions

github-actions bot commented Mar 3, 2026

📊 Detailed Coverage Table

Combining 3 coverage files...
Parsing test-results/programming_module_programming_llm_coverage.xml...
Parsing test-results/text_module_text_llm_coverage.xml...
Parsing test-results/modeling_module_modeling_llm_coverage.xml...
Combining duplicate packages...
Creating combined coverage file: test-results/combined_coverage.xml
✅ Combined coverage saved to test-results/combined_coverage.xml
📊 Combined 31 unique packages

📊 Combined Coverage Summary:

| Package | Line Rate | Branch Rate | Status |
|---|---|---|---|
| athena | 37.8% | 3.3% | |
| athena.helpers | 100.0% | 100.0% | |
| athena.helpers.programming | 33.0% | 0.0% | |
| athena.helpers.text | 0.0% | 100.0% | |
| athena.models | 0.0% | 0.0% | |
| athena.schemas | 76.5% | 8.3% | |
| athena.storage | 21.1% | 0.0% | |
| llm_core | 100.0% | 100.0% | |
| llm_core.core | 26.0% | 6.2% | |
| llm_core.loaders | 79.3% | 37.5% | |
| llm_core.loaders.model_loaders | 68.5% | 37.5% | |
| llm_core.models | 66.7% | 35.7% | |
| llm_core.models.providers | 77.2% | 56.2% | |
| llm_core.utils | 52.8% | 18.5% | |
| modeling.module_modeling_llm.module_modeling_llm | 100.0% | 100.0% | |
| modeling.module_modeling_llm.module_modeling_llm.apollon_transformer | 71.4% | 50.0% | |
| modeling.module_modeling_llm.module_modeling_llm.apollon_transformer.parser | 79.2% | 60.2% | |
| modeling.module_modeling_llm.module_modeling_llm.core | 88.9% | 50.0% | |
| modeling.module_modeling_llm.module_modeling_llm.models | 100.0% | 100.0% | |
| modeling.module_modeling_llm.module_modeling_llm.prompts | 100.0% | 100.0% | |
| modeling.module_modeling_llm.module_modeling_llm.utils | 100.0% | 50.0% | |
| programming.module_programming_llm.module_programming_llm | 100.0% | 100.0% | |
| programming.module_programming_llm.module_programming_llm.helpers | 27.6% | 0.0% | |
| programming.module_programming_llm.module_programming_llm.prompts | 100.0% | 100.0% | |
| text.module_text_llm.module_text_llm | 72.7% | 12.5% | |
| text.module_text_llm.module_text_llm.default_approach | 66.4% | 36.1% | |
| text.module_text_llm.module_text_llm.default_approach.prompts | 100.0% | 100.0% | |
| text.module_text_llm.module_text_llm.default_approach.schemas | 100.0% | 100.0% | |
| text.module_text_llm.module_text_llm.divide_and_conquer | 34.0% | 0.0% | |
| text.module_text_llm.module_text_llm.helpers | 55.4% | 26.7% | |
| text.module_text_llm.module_text_llm.self_consistency | 46.2% | 0.0% | |

Total packages: 31

Note: Coverage thresholds: ✅ (≥70%), ❌ (<70%)
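
The threshold rule in the note can be sketched in shell as follows. The report does not say which column the status is derived from, so applying it to the line rate is an assumption, and `coverage_status` is a hypothetical name rather than part of the CI workflow.

```shell
#!/bin/bash
# Hypothetical helper: map a coverage percentage to the status marker
# described in the note (>= 70% passes, anything lower fails).
# Assumption: the rule is applied to the line rate.
coverage_status() {
    # Accept both "72.7" and "72.7%" by stripping a trailing percent sign.
    local rate="${1%\%}"
    # awk handles the floating-point comparison that plain shell tests cannot.
    awk -v r="$rate" 'BEGIN { if (r + 0 >= 70) print "✅"; else print "❌" }'
}
```

Under that assumption, `coverage_status 72.7%` prints ✅ and `coverage_status 66.4%` prints ❌.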

@wasnertobias wasnertobias changed the title chore: Optimize Docker build times and image sizes Development: Optimize Docker build times and image sizes Mar 3, 2026
@github-actions

There hasn't been any activity on this pull request recently. Therefore, this pull request has been automatically marked as stale and will be closed if no further activity occurs within seven days. Thank you for your contributions.

@github-actions github-actions bot added the stale label Mar 11, 2026
@bassner bassner added this to the 2.0.0 milestone Mar 13, 2026
@github-actions github-actions bot removed the stale label Mar 14, 2026
@github-actions github-actions bot added the stale label Mar 22, 2026