Conversation

@mrcfps (Contributor) commented Dec 19, 2025

  • Updated the Dockerfile to use multi-stage builds for better caching and a smaller image.
  • Implemented a cache optimization strategy using turbo prune and pnpm for dependency management.
  • Reworked the build steps to fetch dependencies and generate the Prisma client efficiently.
  • Ensured compliance with coding standards, including proper use of environment variables and comments for clarity.

Summary by CodeRabbit

  • Chores
    • Optimized Docker build process with enhanced caching strategy for improved build efficiency and faster deployment cycles.


@coderabbitai bot commented Dec 19, 2025

Walkthrough

The PR rewrites the API Dockerfile into a multi-stage build with cache optimization, introducing Pruner, Builder, and Production stages. Dependency management now uses pnpm with offline installation, turbo cache mounts for builds, and explicit wkhtmltopdf binary handling.
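
For orientation, here is a condensed sketch of how such a three-stage layout typically fits together. The base image tag and the /app/out paths match details quoted later in this review; the package name @refly/api, the corepack step, and the install flags are assumptions, and the build and Prisma steps are omitted:

# syntax=docker/dockerfile:1
FROM node:20.19.1-alpine3.20 AS pruner
RUN npm install -g turbo
WORKDIR /app
COPY . .
# Writes a pruned lockfile to /app/out/json and pruned sources to /app/out/full
RUN turbo prune @refly/api --docker

FROM node:20.19.1-alpine3.20 AS builder
RUN corepack enable
WORKDIR /app
# Lockfile-only layer first, so dependency downloads cache independently of source edits
COPY --from=pruner /app/out/json/ .
RUN pnpm fetch
COPY --from=pruner /app/out/full/ .
RUN pnpm install --offline

FROM node:20.19.1-alpine3.20 AS production
WORKDIR /app
# System packages, the wkhtmltopdf binary, and build artifacts are assembled here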

Changes

Cohort / File(s): Docker Multi-Stage Build Optimization (apps/api/Dockerfile)
Summary: Restructured into an explicit cache-optimized multi-stage workflow: a Pruner stage for turbo dependency pruning, a Builder stage with pnpm fetch and offline install, and a Production stage with system dependencies and binary setup. Environment variables adjusted (npm_config_gyp_ignore, CYPRESS_INSTALL_BINARY, NODE_OPTIONS). Build steps reordered to prioritize cache layers.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Build logic verification: Ensure multi-stage layers correctly optimize cache hits and build times
  • Dependency management: Validate turbo prune, pnpm fetch, and offline install sequence
  • Environment variable changes: Confirm NODE_OPTIONS, CYPRESS_INSTALL_BINARY, and npm_config_gyp_ignore adjustments don't break builds
  • Production image integrity: Verify wkhtmltopdf binary handling and final CMD execution

Suggested reviewers

  • lefarcen
  • CH1111

Poem

🐰 Layers stacked with care and grace,
Cache mounts speeding up the race,
Turbo prunes the dependency tree,
Faster builds than ever we see!
Multi-stages, oh so fine,
Docker magic, line by line!

Pre-merge checks

✅ Passed checks (3 passed)

  • Description check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed. The title accurately and clearly summarizes the main change: optimizing the Dockerfile through multi-stage builds and cache efficiency improvements, which directly matches the changeset's primary modifications.
  • Docstring coverage: ✅ Passed. No functions found in the changed files to evaluate docstring coverage; check skipped.


@coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
apps/api/Dockerfile (2)

16-20: Consider pinning turbo version for reproducible builds.

The global turbo install doesn't specify a version, which could lead to different behavior across builds if turbo releases a breaking change.

🔎 Proposed fix
 FROM node:20.19.1-alpine3.20 AS pruner
-RUN npm install -g turbo
+RUN npm install -g turbo@2
 WORKDIR /app

101-107: Consider pruning node_modules for production.

The builder stage installs all dependencies (including devDependencies since NODE_ENV=production isn't set during install). Copying the full node_modules to production may include unnecessary development dependencies, increasing image size.

Consider one of these approaches:

  1. Run pnpm prune --prod before copying, or
  2. Use pnpm deploy for production-only dependencies, or
  3. Set NODE_ENV=production before pnpm install (though this may break build tools)
🔎 Proposed approach using pnpm deploy

After the build step, you could add:

# Create production-ready deployment
RUN pnpm deploy --filter @refly/api --prod /app/deploy

Then in the production stage, copy from /app/deploy instead of multiple directories.
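
Sketched against the production stage, assuming the /app/deploy directory from the snippet above and an illustrative entrypoint path:

# pnpm deploy output is self-contained (own node_modules), so one copy suffices
COPY --from=builder /app/deploy ./
CMD ["node", "dist/main.js"]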

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR, between commits 8319a24 and 3fe5ec9.

📒 Files selected for processing (1)
  • apps/api/Dockerfile (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
Learnt from: CR
Repo: refly-ai/refly PR: 0
File: .cursor/rules/08-contributing-guidelines.mdc:0-0
Timestamp: 2025-11-26T05:04:26.523Z
Learning: Install dependencies including Docker, Docker Compose, Node.js v20.x (LTS), and PNPM package manager

Applied to files:

  • apps/api/Dockerfile
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build / Build
🔇 Additional comments (5)
apps/api/Dockerfile (5)

1-14: Well-documented cache strategy.

The BuildKit syntax directive and cache optimization documentation provide excellent context for maintainers. This is a good practice for complex Dockerfiles.
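
For readers unfamiliar with it, the syntax directive is the first line of a Dockerfile and opts the build into BuildKit features such as cache mounts. A generic sketch of such a header follows; the comment wording here is illustrative, not the file's exact text:

# syntax=docker/dockerfile:1
#
# Cache strategy:
#   1. Copy only the pruned lockfile output of `turbo prune` (stable layer)
#   2. `pnpm fetch` downloads dependencies into a BuildKit cache mount
#   3. Copy source last, so code edits never invalidate the dependency layers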


34-37: Good use of pnpm fetch with cache mount.

The pnpm fetch step with the cache mount correctly implements the cache-first strategy. This ensures dependencies are downloaded using only the lockfile, maximizing cache hits.
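
The pattern being praised looks roughly like this sketch; the store path follows pnpm's own Docker documentation and is an assumption about this file:

# Assumes pnpm's store-dir points at /pnpm/store (e.g. via ENV PNPM_HOME=/pnpm).
# Only the lockfile is consulted, so this layer survives any source-code change,
# and the cache mount persists downloaded packages across builds.
RUN --mount=type=cache,id=pnpm,target=/pnpm/store pnpm fetch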


53-56: Good cache mount for turbo builds.

The turbo cache mount at /app/.turbo enables incremental builds, which will significantly speed up rebuilds when only parts of the codebase change.
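
In sketch form, with the mount target taken from the walkthrough and the build command assumed (turbo available as a workspace dependency):

# Turbo's local cache persists across builds via the mount; packages whose
# inputs are unchanged are restored from cache instead of being rebuilt.
RUN --mount=type=cache,target=/app/.turbo pnpm exec turbo build --filter=@refly/api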


58-84: Well-structured production stage.

The wkhtmltopdf binary stage uses a pinned version, and the production stage consolidates all system dependencies into a single RUN layer, using apk's --no-cache flag so no package index cache is left in the image. Good practices throughout.
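
A sketch of the pattern; the helper image, its tag, and the package list are illustrative rather than taken from the file:

# Dedicated stage pins the wkhtmltopdf version (image and tag are examples)
FROM surnet/alpine-wkhtmltopdf:3.20-0.12.6-small AS wkhtmltopdf

FROM node:20.19.1-alpine3.20 AS production
# Single RUN layer; --no-cache fetches the apk index without storing it in the image
RUN apk add --no-cache libstdc++ libx11 libxrender libxext fontconfig freetype
# Copy just the pinned binary out of the helper stage
COPY --from=wkhtmltopdf /bin/wkhtmltopdf /usr/local/bin/wkhtmltopdf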


115-117: LGTM!

The CMD correctly uses the exec form, and the relative paths align with the WORKDIR set on line 113.
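
For reference, the shape being approved (paths illustrative):

WORKDIR /app/apps/api
# Exec form: node runs as PID 1 with no shell wrapper, so SIGTERM reaches it directly
CMD ["node", "dist/main.js"]

The shell form (CMD node dist/main.js) would interpose /bin/sh, which does not forward signals, making graceful shutdown unreliable.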

Comment on lines +46 to +51
# Step 4: Copy full source code (pruned to only what's needed)
COPY --from=pruner /app/out/full/ .

# Step 5: Copy prisma schema and generate client
COPY apps/api/prisma ./apps/api/prisma
RUN cd apps/api && pnpm exec prisma generate

⚠️ Potential issue | 🟡 Minor

Remove the redundant COPY of the prisma directory.

Line 50 copies apps/api/prisma, but this is already included in the pruned source copied on line 47. Since turbo prune outputs "the full source code of all internal packages needed to build the target" along with a pruned lockfile, the prisma schema is already present. The separate COPY is redundant and causes unnecessary Docker cache invalidation when prisma files change. Delete line 50 and adjust line 51 accordingly, or consolidate the operations.
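
A minimal diff applying this suggestion to the quoted lines might look like:

 # Step 4: Copy full source code (pruned to only what's needed)
 COPY --from=pruner /app/out/full/ .

-# Step 5: Copy prisma schema and generate client
-COPY apps/api/prisma ./apps/api/prisma
+# Step 5: Generate Prisma client (schema already present in the pruned source)
 RUN cd apps/api && pnpm exec prisma generate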

🤖 Prompt for AI Agents
In apps/api/Dockerfile around lines 46 to 51, remove the redundant "COPY
apps/api/prisma ./apps/api/prisma" since the pruned full source copied on line
47 already includes the prisma schema; delete that COPY and adjust the following
RUN line to run "cd apps/api && pnpm exec prisma generate" immediately after the
full-source COPY (or consolidate into a single RUN if you prefer), ensuring no
extra COPY is left that would invalidate the Docker cache when prisma files
change.
