build + o11y wip #16

Open
fn-jakubkarp wants to merge 5 commits into main from prometheus

Conversation

Member

@fn-jakubkarp fn-jakubkarp commented Feb 28, 2026

Summary by CodeRabbit

  • New Features

    • Prometheus metrics exposed and integrated; Grafana dashboards added for app and infrastructure monitoring.
  • Infrastructure Updates

    • Web server and proxy configuration reworked; Docker build/runtime layout and image base updated.
    • Docker Compose revised: service topology adjusted, Cloudflare tunnel service added, and healthchecks/ports updated.
  • Configuration

    • New environment variables for Grafana/Prometheus auth and Cloudflare tunnel; entrypoint now handles optional frontend asset staging.


coderabbitai bot commented Feb 28, 2026

📝 Walkthrough

Adds Prometheus metrics and instrumentation across HTTP middleware, SMTP, and Redis, and exposes /metrics. Replaces the single top-level Caddyfile with a multi-vhost app/Caddyfile (main site, Grafana, Prometheus, mail health). Introduces Grafana dashboards and provisioning, a Cloudflared service, Docker/entrypoint changes, and Docker Compose volume/service adjustments.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Environment configuration | `.env.example` | Removed sample domain/Redis comments; added `GRAFANA_ADMIN_PASSWORD`, `CADDY_GRAFANA_AUTH`, `CADDY_PROMETHEUS_AUTH`, and `CLOUDFLARE_TUNNEL_TOKEN`; left `DOMAIN_NAME` and `REDIS_PASSWORD` empty. |
| Top-level Caddyfile removed, app Caddyfile added | `Caddyfile`, `app/Caddyfile` | Deleted the previous top-level Caddyfile; added `app/Caddyfile` defining a multi-vhost setup (main domain with security headers and static/proxy handling, Grafana and Prometheus vhosts with basic auth, multiple health endpoints, logging, cache rules). |
| App Dockerfile & entrypoint | `app/Dockerfile`, `app/docker-entrypoint.sh` | Updated build steps (`bun install`/build path change), upgraded the base Caddy image to 2.11.1-alpine, changed asset paths (`dist` → `/srv/frontend-build`), added `docker-entrypoint.sh` and set it as `ENTRYPOINT`. |
| Backend: Prometheus integration & metrics package | `backend/go.mod`, `backend/internal/metrics/metrics.go`, `backend/internal/api/router.go` | Added Prometheus client dependency entries to `go.mod`; new `metrics` package declaring counters/gauges/histograms (HTTP, SMTP, rate limit, auth failures, Redis); registered the `/metrics` endpoint via `promhttp`. |
| Backend: middleware instrumentation | `backend/internal/api/middleware.go` | Instrumented middleware: request counting, in-flight gauges, latency histograms, endpoint normalization, rate-limit and auth-failure counters; added a `responseWriter` wrapper to capture the HTTP status. |
| Backend: SMTP & Redis instrumentation | `backend/internal/smtp/backend.go`, `backend/internal/store/redis.go` | SMTP: increment/decrement the session gauge, track emails received/rejected, record storage errors. Redis: added timing instrumentation around save/get/check operations. |
| Docker Compose & services | `docker-compose.yml` | Moved Caddy to a build context using the app Dockerfile; added the `frontend_build` volume; removed the `caddy_certs` volume; backend now `depends_on` caddy; added the `cloudflared` service and its token env var; Grafana gains provisioning/dashboard mounts and no longer publishes a host port; standardized healthcheck/command array formats. |
| Grafana dashboards & provisioning | `grafana/dashboards/coresend-overview.json`, `grafana/dashboards/infrastructure.json`, `grafana/provisioning/dashboards/default.yml`, `grafana/provisioning/datasources/prometheus.yml` | Added two Grafana dashboards (overview + infrastructure) and provisioning files: a Prometheus datasource and a file-based dashboard provider. |
| Prometheus config | `prometheus.yml` | Added a `coresend-backend` scrape job targeting `backend:8080/metrics` at a 15s interval. |

Sequence Diagram

sequenceDiagram
    participant Client
    participant Caddy
    participant Backend
    participant Middleware
    participant MetricsPkg
    participant Prometheus
    participant Grafana

    Client->>Caddy: HTTP request (static/API)
    Caddy->>Backend: Proxy /api/* or serve static
    Backend->>Middleware: Enter middleware
    Middleware->>MetricsPkg: increment in-flight, record start
    Middleware->>Backend: Call handler
    Backend->>Backend: Business logic (SMTP/DB/Redis)
    Backend->>MetricsPkg: on SMTP/Redis events (counters/histograms)
    Backend->>Middleware: Return response (status)
    Middleware->>MetricsPkg: record duration, increment totals, decrement in-flight
    Middleware->>Client: Send HTTP response

    rect rgba(0, 102, 204, 0.5)
        Prometheus->>Backend: scrape /metrics (periodic)
        Backend->>Prometheus: expose collected metrics
    end

    rect rgba(153, 102, 0, 0.5)
        Grafana->>Prometheus: query metrics (dashboards)
        Prometheus->>Grafana: return time-series data
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰
I hopped through code with nimble paws,
Counting routes and timing laws,
Dashboards bloom where metrics play,
Grafana lights the nighttime hay,
Prometheus hums — let's watch and sway. 🎋

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 12.50%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title 'build + o11y wip' is vague and uses non-descriptive abbreviations that don't clearly convey the main changes. While 'o11y' refers to observability and the PR does add Prometheus metrics and monitoring, the title lacks specificity about the primary objective. | Use a more descriptive title such as 'Add Prometheus monitoring and observability infrastructure' or 'Implement metrics collection and Grafana dashboards' to better communicate the changeset's purpose. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped because CodeRabbit's high-level summary is enabled. |


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (9)
backend/internal/store/redis.go (1)

89-93: Consider consistent label naming.

The label "get_inbox" is used for the GetEmails operation. For consistency with function names, consider "get_emails" instead, unless "get_inbox" better represents the user-facing operation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/internal/store/redis.go` around lines 89 - 93, The metrics label used
in the GetEmails function is "get_inbox" which is inconsistent with the function
name; update the metrics call inside GetEmails
(metrics.RedisOperationDuration.WithLabelValues(...)) to use "get_emails" (or
another consistent label if you prefer a user-facing name) so the metric label
matches the operation identifier used by the GetEmails method.
backend/internal/api/middleware.go (2)

182-189: Misleading function name: checks hex only, not alphanumeric.

isHexOrAlphanumeric only validates hexadecimal characters (0-9, a-f, A-F), not true alphanumeric (which would include g-z). Consider renaming for clarity.

📝 Suggested rename
-func isHexOrAlphanumeric(s string) bool {
+func isHexString(s string) bool {
 	for _, c := range s {
 		if !((c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F')) {
 			return false
 		}
 	}
 	return true
 }

Also update the call site at line 171:

-		if len(part) >= 32 || (len(part) > 8 && isHexOrAlphanumeric(part)) {
+		if len(part) >= 32 || (len(part) > 8 && isHexString(part)) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/internal/api/middleware.go` around lines 182 - 189, The function
isHexOrAlphanumeric is misnamed because it only allows hexadecimal characters;
rename it to isHexadecimal (or isHexString/isHexOnly) and update every call site
that references isHexOrAlphanumeric to use the new name (including the caller
that currently invokes it in the middleware validation logic). Also update any
comments or docstrings near the function and its usage to reflect that it
validates hex digits only, not all alphanumeric characters.

162-180: Endpoint normalization logic may be overly aggressive.

Line 171: Any path segment ≥32 chars is replaced with {id} regardless of content. This could potentially normalize legitimate long path segments. If this is intentional for cardinality control, the behavior is acceptable; otherwise, consider constraining to hex-only for all lengths.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/internal/api/middleware.go` around lines 162 - 180, normalizeEndpoint
is too aggressive because it replaces any segment with length >= 32 with "{id}",
possibly normalizing legitimate long segments; update the condition in
normalizeEndpoint to only replace segments when they match stricter identifier
patterns (e.g., hex-only of 32+ chars or explicit UUID regex) rather than purely
length-based checks, using or adding a helper like isHexOrUUID (instead of the
current isHexOrAlphanumeric) to detect address/hash segments and apply parts[i]
= "{id}" only when that stricter check returns true.
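The stricter check suggested above can be sketched in plain Go; `isHexString` and the UUID regex are illustrative stand-ins, not the PR's actual helpers:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// uuidRe matches canonical UUIDs such as 550e8400-e29b-41d4-a716-446655440000.
var uuidRe = regexp.MustCompile(`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// isHexString reports whether s is non-empty and consists only of hex digits.
func isHexString(s string) bool {
	for _, c := range s {
		if !((c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F')) {
			return false
		}
	}
	return len(s) > 0
}

// normalizeEndpoint replaces only segments that look like identifiers
// (UUIDs or long hex hashes) with "{id}", instead of any long segment.
func normalizeEndpoint(path string) string {
	parts := strings.Split(path, "/")
	for i, part := range parts {
		if uuidRe.MatchString(part) || (len(part) >= 32 && isHexString(part)) {
			parts[i] = "{id}"
		}
	}
	return strings.Join(parts, "/")
}

func main() {
	// A 64-char hex hash is normalized; a long readable slug is left alone.
	fmt.Println(normalizeEndpoint("/api/inbox/9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08"))
	fmt.Println(normalizeEndpoint("/api/some-legitimately-long-but-readable-segment"))
}
```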
app/Caddyfile (2)

61-65: Inconsistent indentation in the mail health endpoint block.

Lines 62-63 have extra leading spaces compared to the closing brace on line 64.

🧹 Proposed fix
 http://mail.{$DOMAIN_NAME} {
-        respond 200 {
+    respond 200 {
         body "mail healthy"
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/Caddyfile` around lines 61 - 65, The mail health endpoint block starting
with "http://mail.{$DOMAIN_NAME} {" has inconsistent indentation for the inner
lines ("respond 200" and "body \"mail healthy\""); fix by aligning those inner
lines with the opening brace (same indent level) and ensuring the closing brace
lines up as well so the block has consistent indentation and formatting in the
Caddyfile.

16-16: HSTS header has no effect over plain HTTP.

Strict-Transport-Security is only honored by browsers when served over HTTPS. On an HTTP-only vhost, this header is ignored. If HTTPS will be added later (via Caddy's automatic TLS), this is fine; otherwise, consider removing it to avoid confusion.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/Caddyfile` at line 16, The HSTS header "Strict-Transport-Security" in the
Caddyfile is ineffective on plain HTTP and should either be removed from the
HTTP-only vhost or applied conditionally for HTTPS only; update the Caddyfile so
the Strict-Transport-Security directive is not sent for non-TLS sites (remove
the line or move it into the HTTPS site/block or a TLS-only header block) to
ensure HSTS is only emitted when requests are served over TLS.
grafana/dashboards/coresend-overview.json (1)

120-131: Error rate may display NaN when there's no traffic.

The query 100 * sum(...) / sum(...) will return NaN if the denominator is zero (no requests). Grafana handles this gracefully, but you could append `or vector(0)` or add a `> 0` filter on the denominator for a cleaner display during low-traffic periods.

🔧 Optional: Handle zero-traffic case
"expr": "100 * sum(rate(http_requests_total{status=~\"5..\"}[5m])) / (sum(rate(http_requests_total[5m])) > 0) or vector(0)"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@grafana/dashboards/coresend-overview.json` around lines 120 - 131, The "Error
Rate (%)" gauge panel uses the expression `100 *
sum(rate(http_requests_total{status=~"5.."}[5m])) /
sum(rate(http_requests_total[5m]))` which can produce NaN when the denominator
is zero; update the panel's expr to guard against zero-traffic (for example wrap
the denominator with `(sum(rate(http_requests_total[5m])) > 0) or vector(0)` or
use `or vector(0)` on the whole division) so the gauge shows 0 instead of NaN
during low/no traffic.
backend/internal/smtp/backend.go (1)

59-66: Consider adding a rejection metric for Redis errors.

The Redis lookup error path (lines 59-65) returns a temporary error but does not increment SMTPEmailsRejectedTotal. Other rejection paths (invalid format, inactive address, storage error) all track metrics. For consistency and observability, consider adding a metric here—perhaps with a "redis_error" label or a separate counter for temporary failures.

🔧 Proposed fix
 	if err != nil {
 		log.Printf("Redis error checking address %s: %v", localPart, err)
+		metrics.SMTPEmailsRejectedTotal.WithLabelValues("redis_lookup_error").Inc()
 		return &gosmtp.SMTPError{
 			Code:         451,
 			EnhancedCode: gosmtp.EnhancedCode{4, 3, 0},
 			Message:      "Temporary server error, please try again later",
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/internal/smtp/backend.go` around lines 59 - 66, Add a metric
increment for Redis lookup failures to match other rejection paths: inside the
Redis error branch where it currently logs and returns the gosmtp.SMTPError (the
block handling err for the Redis check referencing localPart and returning
gosmtp.SMTPError with Code 451), call
SMTPEmailsRejectedTotal.WithLabelValues("redis_error").Inc() (or use the same
metric plus a new label value "redis_error") before returning so Redis errors
are counted like other rejection reasons.
docker-compose.yml (1)

29-31: Hardcoded Let's Encrypt path is brittle; trailing whitespace on line 31.

The certificate paths assume the exact ACME directory structure. If using Let's Encrypt staging or a different CA, paths will break. Consider using Caddy's file_server or environment variables for flexibility.

Also, line 31 has a trailing space after the value.

🧹 Fix trailing whitespace
-      - STATIC_DIR=/srv/frontend 
+      - STATIC_DIR=/srv/frontend
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker-compose.yml` around lines 29 - 31, The SMTP_CERT_PATH and
SMTP_KEY_PATH environment variables hardcode a specific Let's Encrypt ACME
directory which is brittle; replace the fixed ACME path by deriving the cert/key
paths from a configurable base (e.g., introduce and use a CERT_BASE_DIR or
CADDY_CERT_DIR env var) and construct SMTP_CERT_PATH and SMTP_KEY_PATH from that
variable (so different CAs/staging work), and also remove the trailing
whitespace after the STATIC_DIR value; update references to SMTP_CERT_PATH,
SMTP_KEY_PATH, and STATIC_DIR accordingly.
app/Dockerfile (1)

4-4: Consider using --frozen-lockfile for reproducible builds.

Without --frozen-lockfile, the install step may update dependencies if bun.lock drifts from package.json, leading to non-reproducible builds in CI. This is especially important for production images.

🔧 Proposed fix
-RUN bun install
+RUN bun install --frozen-lockfile
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/Dockerfile` at line 4, Replace the non-deterministic install step so the
Dockerfile's RUN bun install uses Bun's frozen lock option; update the RUN bun
install invocation to include the --frozen-lockfile flag to ensure builds fail
if bun.lock and package.json diverge (locate the RUN bun install line in the
Dockerfile and change it to use the --frozen-lockfile option).

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 94be8ee and 9466fea.

⛔ Files ignored due to path filters (2)
  • backend/go.sum is excluded by !**/*.sum
  • bun.lock is excluded by !**/*.lock
📒 Files selected for processing (16)
  • .env.example
  • Caddyfile
  • app/Caddyfile
  • app/Dockerfile
  • backend/go.mod
  • backend/internal/api/middleware.go
  • backend/internal/api/router.go
  • backend/internal/metrics/metrics.go
  • backend/internal/smtp/backend.go
  • backend/internal/store/redis.go
  • docker-compose.yml
  • grafana/dashboards/coresend-overview.json
  • grafana/dashboards/infrastructure.json
  • grafana/provisioning/dashboards/default.yml
  • grafana/provisioning/datasources/prometheus.yml
  • prometheus.yml
💤 Files with no reviewable changes (1)
  • Caddyfile

Comment on lines +5 to +9
http://localhost {
    respond 200 {
        body "healthy"
    }
}

⚠️ Potential issue | 🔴 Critical

Healthcheck endpoint mismatch will cause container health failures.

The docker-compose.yml healthcheck uses http://localhost/health, but this Caddyfile only responds on the root path (/). The healthcheck will fail with 404.

Either update this endpoint to /health or change the docker-compose healthcheck path.

🐛 Proposed fix: Add `/health` path matcher
 http://localhost {
-    respond 200 {
-        body "healthy"
-    }
+    respond /health 200 {
+        body "healthy"
+    }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/Caddyfile` around lines 5 - 9, The Caddyfile currently serves the root
path block "http://localhost" with respond 200 and body "healthy", but the
docker-compose healthcheck probes /health; update the Caddyfile so the site
block matches /health (e.g., add a path matcher for /health or change the site
address to include the /health route) so that the respond 200 and body "healthy"
are returned for requests to /health; alternatively, if you prefer to change the
healthcheck, update the docker-compose healthcheck URL to root (/). Ensure the
change touches the existing http://localhost block and the respond 200 / body
"healthy" stanza so healthcheck requests to /health succeed.

Comment on lines +36 to +37
- caddy_data:/caddy_data:ro
- frontend_build:/srv/frontend:ro

⚠️ Potential issue | 🟠 Major

Backend mounts frontend_build volume but it may be empty.

If the frontend_build volume is meant to share frontend assets between services, ensure there's a mechanism to populate it. Currently, the volume is defined as a local driver volume with no initialization.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker-compose.yml` around lines 36 - 37, The backend is mounting the named
volume frontend_build which is currently empty because no service populates it;
ensure the frontend build artifacts are written into that volume before backend
starts by adding a producer step—either have the frontend service build and
mount the same volume (frontend service writes build output into volume
frontend_build), add an init/container (e.g., a one-off copy job) that copies
the frontend dist into frontend_build before backend starts, or change the
backend mount to a bind mount of the host build directory if you already produce
files on the host; update docker-compose to make the frontend service produce
into frontend_build and add depends_on or an init step so backend gets populated
assets.

Comment on lines +46 to +62
  caddy:
    image: caddy:alpine
    build:
      context: ./app
      dockerfile: Dockerfile
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
      - "443:443/udp"
    env_file:
      - .env
    environment:
      - DOMAIN_NAME=${DOMAIN_NAME}
    volumes:
      - ./caddy/Caddyfile:/etc/caddy/Caddyfile:ro
      - ./frontend/dist:/srv/frontend:ro
      - caddy_data:/data
      - caddy_certs:/data/caddy/certificates/acme-v02.api.letsencrypt.org-directory
      - caddy_config:/config
      - frontend_build:/srv/frontend

⚠️ Potential issue | 🔴 Critical

frontend_build volume will shadow the built frontend assets.

The Caddy Dockerfile copies built assets to /srv/frontend in the image (line 9 of app/Dockerfile). However, mounting an empty frontend_build volume at /srv/frontend (line 62) will shadow those assets, resulting in an empty directory and broken frontend.

Either:

  1. Remove the frontend_build volume from the caddy service (if only backend needs it)
  2. Use a named volume with a separate build step that populates it
  3. Use a bind mount to a local directory containing the built assets
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker-compose.yml` around lines 46 - 62, The caddy service mounts the named
volume frontend_build onto /srv/frontend which shadows the built assets copied
into the Docker image; remove the frontend_build volume mount from the caddy
service in docker-compose (or replace it with a bind mount to your local build
output or a named volume that you populate in a separate build step) so that
/srv/frontend inside the caddy container uses the image-copied assets instead of
an empty volume; update the caddy service volumes block accordingly and, if you
choose the bind-mount or populate-a-volume option, ensure the host path or
population step matches the Dockerfile output path (/srv/frontend).

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/docker-entrypoint.sh`:
- Around line 11-12: The entrypoint comment promises honoring "provided
arguments" but the script always runs the default Caddy command; change
docker-entrypoint.sh to check for provided CLI args and exec them when present
(use "$@" when args exist) and only fall back to exec caddy run --config
/etc/caddy/Caddyfile --adapter caddyfile when no arguments were passed; update
the logic around the existing exec caddy run invocation to conditionally run
either "$@" or the default.
- Around line 7-8: The script currently suppresses cp errors and always prints
success; change the cp handling in docker-entrypoint.sh so you run cp -r
/srv/frontend-build/* /srv/frontend/ without redirecting stderr, capture its
exit status, and only print "Frontend assets copied to shared volume" when the
cp command succeeds; if cp fails, log a clear error (including the cp exit
status or stderr) and exit non‑zero instead of silently continuing.

In `@docker-compose.yml`:
- Around line 35-36: There is a circular depends_on between the services named
"backend" and "caddy" that will prevent Docker Compose from starting them; break
the cycle by removing one side of the dependency (either delete the caddy ->
backend depends_on entry or delete the backend -> caddy depends_on entry) and,
if sequential startup is required, rely on a proper healthcheck-based depends_on
condition (service_healthy) on only one service; update both occurrences (the
one shown and the similar entries around lines 65-66) so only a single directed
depends_on relationship remains.
- Line 130: Remove the insecure fallback default for GF_SECURITY_ADMIN_PASSWORD
in docker-compose.yml: do not use the ":-admin" fallback for
GRAFANA_ADMIN_PASSWORD; instead require the environment variable to be set (or
supply via Docker secret) so the container fails to start if no admin password
is provided, and update references to
GF_SECURITY_ADMIN_PASSWORD/GRAFANA_ADMIN_PASSWORD accordingly.
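Under those two suggestions, the relevant compose fragments might look like the following sketch (service names follow the review; the backend healthcheck command and port are assumptions, not taken from the PR):

```yaml
services:
  backend:
    # No depends_on caddy here: only one directed dependency edge remains.
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:8080/health"]
      interval: 10s
      timeout: 3s
      retries: 5

  caddy:
    depends_on:
      backend:
        condition: service_healthy

  grafana:
    environment:
      # Fail fast when unset instead of falling back to "admin".
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_ADMIN_PASSWORD:?GRAFANA_ADMIN_PASSWORD must be set}
```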

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9466fea and 0608054.

📒 Files selected for processing (3)
  • app/Dockerfile
  • app/docker-entrypoint.sh
  • docker-compose.yml

Comment on lines +7 to +8
cp -r /srv/frontend-build/* /srv/frontend/ 2>/dev/null || true
echo "Frontend assets copied to shared volume"

⚠️ Potential issue | 🟠 Major

Do not suppress copy failures while logging success.

Line 7 discards all copy errors, and Line 8 still logs success. This can silently serve stale/empty frontend assets.

Proposed fix
 if [ -d "/srv/frontend-build" ]; then
-    cp -r /srv/frontend-build/* /srv/frontend/ 2>/dev/null || true
+    mkdir -p /srv/frontend
+    cp -a /srv/frontend-build/. /srv/frontend/
     echo "Frontend assets copied to shared volume"
 fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/docker-entrypoint.sh` around lines 7-8, the script currently suppresses
cp errors and always prints success; change the cp handling in
docker-entrypoint.sh so you run cp -r /srv/frontend-build/* /srv/frontend/
without redirecting stderr, capture its exit status, and only print "Frontend
assets copied to shared volume" when the cp command succeeds; if cp fails, log a
clear error (including the cp exit status or stderr) and exit non‑zero instead
of silently continuing.
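One detail worth calling out in the `cp -a src/. dst/` form proposed above: a `src/*` glob never matches dotfiles, so the original line could also silently drop hidden files even when it succeeds. A scratch-directory sketch (temp paths, not the real `/srv` layout) contrasting the two idioms:

```shell
#!/bin/sh
# Hypothetical scratch dirs to contrast the two copy idioms.
src=$(mktemp -d); glob_dst=$(mktemp -d); dot_dst=$(mktemp -d)
touch "$src/.env" "$src/app.js"

cp -r "$src"/* "$glob_dst"/   # glob expansion skips dotfiles
cp -a "$src"/. "$dot_dst"/    # copies everything, preserving attributes

ls -A "$glob_dst"   # app.js only
ls -A "$dot_dst"    # .env and app.js
```

The `src/.` form also works when the source directory is empty, whereas the glob form would fail with "no such file or directory".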

Comment on lines +11 to +12
# Run Caddy with the provided arguments, or default to running with Caddyfile
exec caddy run --config /etc/caddy/Caddyfile --adapter caddyfile

⚠️ Potential issue | 🟡 Minor

Entrypoint comment and behavior are inconsistent about runtime arguments.

Line 11 states “provided arguments,” but Line 12 always runs the default Caddy command and ignores $@.

Proposed fix
-# Run Caddy with the provided arguments, or default to running with Caddyfile
-exec caddy run --config /etc/caddy/Caddyfile --adapter caddyfile
+# Run provided command if any, otherwise default to Caddyfile
+if [ "$#" -gt 0 ]; then
+    exec "$@"
+fi
+exec caddy run --config /etc/caddy/Caddyfile --adapter caddyfile
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/docker-entrypoint.sh` around lines 11-12, the entrypoint comment
promises honoring "provided arguments" but the script always runs the default
Caddy command; change docker-entrypoint.sh to check for provided CLI args and
exec them when present (use "$@" when args exist) and only fall back to exec
caddy run --config /etc/caddy/Caddyfile --adapter caddyfile when no arguments
were passed; update the logic around the existing exec caddy run invocation to
conditionally run either "$@" or the default.
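The arg-forwarding pattern described here can be sketched as follows (using `echo` in place of `exec` so both branches are observable in one run; the command strings are illustrative):

```shell
#!/bin/sh
# Sketch of a pass-through entrypoint: run the caller's command if one
# was given, otherwise fall back to the default Caddy invocation.
run_entrypoint() {
    if [ "$#" -gt 0 ]; then
        cmd="$*"    # a real entrypoint would: exec "$@"
    else
        cmd="caddy run --config /etc/caddy/Caddyfile --adapter caddyfile"
    fi
    echo "would exec: $cmd"
}

run_entrypoint caddy version   # would exec: caddy version
run_entrypoint                 # would exec: caddy run --config ...
```

Because `exec` replaces the shell process, only one branch ever runs in the real script, which is why the check must come before the default invocation.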

Comment on lines +35 to +36
caddy:
condition: service_started

⚠️ Potential issue | 🔴 Critical

Circular depends_on between backend and caddy will break startup.

backend -> caddy and caddy -> backend create a dependency cycle, so compose orchestration can fail before services become healthy.

Proposed fix (break the cycle)
   backend:
@@
     depends_on:
       redis:
         condition: service_healthy
-      caddy:
-        condition: service_started

Also applies to: 65-66

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker-compose.yml` around lines 35-36, there is a circular depends_on
between the services named "backend" and "caddy" that will prevent Docker
Compose from starting them; break the cycle by removing one side of the
dependency (either delete the caddy -> backend depends_on entry or delete the
backend -> caddy depends_on entry) and, if sequential startup is required, rely
on a proper healthcheck-based depends_on condition (service_healthy) on only one
service; update both occurrences (the one shown and the similar entries around
lines 65-66) so only a single directed depends_on relationship remains.

- "3000:3000"
environment:
- GF_AUTH_ANONYMOUS_ENABLED=false
- GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_ADMIN_PASSWORD:-admin}

⚠️ Potential issue | 🟠 Major

Avoid a default Grafana admin password.

Line 130 falls back to admin, which is a weak default credential.

Proposed fix
-      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_ADMIN_PASSWORD:-admin}
+      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_ADMIN_PASSWORD:?set GRAFANA_ADMIN_PASSWORD}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker-compose.yml` at line 130, remove the insecure fallback default for
GF_SECURITY_ADMIN_PASSWORD in docker-compose.yml: do not use the ":-admin"
fallback for GRAFANA_ADMIN_PASSWORD; instead require the environment variable to
be set (or supply via Docker secret) so the container fails to start if no admin
password is provided, and update references to
GF_SECURITY_ADMIN_PASSWORD/GRAFANA_ADMIN_PASSWORD accordingly.
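The `${VAR:?message}` expansion used in the proposed fix is standard POSIX shell (and is also honored by Compose variable substitution): when the variable is unset or empty, expansion fails with the message instead of substituting a default. A quick sketch:

```shell
#!/bin/sh
# Demonstrates ${VAR:?msg}: an unset or empty variable aborts the
# command instead of silently falling back to a weak default.
unset GRAFANA_ADMIN_PASSWORD
if sh -c ': "${GRAFANA_ADMIN_PASSWORD:?set GRAFANA_ADMIN_PASSWORD}"' 2>/dev/null; then
    status="started"
else
    status="refused to start"
fi
echo "$status"
```

Applied in docker-compose.yml, the same expansion makes `docker compose up` fail fast with a clear message when `GRAFANA_ADMIN_PASSWORD` is missing, rather than booting Grafana with `admin` as the password.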
