Devops #35

Merged
brunoalbin23 merged 41 commits into main from devops on Dec 17, 2025
Conversation

@Locatelli-Flor
Contributor

No description provided.

Sets up Docker Compose for local development.

Defines services for the API and database (PostgreSQL with pgvector).
Configures environment variables, volumes, and health checks for both services.
Also includes a Dockerfile that uses uv to manage the Python environment and dependencies.
…Manager and RAGManager, update docker-compose.yml for service configuration, and adjust Python version in RAGManager.
Sets up GitHub Actions workflows for continuous integration and continuous deployment.

- Introduces a deployment workflow that builds and pushes Docker images to ACR, configures kubectl, and restarts deployments in a Kubernetes namespace.
- Implements a pull request validation workflow that performs secret scanning with Gitleaks, builds Docker images for validation (without pushing), runs Trivy vulnerability scans, and uploads the results to GitHub Security.
- Adds a PR summary workflow that posts a comment on the pull request with the results of the Gitleaks and build validation jobs, including a notice to check the security tab for any found vulnerabilities.
Streamlines the PR validation workflow by removing the Gitleaks job and improving the presentation of Trivy results.

The workflow now focuses on build validation and vulnerability scanning with clearer output in the PR summary. Trivy results are now displayed in a table format within the PR comment, and a direct link to the detailed results in the Actions tab is included. The Gitleaks check is removed.
Adds deployment summary to the workflow, providing detailed information about the deployed service, image, and pod status in the job summary.

It also includes a success notification with links to the deployed services and sets fail-fast to false so that a failure in one service does not cancel the deployment of the others.
Improves the deployment process by adding rollback capabilities on failure, enhanced logging, and deployment summaries in GitHub.
The changes also switch the deployment strategy from rolling restarts to image updates, add timeout configuration for deployments, and add live URL information to the success summary.
Copilot AI review requested due to automatic review settings December 14, 2025 23:38
@coderabbitai
Contributor

coderabbitai bot commented Dec 14, 2025

Warning

Rate limit exceeded

@Locatelli-Flor has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 22 minutes and 10 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 9562f67 and 34d8a48.

📒 Files selected for processing (1)
  • .github/workflows/deploy.yml (5 hunks)

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Replaces restart-based deployment with image-based updates and robust rollout/wait/logging in CI; adds Discord PR/deploy notifications; tightens Python ranges and dependencies; updates Dockerfiles to python:3.12-slim with remote uv install, uv sync and HEALTHCHECKs; small import/path adjustments in RAGManager.

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| Deployment workflow: .github/workflows/deploy.yml | Switched from rollout restart to kubectl set image; added rollout wait loop (env: ROLLOUT_TIMEOUT, READY_CHECK_RETRIES, READY_CHECK_SLEEP), non-blocking status checks, pod readiness polling, pod logs on failure, kubectl rollout undo on timeout, per-service Markdown summary, and Discord success/failure notifications; job timeout set to 15m; disabled build-push provenance. |
| PR notifications workflow: .github/workflows/pr-validation.yml | Added discord-pr-notify job that posts PR title, URL, and author to a Discord webhook on pull_request events using curl. |
| DocsManager packaging & runtime: DocsManager/pyproject.toml, DocsManager/Dockerfile, DocsManager/.python-version | Tightened Python to >=3.12,<3.14; added DB deps (sqlalchemy, asyncpg, psycopg2-binary); added start script; updated Ruff configs; Dockerfile base → python:3.12-slim, installs curl, uses remote uv installer + uv sync --no-dev --no-cache, exposes 8000, adds HEALTHCHECK, updates CMD. .python-version changed to 3.13. |
| RAGManager packaging & runtime: RAGManager/pyproject.toml, RAGManager/Dockerfile | Renamed project metadata to rag-manager; tightened Python to >=3.12,<3.14; added langchain*, pdfplumber, minio deps; adjusted Ruff target-version; Dockerfile base → python:3.12-slim, installs curl, uses remote uv installer + uv sync --no-dev --no-cache, adds HEALTHCHECK, updates CMD. |
| Code import adjustments: RAGManager/app/services/chunking_service.py, RAGManager/app/services/pipeline.py | Replaced from langchain.text_splitter import RecursiveCharacterTextSplitter with from langchain_text_splitters import RecursiveCharacterTextSplitter; aliased split_documents to document_to_chunks in pipeline.py to preserve existing usage. |

Sequence Diagram(s)

mermaid
sequenceDiagram
participant GH as GitHub Actions
participant Registry as Container Registry
participant K8s as Kubernetes API
participant Pods as Cluster Pods
GH->>Registry: Build & push image (tag = github.sha)
GH->>K8s: kubectl set image deployment= =repo:github.sha
par Rollout waiting (until ROLLOUT_TIMEOUT)
K8s->>Pods: Schedule new pods (new image)
Pods-->>K8s: Report Ready / NotReady states
K8s-->>GH: kubectl get deployment/pods (status)
GH->>K8s: Loop: kubectl get pods -> check availableReplicas (READY_CHECK_RETRIES, sleep READY_CHECK_SLEEP)
end
alt All pods Ready before timeout
GH->>GH: Print deployment summary and send success notifications (Discord, live URLs)
else Timeout or failure
GH->>K8s: kubectl describe deployment / kubectl get pods --namespace ... || true
GH->>K8s: kubectl logs (collect logs)
GH->>K8s: kubectl rollout undo deployment/
GH->>GH: Print failure summary and send failure notification (Discord)
end
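
The wait, diagnose, and roll back flow in the diagram can be sketched in Python. This is only an illustration under stated assumptions, not the workflow's actual shell code: the function names and the injected callables (`set_image`, `get_available`, `collect_logs`, `undo`) are hypothetical stand-ins for the corresponding `kubectl` invocations.

```python
import time
from typing import Callable


def wait_for_rollout(
    get_available: Callable[[], int],
    desired: int,
    retries: int,
    sleep_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until `desired` replicas are available or the retries run out."""
    for attempt in range(retries):
        if get_available() >= desired:
            return True
        # Sleep between attempts, but not after the final one.
        if attempt < retries - 1:
            sleep(sleep_seconds)
    return False


def deploy(
    set_image: Callable[[], None],
    get_available: Callable[[], int],
    collect_logs: Callable[[], None],
    undo: Callable[[], None],
    desired: int = 1,
    retries: int = 3,
    sleep_seconds: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Update the image, wait for readiness, and roll back on timeout."""
    set_image()
    if wait_for_rollout(get_available, desired, retries, sleep_seconds, sleep):
        return "success"
    # Collect diagnostics before the rollback so they describe the failed pods.
    collect_logs()
    undo()
    return "rolled_back"
```

Injecting the callables keeps the control flow testable without a cluster; in a real job each callable would wrap a `kubectl` command.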

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Review .github/workflows/deploy.yml for correct set-image usage, timeout/wait loop, failure handling and safe rollback.
  • Validate Dockerfile changes (uv install script, uv sync placement, HEALTHCHECK syntax, final CMD) for both services.
  • Confirm pyproject dependency additions and Python range changes match runtime image and CI.
  • Verify import change in chunking_service.py and aliasing in pipeline.py resolve to compatible types/behaviors.

Possibly related PRs

  • Devops cositas #5 — Modifies the same CI/CD workflows and overlaps on deploy workflow changes (restart vs image-update).

Suggested reviewers

  • JuanFKurucz

Poem

🐰 I hopped through builds and tags tonight,

swapped restarts for images shining bright.
I waited while the pods woke, one by one,
fetched logs when things were not quite done.
A carrot-cheer — deployment's now delight! 🥕

Pre-merge checks and finishing touches

❌ Failed checks (2 inconclusive)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ❓ Inconclusive | The PR title 'Devops' is vague and generic; it does not provide meaningful information about the specific changes made in the changeset. | Replace with a more specific title that describes the main change, such as 'Refactor deployment workflow and update dependencies' or similar. |
| Description check | ❓ Inconclusive | No pull request description was provided by the author, making it impossible to assess whether the description relates to the changeset. | Add a pull request description that explains the purpose and scope of the changes, including deployment workflow improvements and dependency updates. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

@github-actions

🔍 PR Validation Results

Check Status
Build ✅ success
Trivy Check Security tab

View detailed results

Contributor

Copilot AI left a comment

Pull request overview

This PR updates the DevOps configuration for the project, focusing on Python version compatibility, deployment workflow improvements, and project metadata updates.

Key Changes:

  • Updated Python version requirements for RAGManager (3.12-3.13) and DocsManager (3.13) from the previously specified 3.14
  • Enhanced GitHub Actions deployment workflow with better error handling, rollback capabilities, and improved logging
  • Updated project metadata and removed unnecessary dependencies

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| RAGManager/pyproject.toml | Updated project name, description, Python version constraint, and removed deprecated dependencies |
| DocsManager/pyproject.toml | Updated Python version requirement and added database-related dependencies |
| .github/workflows/deploy.yml | Enhanced deployment workflow with timeout configurations, failure handling, rollback mechanism, and improved status reporting |

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
DocsManager/pyproject.toml (1)

7-19: Remove duplicate dependencies with conflicting version constraints.

The dependencies list contains duplicates that will confuse pip's resolver and future maintainers:

  • sqlalchemy appears at lines 9 (>=2.0.35) and 18 (>=2.0.0)
  • psycopg2-binary appears at lines 11 (>=2.9.9) and 15 (>=2.9.0)

Consolidate to a single entry per package with the most restrictive version constraint.

 dependencies = [
     "fastapi>=0.124.2",
-    "sqlalchemy>=2.0.35",
     "asyncpg>=0.29.0",
     "psycopg2-binary>=2.9.9",
     "langgraph>=1.0.4",
     "typing-extensions>=4.15.0",
     "uvicorn>=0.38.0",
-    "psycopg2-binary>=2.9.0",
     "pgvector>=0.3.0",
     "pydantic-settings>=2.0.0",
-    "sqlalchemy>=2.0.0"
+    "sqlalchemy>=2.0.35"
 ]
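
The consolidation rule described in this comment (one entry per package, keeping the most restrictive lower bound) can be sketched as a small helper. This is an illustration only: `consolidate` is a hypothetical name, and the sketch assumes every entry is a plain `name>=version` string with numeric version parts, which holds for the list above but not for requirement strings in general.

```python
import re

# Matches only simple "name>=1.2.3" requirements (an assumption of this sketch).
_REQ = re.compile(r"^\s*([A-Za-z0-9._-]+)\s*>=\s*([0-9]+(?:\.[0-9]+)*)\s*$")


def _version_key(version: str) -> tuple[int, ...]:
    """Turn "2.0.35" into (2, 0, 35) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def consolidate(requirements: list[str]) -> list[str]:
    """Keep one entry per package, with the highest lower bound seen.

    The position of a package's first occurrence is preserved.
    """
    best: dict[str, str] = {}
    for req in requirements:
        match = _REQ.match(req)
        if match is None:
            raise ValueError(f"unsupported requirement: {req!r}")
        name, version = match.group(1).lower(), match.group(2)
        if name not in best or _version_key(version) > _version_key(best[name]):
            best[name] = version
    return [f"{name}>={version}" for name, version in best.items()]
```

For requirement strings with extras, markers, or upper bounds, a real parser such as the one in the `packaging` library would be a better fit than this regular expression.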
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b1e20d4 and a1e48e0.

📒 Files selected for processing (3)
  • .github/workflows/deploy.yml (4 hunks)
  • DocsManager/pyproject.toml (2 hunks)
  • RAGManager/pyproject.toml (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Agent
🔇 Additional comments (10)
DocsManager/pyproject.toml (1)

28-28: LGTM - Ruff configuration aligns with Python version requirements.

The target-version lowering to py313 is consistent with the requires-python constraint (>=3.13), and enabling fix=true is a good practice for CI automation.

Also applies to: 32-32

RAGManager/pyproject.toml (2)

2-6: Metadata and version constraint updates are appropriate.

Project name and description are now more explicit, and the Python version constraint with upper bound (>=3.12,<3.14) prevents compatibility issues with future Python versions.


27-27: Ruff configuration aligns with Python version requirements.

The target-version set to py312 is consistent with the updated requires-python constraint, and fix=true follows best practices for automated linting.

Also applies to: 29-29

.github/workflows/deploy.yml (7)

12-12: Timeout configuration is well-layered and appropriate.

The 15-minute job timeout, 5-minute step timeout, and 3-minute deployment timeout create a reasonable cascade that protects against hangs while allowing sufficient time for build and rollout operations.

Also applies to: 17-17


56-56: Provenance attestation disabled for build speed.

The provenance: false setting speeds up container builds by disabling SLSA provenance attestation. This is acceptable for internal services, but ensure this aligns with your security and compliance posture.

Verify that disabling provenance attestation meets your organization's security requirements.


69-84: Deployment strategy improvement: image update + rollout wait.

The shift from restarting deployments to using kubectl set image with explicit rollout status wait is a best-practice improvement. The error handling (pod info dump + explicit exit 1) provides good observability on failure.

The timeout cascade (5-minute step + 3-minute deploy timeout) is defensive and appropriate.


92-106: Automatic rollback with pre-rollback diagnostics is excellent resilience pattern.

The workflow now captures pod logs before performing automatic rollback on failure. This provides both immediate recovery and diagnostic data for post-incident analysis. The error tolerance (|| true) prevents log fetch failures from blocking the rollback.


108-129: Deployment Summary provides good visibility with Markdown formatting.

The conditional status reporting and pod listing in the GitHub Step Summary give clear, actionable deployment status. The use of || true on the kubectl command ensures the summary completes even if pod listing fails.


131-155: Notification job structure is well-designed.

The separate success and failure notification jobs with proper conditions (if: success() and if: failure()) provide clear visibility into deployment outcomes. The failure notification correctly indicates automatic rollback initiation, which aligns with the rollback logic in the main job.


142-143: Hard-coded service URLs are accessible and correctly configured.

Both URLs in the deployment notification are live and returning HTTP 200:

  • https://goland-ia-backend-docs-manager.reto-ucu.net/docs
  • https://goland-ia-backend-rag-manager.reto-ucu.net/docs

No action needed.

@github-actions

🔍 PR Validation Results

Check Status
Build ❌ failure
Trivy Check Security tab

View detailed results

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a1e48e0 and 2d0f203.

📒 Files selected for processing (2)
  • DocsManager/Dockerfile (1 hunks)
  • RAGManager/Dockerfile (1 hunks)
🧰 Additional context used
🪛 Checkov (3.2.334)
RAGManager/Dockerfile

[low] 1-21: Ensure that HEALTHCHECK instructions have been added to container images

(CKV_DOCKER_2)


[low] 1-21: Ensure that a user for the container has been created

(CKV_DOCKER_3)

DocsManager/Dockerfile

[low] 1-21: Ensure that HEALTHCHECK instructions have been added to container images

(CKV_DOCKER_2)


[low] 1-21: Ensure that a user for the container has been created

(CKV_DOCKER_3)

🪛 GitHub Actions: Deploy to Kubernetes
RAGManager/Dockerfile

[error] 14-15: Docker build failed: 'uv' command not found during 'RUN uv python pin 3.12 && uv sync --frozen --no-cache'. Ensure UV is installed and in PATH (e.g., /root/.local/bin is in PATH) before running 'uv'.

DocsManager/Dockerfile

[error] 14-15: Docker build failed: 'uv' command not found during 'RUN uv python pin 3.12 && uv sync --frozen --no-cache'. Ensure UV is installed and in PATH (e.g., /root/.local/bin is in PATH) before running 'uv'.

🪛 GitHub Actions: PR Validation
RAGManager/Dockerfile

[error] 14-15: Command failed: RUN uv python pin 3.12 && uv sync --frozen --no-cache. /bin/sh: 1: uv: not found

DocsManager/Dockerfile

[error] 14-15: Command failed: RUN uv python pin 3.12 && uv sync --frozen --no-cache. /bin/sh: 1: uv: not found

🪛 Hadolint (2.14.0)
RAGManager/Dockerfile

[warning] 5-5: Pin versions in apt get install. Instead of apt-get install <package> use apt-get install <package>=<version>

(DL3008)


[info] 5-5: Avoid additional packages by specifying --no-install-recommends

(DL3015)


[warning] 9-9: Set the SHELL option -o pipefail before RUN with a pipe in it. If you are using /bin/sh in an alpine image or if your shell is symlinked to busybox then consider explicitly setting your SHELL to /bin/ash, or disable this check

(DL4006)

DocsManager/Dockerfile

[warning] 5-5: Pin versions in apt get install. Instead of apt-get install <package> use apt-get install <package>=<version>

(DL3008)


[info] 5-5: Avoid additional packages by specifying --no-install-recommends

(DL3015)


[warning] 9-9: Set the SHELL option -o pipefail before RUN with a pipe in it. If you are using /bin/sh in an alpine image or if your shell is symlinked to busybox then consider explicitly setting your SHELL to /bin/ash, or disable this check

(DL4006)

Copilot AI review requested due to automatic review settings December 14, 2025 23:56
@github-actions

🔍 PR Validation Results

Check Status
Build ✅ success
Trivy Check Security tab

View detailed results

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (4)
RAGManager/Dockerfile (2)

5-7: Apply Dockerfile best practices for apt-get installation.

Missing --no-install-recommends flag and version pinning for curl. This increases image size and reduces reproducibility.

-RUN apt-get update && apt-get install -y \
-    curl \
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    curl=7.88.1-10+deb12u* \
     && rm -rf /var/lib/apt/lists/*

Note: Verify the exact curl version available in the python:3.12-slim base image's Debian 12 repository, or use a flexible version pinning pattern like curl=7.88.*.


9-9: Add SHELL directive to catch piped command failures.

The piped curl command can silently fail without pipefail, masking issues during build.

+SHELL ["/bin/bash", "-o", "pipefail", "-c"]
 RUN curl -LsSf https://astral.sh/uv/install.sh | sh && \
DocsManager/Dockerfile (2)

5-7: Apply Dockerfile best practices for apt-get installation.

Missing --no-install-recommends flag and version pinning for curl. This increases image size and reduces reproducibility.

-RUN apt-get update && apt-get install -y \
-    curl \
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    curl=7.88.1-10+deb12u* \
     && rm -rf /var/lib/apt/lists/*

Note: Verify the exact curl version available in the python:3.12-slim base image's Debian 12 repository.


9-9: Add SHELL directive to catch piped command failures.

The piped curl command can silently fail without pipefail, masking issues during build.

+SHELL ["/bin/bash", "-o", "pipefail", "-c"]
 RUN curl -LsSf https://astral.sh/uv/install.sh | sh && \
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2d0f203 and e7fd757.

📒 Files selected for processing (2)
  • DocsManager/Dockerfile (1 hunks)
  • RAGManager/Dockerfile (1 hunks)
🧰 Additional context used
🪛 Checkov (3.2.334)
DocsManager/Dockerfile

[low] 1-25: Ensure that a user for the container has been created

(CKV_DOCKER_3)

RAGManager/Dockerfile

[low] 1-25: Ensure that a user for the container has been created

(CKV_DOCKER_3)

🪛 Hadolint (2.14.0)
DocsManager/Dockerfile

[warning] 5-5: Pin versions in apt get install. Instead of apt-get install <package> use apt-get install <package>=<version>

(DL3008)


[info] 5-5: Avoid additional packages by specifying --no-install-recommends

(DL3015)


[warning] 9-9: Set the SHELL option -o pipefail before RUN with a pipe in it. If you are using /bin/sh in an alpine image or if your shell is symlinked to busybox then consider explicitly setting your SHELL to /bin/ash, or disable this check

(DL4006)

RAGManager/Dockerfile

[warning] 5-5: Pin versions in apt get install. Instead of apt-get install <package> use apt-get install <package>=<version>

(DL3008)


[info] 5-5: Avoid additional packages by specifying --no-install-recommends

(DL3015)


[warning] 9-9: Set the SHELL option -o pipefail before RUN with a pipe in it. If you are using /bin/sh in an alpine image or if your shell is symlinked to busybox then consider explicitly setting your SHELL to /bin/ash, or disable this check

(DL4006)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build-and-deploy (docs-manager, ./DocsManager, reto-xmas-2025-goland-ia-backend-docs-manager, doc...
  • GitHub Check: build-and-deploy (rag-manager, ./RAGManager, reto-xmas-2025-goland-ia-backend-rag-manager, rag-ma...
  • GitHub Check: Agent
🔇 Additional comments (5)
RAGManager/Dockerfile (2)

9-11: Excellent: PATH issue resolved via explicit binary relocation.

The move of UV binaries to /usr/local/bin avoids the previous /root/.cargo/bin misconfiguration entirely. This is a cleaner approach than environment variable manipulation.


22-25: Good: HEALTHCHECK and explicit module path improve reliability.

The HEALTHCHECK endpoint and explicit app.main:app module path (with --no-sync) are solid additions for deployment observability and correctness.

DocsManager/Dockerfile (3)

9-11: Excellent: PATH issue resolved via explicit binary relocation.

The move of UV binaries to /usr/local/bin avoids the previous /root/.cargo/bin misconfiguration. This is a cleaner approach and aligns both services to a consistent UV setup method.


22-25: Good: HEALTHCHECK and explicit module path improve reliability.

The HEALTHCHECK endpoint and explicit app.main:app module path (with --no-sync) are solid additions for deployment observability and correctness. Aligns with RAGManager.


1-1: Verify base image Python version consistency across services.

Base image changed from python:3.13-bookworm-slim to python:3.12-slim, aligning DocsManager with RAGManager. Ensure all dependencies in pyproject.toml are compatible with Python 3.12.

Please confirm that the DocsManager pyproject.toml explicitly declares Python 3.12 compatibility (e.g., requires-python = ">=3.12,<3.14" or similar) to prevent version mismatches at runtime or during dependency resolution.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 10 comments.

Comments suppressed due to low confidence (2)

DocsManager/Dockerfile:11

  • The uv installation script is piped directly to sh without verification. This poses a security risk as it doesn't validate the integrity of the downloaded script. Consider adding checksum verification or using a specific versioned URL. Additionally, downloading binaries from curl and installing them as root without verification is a security concern in production environments.
RUN curl -LsSf https://astral.sh/uv/install.sh | sh && \
    mv /root/.local/bin/uv /usr/local/bin/uv && \
    mv /root/.local/bin/uvx /usr/local/bin/uvx

RAGManager/Dockerfile:11

  • The uv installation script is piped directly to sh without verification. This poses a security risk as it doesn't validate the integrity of the downloaded script. Consider adding checksum verification or using a specific versioned URL. Additionally, downloading binaries from curl and installing them as root without verification is a security concern in production environments.
RUN curl -LsSf https://astral.sh/uv/install.sh | sh && \
    mv /root/.local/bin/uv /usr/local/bin/uv && \
    mv /root/.local/bin/uvx /usr/local/bin/uvx
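
The checksum verification suggested in these two comments amounts to comparing a SHA-256 digest of the downloaded installer against a value pinned in the repository. A minimal sketch of that comparison, assuming the script bytes have already been downloaded; `sha256_hex` and `verify_script` are hypothetical names, and the pinned digest would have to come from a source you trust:

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_script(data: bytes, expected_sha256: str) -> None:
    """Raise ValueError unless the data matches the pinned digest."""
    actual = sha256_hex(data)
    # compare_digest avoids leaking where the two strings first differ.
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise ValueError(
            f"checksum mismatch: expected {expected_sha256}, got {actual}"
        )
```

Only after `verify_script` returns without raising would the script be handed to a shell. A pinned digest only stays valid if the download URL is also pinned to a specific version.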

@github-actions

🔍 PR Validation Results

Check Status
Build ✅ success
Trivy Check Security tab

View detailed results

@github-actions

🔍 PR Validation Results

Check Status
Build ✅ success
Trivy Check Security tab

View detailed results

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (3)
.github/workflows/deploy.yml (3)

129-129: Fix typo in success notification (unfixed from previous review).

Line 129 still contains "ll Services Deployed!" instead of "All Services Deployed!" This was previously flagged and remains unaddressed.

-        echo "### ll Services Deployed!" >> $GITHUB_STEP_SUMMARY
+        echo "### All Services Deployed!" >> $GITHUB_STEP_SUMMARY

132-133: Parameterize hardcoded service URLs with environment variables.

Lines 132–133 contain hardcoded URLs. For better reusability across environments and easier maintenance if domains change, move these to environment variables.

Add to the env section (line 9–12):

  env:
    REGISTRY: crretoxmas2024.azurecr.io
    NAMESPACE: reto-xmas-2025-goland-ia-backend
    DEPLOY_TIMEOUT: 8m
+   DOCS_MANAGER_URL: https://goland-ia-backend-docs-manager.reto-ucu.net
+   RAG_MANAGER_URL: https://goland-ia-backend-rag-manager.reto-ucu.net

Then update the success notification:

-        echo "- [DocsManager](https://goland-ia-backend-docs-manager.reto-ucu.net/docs)" >> $GITHUB_STEP_SUMMARY
-        echo "- [RAGManager](https://goland-ia-backend-rag-manager.reto-ucu.net/docs)" >> $GITHUB_STEP_SUMMARY
+        echo "- [DocsManager](${{ env.DOCS_MANAGER_URL }}/docs)" >> $GITHUB_STEP_SUMMARY
+        echo "- [RAGManager](${{ env.RAG_MANAGER_URL }}/docs)" >> $GITHUB_STEP_SUMMARY

71-73: Container name hardcoded as api does not match actual service container names and will cause deployment failures.

Line 72 specifies api as the container name for both docs-manager and rag-manager deployments. The actual container names in the services are docs-manager and rag-manager respectively (as shown in compose.yml). The kubectl set image command will fail to find a container named api in either deployment.

Update line 72 to use the service deployment name:

    - name: Update deployment image
      run: |
        kubectl set image deployment/${{ matrix.service.deployment }} \
-         api=${{ env.REGISTRY }}/${{ matrix.service.image }}:${{ github.sha }} \
+         ${{ matrix.service.deployment }}=${{ env.REGISTRY }}/${{ matrix.service.image }}:${{ github.sha }} \
          -n ${{ env.NAMESPACE }}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c73a517 and 20cafba.

📒 Files selected for processing (1)
  • .github/workflows/deploy.yml (4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-and-deploy (docs-manager, ./DocsManager, reto-xmas-2025-goland-ia-backend-docs-manager, doc...
  • GitHub Check: build-and-deploy (rag-manager, ./RAGManager, reto-xmas-2025-goland-ia-backend-rag-manager, rag-ma...
🔇 Additional comments (3)
.github/workflows/deploy.yml (3)

12-12: Timeout configuration is well-balanced.

The job timeout (15 minutes) appropriately exceeds the kubectl rollout timeout (8 minutes), providing adequate buffer for other workflow steps and preventing orphaned processes.

Also applies to: 17-17, 78-78
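
The buffer mentioned here is simple arithmetic: a 15-minute job timeout minus an 8-minute rollout timeout leaves 7 minutes for the remaining steps. A small sketch that performs the same check, assuming the job timeout is given in whole minutes and the rollout timeout as a duration string such as `8m`; the helper names are hypothetical:

```python
import re

_UNITS = {"h": 3600, "m": 60, "s": 1}
_TOKEN = re.compile(r"(\d+)([hms])")


def parse_duration(text: str) -> int:
    """Convert strings such as "8m" or "1m30s" to whole seconds."""
    tokens = _TOKEN.findall(text)
    # Reject input with anything left over between or around the tokens.
    if not tokens or "".join(n + u for n, u in tokens) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    return sum(int(n) * _UNITS[u] for n, u in tokens)


def headroom_seconds(job_timeout_minutes: int, rollout_timeout: str) -> int:
    """Seconds left for other steps after the rollout wait is budgeted."""
    return job_timeout_minutes * 60 - parse_duration(rollout_timeout)
```

With the values in this review, `headroom_seconds(15, "8m")` evaluates to 420; a zero or negative result would mean the job could be cancelled before the rollout wait finishes.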


88-96: Good failure logging with defensive fallback.

The "Get logs on failure" step is well-implemented with proper guards, defensive error handling, and sufficient context (all containers, prefix, tail limit).


98-119: Deployment summary is well-structured with good defensive practices.

The summary provides clear per-service information with Markdown formatting, conditional status display, and a defensive fallback for kubectl commands. The if: always() guard ensures the summary is generated even on failure, which is important for diagnostics.

@github-actions

🔍 PR Validation Results

Check Status
Build ✅ success
Trivy Check Security tab

View detailed results

@github-actions

🔍 PR Validation Results

Check Status
Build ✅ success
Trivy Check Security tab

View detailed results

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

♻️ Duplicate comments (2)
.github/workflows/deploy.yml (2)

155-157: Parameterize hardcoded service URLs with environment variables.

The live service URLs are hardcoded in the workflow. If domain names or paths change, the workflow must be updated. Parameterize these as repository or organization variables to make the workflow reusable across environments.

Add environment variables at the workflow or job level:

  notify-success:
    name: Deployment Success
    runs-on: ubuntu-latest
    needs: [build-and-deploy]
    if: success()
+   env:
+     DOCS_MANAGER_URL: ${{ vars.DOCS_MANAGER_URL || 'https://goland-ia-backend-docs-manager.reto-ucu.net/docs' }}
+     RAG_MANAGER_URL: ${{ vars.RAG_MANAGER_URL || 'https://goland-ia-backend-rag-manager.reto-ucu.net/docs' }}
    steps:
    - name: Success Summary
      run: |
        echo "### All Services Deployed" >> $GITHUB_STEP_SUMMARY
        echo "" >> $GITHUB_STEP_SUMMARY
        echo "**Live URLs:**" >> $GITHUB_STEP_SUMMARY
-       echo "- [DocsManager](https://goland-ia-backend-docs-manager.reto-ucu.net/docs)" >> $GITHUB_STEP_SUMMARY
-       echo "- [RAGManager](https://goland-ia-backend-rag-manager.reto-ucu.net/docs)" >> $GITHUB_STEP_SUMMARY
+       echo "- [DocsManager](${{ env.DOCS_MANAGER_URL }})" >> $GITHUB_STEP_SUMMARY
+       echo "- [RAGManager](${{ env.RAG_MANAGER_URL }})" >> $GITHUB_STEP_SUMMARY

71-75: Container name must be parameterized from the matrix.

The kubectl set image command hardcodes the container name as api, but your actual containers are named docs-manager and rag-manager (as shown in compose.yml). This will cause the deployment to fail because kubectl cannot find a container named api in those deployments.

Use the matrix service name to parameterize the container name:

    - name: Update deployment image
      run: |
        kubectl set image deployment/${{ matrix.service.deployment }} \
-         api=${{ env.REGISTRY }}/${{ matrix.service.image }}:${{ github.sha }} \
+         ${{ matrix.service.name }}=${{ env.REGISTRY }}/${{ matrix.service.image }}:${{ github.sha }} \
          -n ${{ env.NAMESPACE }}
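
To make the shape of the command concrete, the sketch below builds the `kubectl set image` argument list from one matrix entry, with the container name supplied per service instead of hardcoded. It is an illustration under assumptions: `set_image_command` and the `container` key are hypothetical, and the real container names must be read from the Kubernetes Deployment manifests.

```python
def set_image_command(
    service: dict, registry: str, namespace: str, tag: str
) -> list[str]:
    """Build argv for `kubectl set image deployment/NAME CONTAINER=IMAGE -n NS`."""
    image = f"{registry}/{service['image']}:{tag}"
    return [
        "kubectl", "set", "image",
        f"deployment/{service['deployment']}",
        # The container name comes from the service entry, not a fixed "api".
        f"{service['container']}={image}",
        "-n", namespace,
    ]
```

Returning a list rather than a single string means the command can be passed to `subprocess.run` without going through a shell.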
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 20cafba and 9562f67.

📒 Files selected for processing (2)
  • .github/workflows/deploy.yml (5 hunks)
  • .github/workflows/pr-validation.yml (1 hunks)
🧰 Additional context used
🪛 actionlint (1.7.9)
.github/workflows/pr-validation.yml

107-107: "github.event.pull_request.title" is potentially untrusted. avoid using it directly in inline scripts. instead, pass it through an environment variable. see https://docs.github.com/en/actions/reference/security/secure-use#good-practices-for-mitigating-script-injection-attacks for more details

(expression)
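
One way to follow this advice is to pass the PR fields to the step as environment variables and let a JSON encoder do the quoting, so the title is never interpolated into a shell command. A minimal sketch, assuming hypothetical variable names `PR_TITLE`, `PR_URL`, and `PR_AUTHOR` set by the workflow step, and a webhook that accepts a JSON body with a `content` field:

```python
import json
import os
from typing import Mapping


def build_discord_payload(env: Mapping[str, str] = os.environ) -> bytes:
    """Build the webhook JSON body from environment variables."""
    title = env.get("PR_TITLE", "")
    url = env.get("PR_URL", "")
    author = env.get("PR_AUTHOR", "")
    content = f"New PR: {title}\n{url}\nby {author}"
    # Discord rejects messages longer than 2000 characters, so truncate.
    return json.dumps({"content": content[:2000]}).encode("utf-8")
```

Because `json.dumps` escapes quotes and control characters, a title containing shell metacharacters ends up as inert text in the message.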

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: PR Summary
  • GitHub Check: build-and-deploy (docs-manager, ./DocsManager, reto-xmas-2025-goland-ia-backend-docs-manager, doc...
  • GitHub Check: build-and-deploy (rag-manager, ./RAGManager, reto-xmas-2025-goland-ia-backend-rag-manager, rag-ma...
🔇 Additional comments (1)
.github/workflows/deploy.yml (1)

77-104: Robust wait logic is well-implemented.

The multi-layered approach with non-blocking rollout status followed by active replica polling with retries and descriptive failure output provides good observability and resilience. The 15-minute job timeout (line 19) aligns well with the 60-second rollout timeout plus retry logic.

@github-actions

🔍 PR Validation Results

Check Status
Build ✅ success
Trivy Check Security tab

View detailed results

@brunoalbin23
Collaborator

@coderabbitai review this pr

brunoalbin23 merged commit 7291469 into main on Dec 17, 2025
brunoalbin23 deleted the devops branch on December 17, 2025 22:55
3 participants