Devops cositas #5

Merged: Locatelli-Flor merged 9 commits into main from devops on Dec 14, 2025
Conversation

@Locatelli-Flor
Contributor

This pull request sets up the foundational infrastructure for a Python API project with a PostgreSQL database (with pgvector extension), including containerization and initial database schema for both document storage (RAG vector store) and chat functionality. The changes introduce Docker and Docker Compose configuration files, and a SQL script to initialize the database schema with the necessary tables and extensions.

Infrastructure and containerization:

  • Added a Dockerfile that defines a Python 3.13-based container for running the API using uvicorn, installs dependencies with uv, and exposes port 8000.
  • Introduced a compose.yml file to orchestrate two services: the API (built from the Dockerfile) and a PostgreSQL database with the pgvector extension, including persistent volumes, healthchecks, and environment variable management.

Database schema and initialization:

  • Added db-init/01-initial-setup.sql to initialize the database with required extensions (pgvector, uuid-ossp) and create tables for document storage (documents and document_chunks with vector embeddings and HNSW index) and chat (chat_sessions with UUIDs and chat_messages with sender type).

Sets up Docker Compose for local development.

Defines services for the API and database (PostgreSQL with pgvector).
Configures environment variables, volumes, and health checks for both services.
Also includes a Dockerfile that uses uv to manage the Python environment and dependencies.
Copilot AI review requested due to automatic review settings December 11, 2025 23:00
Contributor

Copilot AI left a comment

Pull request overview

This pull request establishes the foundational DevOps infrastructure for a Python-based RAG (Retrieval-Augmented Generation) API with chat functionality. It introduces containerization using Docker and Docker Compose, sets up a PostgreSQL database with pgvector extension for vector similarity search, and defines the initial database schema for document storage and chat sessions.

Key Changes

  • Container orchestration with Docker Compose for API service and PostgreSQL database with pgvector support
  • Database schema initialization with tables for document chunking with vector embeddings (using HNSW indexing) and chat session management
  • Docker configuration using Python 3.13 base image with uv package manager and uvicorn server

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| Dockerfile | Defines Python container image with uv package manager and uvicorn server configuration |
| compose.yml | Orchestrates API and PostgreSQL services with health checks, volume persistence, and environment configuration |
| db-init/01-initial-setup.sql | Initializes database with pgvector extension, RAG vector store tables (documents/chunks), and chat functionality tables (sessions/messages) |

@JuanFKurucz
Member

@CodeRabbit review this pr

@coderabbitai
Contributor

coderabbitai bot commented Dec 12, 2025

@JuanFKurucz: I'll review the changes in this pull request for you.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Contributor

coderabbitai bot commented Dec 12, 2025

Warning

Rate limit exceeded

@Locatelli-Flor has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 10 minutes and 10 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 211feaa and 7dcbb38.

📒 Files selected for processing (1)
  • .github/workflows/deploy.yml (1 hunks)

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Adds docker-compose services (docs-manager, rag-manager, db, rabbitmq), an initial Postgres schema with pgvector and chat tables, Dockerfiles for both managers, a Python version tweak, and two GitHub Actions workflows for deploy and PR validation.

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| Compose & Services: compose.yml | Adds docker-compose config for docs-manager, rag-manager, db (pgvector image), and rabbitmq with build/image, container names, ports, env files, volumes, healthchecks, and service dependencies (service_healthy) |
| Database initialization: db-init/01-initial-setup.sql | Adds SQL to enable vector and uuid-ossp extensions; creates documents, document_chunks (with embedding vector(1536), HNSW index), chat_sessions, chat_messages, related types, constraints, and indexes |
| Application containers: DocsManager/Dockerfile, RAGManager/Dockerfile | Adds Dockerfiles using Python/uv base image: set workdir, copy pyproject.toml and uv.lock*, run uv sync (with --frozen --no-cache fallback), copy project, expose port 8000, and run uvicorn main:app |
| Runtime config: RAGManager/.python-version | Updates Python version string from 3.14 to 3.12 |
| CI/CD workflows: .github/workflows/deploy.yml, .github/workflows/pr-validation.yml | Adds deploy workflow (matrix build/push, ACR login, kubectl rollout restart/wait) and PR validation workflow (matrix build, Trivy vulnerability scans with SARIF upload, and a PR summary comment) |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Pay attention to:
    • compose.yml: service dependency ordering, healthcheck commands and timeouts, volume mounts.
    • db-init/01-initial-setup.sql: embedding dimension (1536), HNSW index params, constraints, cascade behaviors, and extension enabling.
    • Dockerfiles: dependency install command patterns and fallback behavior (uv sync flags).
    • GitHub Actions: build/push tags, secret usage for kubeconfig/ACR, rollout wait timeouts, and SARIF upload steps.
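
On the embedding dimension point above, a minimal Python sketch of a pre-insert check may help. It assumes embeddings are written to the vector(1536) column as pgvector text literals of the form `[x1,x2,...]`; the function name `to_pgvector_literal` is hypothetical and not part of this PR.

```python
from typing import Sequence

EMBEDDING_DIM = 1536  # matches the vector(1536) column named in the walkthrough


def to_pgvector_literal(embedding: Sequence[float], dim: int = EMBEDDING_DIM) -> str:
    """Return the pgvector text form '[x1,x2,...]' after checking the dimension.

    Hypothetical helper for illustration; it is not code from this repository.
    """
    if len(embedding) != dim:
        # Failing here gives a clearer error than a rejected INSERT would.
        raise ValueError(f"expected {dim} dimensions, got {len(embedding)}")
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"
```

Checking the length in the application keeps a mismatched embedding model from surfacing only as a database error at insert time.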

Poem

🐰
I hopped through compose with a cheerful beat,
vectors and chunks lined up neat.
Two managers snug in container beds,
RabbitMQ and Postgres nodding heads.
Tiny paws, big changes — rollouts complete!

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ❓ Inconclusive | The title 'Devops cositas' is vague and non-descriptive. While it relates to the DevOps infrastructure changes in the PR, it lacks clarity and specificity about what is actually being added or configured. | Use a more specific and descriptive title such as 'Add Docker Compose and database initialization for PostgreSQL with pgvector' to clearly convey the main infrastructure changes. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description check | ✅ Passed | The pull request description is comprehensive and directly related to the changeset, clearly explaining the infrastructure setup, containerization changes, and database schema initialization with specific details about services, extensions, and tables. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

♻️ Duplicate comments (6)
Dockerfile (3)

1-1: Pin the base image by digest (supply-chain hardening).

ghcr.io/astral-sh/uv:python3.13-bookworm-slim is a mutable tag; please pin to an immutable digest and update deliberately.


1-1: Verify Python version compatibility with pyproject.toml.

If pyproject.toml requires >=3.14, this python3.13 base image will break installs/runtime. Please verify and align the image tag accordingly.


13-13: Don’t hardcode --reload in the container command.

--reload is development-only; make it environment-driven (or split dev/prod targets).

-CMD ["uv", "run", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--reload"]
+CMD ["sh","-c","uv run uvicorn main:app --host 0.0.0.0 --port 8000 ${UVICORN_RELOAD:+--reload}"]
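
The `${UVICORN_RELOAD:+--reload}` expansion in the suggested CMD adds `--reload` only when `UVICORN_RELOAD` is set to a non-empty value. As a rough illustration of that rule, here is the same decision sketched in Python; the helper name `uvicorn_args` is hypothetical and the argument list simply mirrors the CMD above.

```python
import os
from typing import Mapping


def uvicorn_args(env: Mapping[str, str] = os.environ) -> list[str]:
    """Build the uvicorn argv, appending --reload only when UVICORN_RELOAD is non-empty.

    Hypothetical helper; it mirrors the shell expansion ${UVICORN_RELOAD:+--reload}.
    """
    args = ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
    # Like the shell form, any non-empty value counts, including "0" or "false".
    if env.get("UVICORN_RELOAD"):
        args.append("--reload")
    return args
```

Note that, as in the shell version, `UVICORN_RELOAD=false` still enables reload; only an unset or empty variable disables it.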

compose.yml (2)

8-10: Don’t ship the source bind-mount as the default “prod-like” compose.

- .:/app is great for local dev, but it undermines image immutability and can create confusing runtime drift. Consider a compose.override.yml (or profiles) for dev-only mounts.


25-27: Avoid exposing Postgres to the host unless required.

Remove ports: "5432:5432" for safer defaults; keep DB reachable via the internal compose network (or gate it behind a dev profile).

db-init/01-initial-setup.sql (1)

54-55: Fix invalid CREATE TYPE IF NOT EXISTS usage (the init script will fail).

PostgreSQL's CREATE TYPE has no IF NOT EXISTS clause, so this statement is a syntax error and initialization breaks when it runs. To keep the script safe to re-run, use a guarded DO block:

-CREATE TYPE IF NOT EXISTS sender_type AS ENUM ('user', 'assistant', 'system');
+DO $$
+BEGIN
+  IF NOT EXISTS (SELECT 1 FROM pg_type WHERE typname = 'sender_type') THEN
+    CREATE TYPE sender_type AS ENUM ('user', 'assistant', 'system');
+  END IF;
+END
+$$;

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 81ea719 and 89f9ff2.

📒 Files selected for processing (3)
  • Dockerfile (1 hunks)
  • compose.yml (1 hunks)
  • db-init/01-initial-setup.sql (1 hunks)
🧰 Additional context used
🪛 Checkov (3.2.334)
Dockerfile

[low] 1-13: Ensure that HEALTHCHECK instructions have been added to container images

(CKV_DOCKER_2)


[low] 1-13: Ensure that a user for the container has been created

(CKV_DOCKER_3)

🔇 Additional comments (1)
compose.yml (1)

10-13: depends_on.condition: service_healthy portability caveat—verify your target runtime.

This works with Docker Compose implementations, but isn’t supported in Swarm; if you ever deploy beyond local compose, add app-level retry/backoff on DB connections too.
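
A minimal sketch of the app-level retry/backoff mentioned above, in Python. It assumes the caller passes in its own `connect` callable and that connection failures surface as `OSError` subclasses (for example `ConnectionRefusedError`); a real database driver raises its own exception types, so the `except` clause would need adjusting. The name `connect_with_backoff` and its parameters are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_backoff(
    connect: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds, doubling the wait after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return connect()
        except OSError:
            # Assumption: failures are OSError subclasses; swap in the driver's error type.
            if attempt == attempts - 1:
                raise  # out of attempts, let the caller see the real error
            sleep(min(max_delay, base_delay * 2 ** attempt))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the backoff schedule can be checked in tests without waiting; production callers would leave it at the default.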

…Manager and RAGManager, update docker-compose.yml for service configuration, and adjust Python version in RAGManager.
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

♻️ Duplicate comments (4)
DocsManager/Dockerfile (2)

1-1: Python version mismatch with pyproject.toml requirements.

As noted in a previous review, this Dockerfile uses Python 3.13 while pyproject.toml requires Python >=3.14. This will cause runtime compatibility issues.

Apply this diff to use Python 3.14:

-FROM ghcr.io/astral-sh/uv:python3.13-bookworm-slim
+FROM ghcr.io/astral-sh/uv:python3.14-bookworm-slim

1-1: Pin base image to immutable digest for supply-chain security.

As flagged in a previous review, using the mutable tag python3.13-bookworm-slim creates a supply-chain risk. If the upstream image is compromised or retagged, builds could pull malicious content.

Pin to a specific digest by first obtaining it:

#!/bin/bash
# Description: Get the current digest for the base image

docker pull ghcr.io/astral-sh/uv:python3.14-bookworm-slim
docker inspect ghcr.io/astral-sh/uv:python3.14-bookworm-slim --format='{{index .RepoDigests 0}}'

Then update the Dockerfile to use the digest format:

-FROM ghcr.io/astral-sh/uv:python3.13-bookworm-slim
+FROM ghcr.io/astral-sh/uv@sha256:<digest>
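
To keep digest pinning from regressing, a small check along these lines could run in CI. This is only a sketch in Python under stated assumptions: the function name `unpinned_base_images` is hypothetical, it treats any FROM reference without a trailing `@sha256:<64 hex chars>` as unpinned, and it does not understand multi-stage aliases (a `FROM builder` line would be reported).

```python
import re

# A pinned reference ends with @sha256: followed by 64 lowercase hex characters.
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def unpinned_base_images(dockerfile_text: str) -> list[str]:
    """Return the image references in FROM lines that are not pinned by digest."""
    images = []
    for line in dockerfile_text.splitlines():
        parts = line.split()
        if len(parts) < 2 or parts[0].upper() != "FROM":
            continue
        # Skip optional flags such as --platform=linux/amd64.
        refs = [p for p in parts[1:] if not p.startswith("--")]
        if refs and refs[0] != "scratch" and not DIGEST_RE.search(refs[0]):
            images.append(refs[0])
    return images
```

Run against the Dockerfile text, an empty result means every base image is referenced by digest.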

compose.yml (2)

43-43: Pin pgvector image to immutable digest for supply-chain security.

As noted in a previous review, using the mutable tag pg16 allows the image to be silently updated, breaking reproducibility and introducing supply-chain risks.

Obtain the digest and pin the image:

#!/bin/bash
# Description: Get the current digest for the pgvector image

docker pull pgvector/pgvector:pg16
docker inspect pgvector/pgvector:pg16 --format='{{index .RepoDigests 0}}'

Then update line 43:

-    image: pgvector/pgvector:pg16
+    image: pgvector/pgvector@sha256:<digest>

51-52: Database port exposure is a security risk.

As flagged in a previous review, exposing port 5432 to the host makes the database accessible outside the Docker network. In production, the database should only be accessible through the internal network.

If external access is not required, remove the port mapping:

-    ports:
-      - "5432:5432"

The application services (docs-manager, rag-manager) can still access the database via the internal Docker network using the service name db:5432.
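
As a rough illustration of reaching the database by service name, the Python sketch below builds a connection URL that defaults to host `db` and port 5432. It assumes the services receive the standard `POSTGRES_USER`, `POSTGRES_PASSWORD`, and `POSTGRES_DB` variables used by the postgres image; `DB_HOST`, `DB_PORT`, and the function name `database_url` are hypothetical and may not match this repository's env files.

```python
import os
from typing import Mapping
from urllib.parse import quote


def database_url(env: Mapping[str, str] = os.environ) -> str:
    """Build a PostgreSQL URL that targets the compose service name by default."""
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")  # escape characters such as '@'
    host = env.get("DB_HOST", "db")    # hypothetical variable; "db" is the compose service name
    port = env.get("DB_PORT", "5432")  # hypothetical variable
    name = quote(env["POSTGRES_DB"], safe="")
    return f"postgresql://{user}:{password}@{host}:{port}/{name}"
```

With this default, removing the host port mapping does not affect the application containers, since they never go through the host.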

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 89f9ff2 and 0bd3a5e.

📒 Files selected for processing (4)
  • DocsManager/Dockerfile (1 hunks)
  • RAGManager/.python-version (1 hunks)
  • RAGManager/Dockerfile (1 hunks)
  • compose.yml (1 hunks)
🧰 Additional context used
🪛 Checkov (3.2.334)
RAGManager/Dockerfile

[low] 1-13: Ensure that HEALTHCHECK instructions have been added to container images

(CKV_DOCKER_2)


[low] 1-13: Ensure that a user for the container has been created

(CKV_DOCKER_3)

DocsManager/Dockerfile

[low] 1-13: Ensure that HEALTHCHECK instructions have been added to container images

(CKV_DOCKER_2)


[low] 1-13: Ensure that a user for the container has been created

(CKV_DOCKER_3)

🔇 Additional comments (3)
compose.yml (3)

66-68: Consider security implications of exposing RabbitMQ management port.

Port 15672 exposes the RabbitMQ management interface to the host. If this is a production or shared environment, unauthorized users could access queue metrics, configuration, and potentially sensitive message data.

If the management interface is only needed for local development:

  1. Option 1: Remove the management port exposure entirely and use rabbitmq:3.13-alpine (without management plugin).
  2. Option 2: Keep it but ensure proper authentication and consider restricting access via firewall rules or network policies in production.
     ports:
       - "5672:5672"
-      - "15672:15672"

57-61: LGTM: Well-configured healthcheck for database service.

The healthcheck correctly uses pg_isready to verify the database is accepting connections, with appropriate intervals and retries. This ensures dependent services (docs-manager, rag-manager) only start after the database is ready.


77-81: LGTM: Well-configured healthcheck for RabbitMQ service.

The healthcheck appropriately uses rabbitmq-diagnostics ping with reasonable intervals and retries, ensuring the rag-manager service starts only after RabbitMQ is ready.

Sets up GitHub Actions workflows for continuous integration and continuous deployment.

- Introduces a deployment workflow that builds and pushes Docker images to ACR, configures kubectl, and restarts deployments in a Kubernetes namespace.
- Implements a pull request validation workflow that performs secret scanning with Gitleaks, builds Docker images for validation (without pushing), runs Trivy vulnerability scans, and uploads the results to GitHub Security.
- Adds a PR summary workflow that posts a comment on the pull request with the results of the Gitleaks and build validation jobs, including a notice to check the security tab for any found vulnerabilities.
Copilot AI review requested due to automatic review settings December 13, 2025 14:50
@github-actions

PR Validation Results

  • Gitleaks: Failed
  • Build: Failed
  • Trivy: Check security tab for vulnerabilities

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0bd3a5e and 38a24d6.

📒 Files selected for processing (2)
  • .github/workflows/deploy.yml (1 hunks)
  • .github/workflows/pr-validation.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Agent
🔇 Additional comments (5)
.github/workflows/deploy.yml (2)

1-27: Well-structured matrix strategy for multi-service deployment.

The matrix setup allows clean scaling of the workflow to additional services without code duplication. Job structure and environment variables are appropriately defined.


35-52: ACR authentication and Docker build configuration looks sound.

The build strategy with latest/sha tagging and registry-based caching is appropriate. Note: if you plan to support multiple architectures (arm64), you may want to extend the platforms matrix.

.github/workflows/pr-validation.yml (3)

14-26: Gitleaks configuration is sound.

Secret scanning with full git history is appropriate for pre-merge validation.


45-55: Docker build without push is appropriate for PR validation.

Local build avoids unnecessary registry traffic while still enabling security scanning.


65-70: SARIF upload logic depends on critical fix above.

This step assumes the Trivy scan produced a SARIF file. Once the image reference issue is fixed (allowing Trivy to scan successfully), this step will work correctly.

Streamlines the PR validation workflow by removing the Gitleaks job and improving the presentation of Trivy results.

The workflow now focuses on build validation and vulnerability scanning with clearer output in the PR summary. Trivy results are now displayed in a table format within the PR comment, and a direct link to the detailed results in the Actions tab is included. The Gitleaks check is removed.
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 7 comments.


Comment on lines +21 to +26
    image: reto-xmas-2025-goland-ia-backend-docs-manager
    deployment: docs-manager
  - name: rag-manager
    path: ./RAGManager
    image: reto-xmas-2025-goland-ia-backend-rag-manager
    deployment: rag-manager

Copilot AI Dec 13, 2025

The image name contains "goland" which appears to be a typo for "golang". GoLand is a JetBrains IDE, while Golang (or Go) is the programming language. If this is meant to reference the Go language, it should be corrected to "golang".

Comment on lines +73 to +76
  if: always()
  steps:
    - name: PR Comment
      uses: actions/github-script@v7

Copilot AI Dec 13, 2025

The Trivy scan in the "Print Trivy results" step attempts to scan an image that was built with "push: false" on line 53, meaning the image only exists in the local buildx cache and is not available to the separate docker run command. This will cause the step to fail because the image cannot be found. Either enable pushing to a temporary registry or use the Trivy action's scan-type: 'fs' to scan the filesystem directly.

@github-advanced-security

This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.

@github-actions

🔍 PR Validation Results

| Check | Status |
| --- | --- |
| Build | ✅ success |
| Trivy | Check Security tab |

View detailed results

Adds deployment summary to the workflow, providing detailed information about the deployed service, image, and pod status in the job summary.

Also, it includes a success notification with links to deployed services and sets fail-fast to false to ensure all services are deployed.
@github-actions

🔍 PR Validation Results

| Check | Status |
| --- | --- |
| Build | ✅ success |
| Trivy | Check Security tab |

View detailed results

@Locatelli-Flor merged commit b1e20d4 into main on Dec 14, 2025 (4 of 5 checks passed)
@Locatelli-Flor deleted the devops branch on December 14, 2025 at 23:09
@coderabbitai (bot) mentioned this pull request on Dec 14, 2025
4 participants