Pull request overview
This PR integrates RabbitMQ message queue functionality into the RAG Manager service to enable asynchronous PDF processing. The integration allows the service to consume MinIO events from a RabbitMQ queue and automatically process uploaded PDF files through the existing pipeline without requiring direct API calls.
Key Changes:
- Added RabbitMQ consumer that processes MinIO S3 bucket events for automatic PDF processing
- Integrated the consumer as a daemon thread that starts on application startup
- Added configuration management for RabbitMQ connection parameters with URL encoding support
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 14 comments.
| File | Description |
|---|---|
| RAGManager/pyproject.toml | Added pika>=1.3.0 dependency for RabbitMQ client support |
| RAGManager/main.py | Integrated RabbitMQ consumer startup in daemon thread; added health check endpoints |
| RAGManager/app/workers/pdf_processor_consumer.py | New consumer implementation for processing MinIO events and triggering PDF pipeline |
| RAGManager/app/workers/__init__.py | New workers package initialization file |
| RAGManager/app/core/rabbitmq.py | New RabbitMQ connection handler with connection management and message consumption |
| RAGManager/app/core/config.py | Added RabbitMQ configuration settings and connection URL property with credential encoding |
```python
logger.info("Starting PDF processor consumer")

try:
    # Create RabbitMQ connection
    rabbitmq = RabbitMQConnection()
    rabbitmq.connect()

    # Start consuming messages
    queue_name = settings.rabbitmq_queue_name
    logger.info(f"Consuming messages from queue: {queue_name}")

    rabbitmq.consume_messages(
        queue_name=queue_name,
        callback=message_callback
    )

except KeyboardInterrupt:
    logger.info("Consumer interrupted by user")
except Exception as e:
    logger.error(f"Fatal error in consumer: {e}", exc_info=True)
    raise
```
The start_consumer function does not handle connection failures gracefully after startup. If the RabbitMQ connection drops during operation, the consumer will crash and the exception will be raised in the daemon thread, but there's no mechanism to restart it. Consider implementing automatic reconnection logic with exponential backoff to make the consumer more resilient to network issues.
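A reconnect loop with exponential backoff could be sketched as below. `run_with_reconnect` is a hypothetical wrapper, not part of the PR; it assumes a blocking callable like the existing `start_consumer` that raises on connection loss.

```python
import logging
import time

logger = logging.getLogger(__name__)


def run_with_reconnect(start_fn, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Run a blocking consumer, reconnecting with exponential backoff on failure.

    start_fn is expected to block while consuming and raise if the
    connection drops; a clean return means the consumer was stopped.
    """
    delay = base_delay
    while True:
        try:
            start_fn()
            return  # consumer exited cleanly
        except KeyboardInterrupt:
            logger.info("Consumer interrupted by user")
            return
        except Exception as e:
            logger.error(f"Consumer failed: {e}; reconnecting in {delay:.1f}s")
            sleep(delay)
            delay = min(delay * 2, max_delay)  # back off up to max_delay
```

The daemon thread would then target `lambda: run_with_reconnect(start_consumer)` instead of `start_consumer` directly.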
```python
url = settings.rabbitmq_url
logger.info(f"Connecting to RabbitMQ at {settings.rabbitmq_host}:{settings.rabbitmq_port}")
logger.debug(
    f"RabbitMQ URL: amqp://{settings.rabbitmq_user}:***@{settings.rabbitmq_host}:{settings.rabbitmq_port}/"
```
The RabbitMQ connection credentials are logged in debug output, which could expose sensitive information if debug logging is enabled in production. Credentials should be redacted in all log messages: use '***' for every credential field in the debug output, not just the password on line 26.
Suggested change:
```diff
- f"RabbitMQ URL: amqp://{settings.rabbitmq_user}:***@{settings.rabbitmq_host}:{settings.rabbitmq_port}/"
+ f"RabbitMQ URL: amqp://***:***@{settings.rabbitmq_host}:{settings.rabbitmq_port}/"
```
```python
consumer_thread = threading.Thread(target=start_consumer, daemon=True)
consumer_thread.start()
```
The RabbitMQ consumer is started in a daemon thread without any error handling or graceful shutdown mechanism. If the consumer thread crashes or the connection fails after startup, the application will continue running but will silently stop processing messages. Consider storing the thread reference and implementing proper error monitoring or a shutdown event handler to detect and log consumer failures.
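One way to make such failures visible is to wrap the thread target so any crash is logged and flagged. This is a hypothetical sketch; `consumer_failed` and `start_supervised_consumer` are not names from the PR, and the event could back the service's health check endpoint.

```python
import logging
import threading

logger = logging.getLogger(__name__)

# Set when the consumer thread dies; a /health endpoint could check this.
consumer_failed = threading.Event()


def supervised(target):
    """Wrap a thread target so crashes are logged instead of vanishing silently."""
    def runner():
        try:
            target()
        except Exception:
            logger.exception("RabbitMQ consumer thread crashed")
            consumer_failed.set()
    return runner


def start_supervised_consumer(target):
    """Start the consumer in a named daemon thread and keep the reference."""
    thread = threading.Thread(
        target=supervised(target), daemon=True, name="rabbitmq-consumer"
    )
    thread.start()
    return thread
```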
```python
consumer_thread = threading.Thread(target=start_consumer, daemon=True)
consumer_thread.start()
logging.info("RabbitMQ consumer started successfully")
```
The RabbitMQ connection is not being closed properly on application shutdown. The consumer thread is a daemon thread that will be forcefully terminated when the main process exits, potentially leaving the RabbitMQ connection open. Consider implementing a shutdown event handler that calls rabbitmq.close() to ensure graceful cleanup of resources.
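A small handle object could own both the connection and the thread so a shutdown hook has one thing to call. This is a sketch with hypothetical names; the FastAPI wiring in the trailing comment assumes the app in main.py exposes lifecycle events.

```python
import logging

logger = logging.getLogger(__name__)


class ConsumerHandle:
    """Holds the consumer thread and connection so shutdown can clean up."""

    def __init__(self, connection, thread):
        self.connection = connection
        self.thread = thread

    def close(self, join_timeout=5.0):
        """Close the connection (unblocking the consume loop) and join the thread."""
        try:
            self.connection.close()
        except Exception:
            logger.exception("Error closing RabbitMQ connection")
        self.thread.join(timeout=join_timeout)


# In main.py this could be wired to the app lifecycle, e.g.:
#
# @app.on_event("shutdown")
# def shutdown() -> None:
#     consumer_handle.close()
```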
```python
# Set QoS to process one message at a time
self.channel.basic_qos(prefetch_count=1)
```
The QoS setting prefetch_count=1 will process messages sequentially one at a time. While this ensures message ordering and prevents overwhelming the system, it may be inefficient for the PDF processing workload. Consider whether parallel processing of multiple PDFs would be beneficial, and if so, increase the prefetch_count or run multiple consumer instances.
Suggested change:
```diff
- # Set QoS to process one message at a time
- self.channel.basic_qos(prefetch_count=1)
+ # Set QoS prefetch count (default to processing one message at a time)
+ prefetch_count = getattr(settings, "rabbitmq_prefetch_count", 1)
+ logger.info(f"Setting RabbitMQ QoS prefetch_count={prefetch_count}")
+ self.channel.basic_qos(prefetch_count=prefetch_count)
```
```python
def message_callback(ch, method, properties, body):
    """
    Callback function to process RabbitMQ messages.

    Args:
        ch: Channel
        method: Method
        properties: Properties
        body: Message body (bytes)
```
The message callback function has generic parameter names (ch, method, properties, body) that don't provide clear type information. While these are conventional in pika, consider adding type hints to improve code clarity and enable better IDE support. For example: ch: pika.channel.Channel, method: pika.spec.Basic.Deliver, properties: pika.spec.BasicProperties, body: bytes.
```python
@property
def rabbitmq_url(self) -> str:
    """Returns the RabbitMQ connection URL with URL-encoded credentials."""
    from urllib.parse import quote_plus
```
The rabbitmq_url property imports urllib.parse.quote_plus inside the method. This import should be moved to the module level for better performance, as the import will be executed every time the property is accessed. Consider moving it to the top of the file with other imports.
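The refactor could look roughly like this. The `Settings` class here is a stripped-down stand-in for the PR's config object (the real one presumably uses pydantic); only the field names mirror the PR.

```python
from urllib.parse import quote_plus  # module-level: imported once at load time


class Settings:
    """Stand-in for the PR's settings object, with RabbitMQ fields only."""
    rabbitmq_user: str = "rag"
    rabbitmq_password: str = "p@ss:word"
    rabbitmq_host: str = "localhost"
    rabbitmq_port: int = 5672

    @property
    def rabbitmq_url(self) -> str:
        """RabbitMQ connection URL with URL-encoded credentials."""
        user = quote_plus(self.rabbitmq_user)
        password = quote_plus(self.rabbitmq_password)
        return f"amqp://{user}:{password}@{self.rabbitmq_host}:{self.rabbitmq_port}/"
```

The encoding matters because characters like `@` and `:` in a password would otherwise break AMQP URL parsing.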
```python
document_id = process_pdf_pipeline(pdf_path)
logger.info(f"PDF processed successfully: {pdf_path} -> Document ID: {document_id}")

# Acknowledge the message
ch.basic_ack(delivery_tag=method.delivery_tag)
logger.info(f"Message acknowledged for: {pdf_path}")
```
The callback function acknowledges messages even when the PDF processing pipeline only returns a mock document_id (when NotImplementedError is caught). This means messages will be permanently removed from the queue even though the processing is not fully complete. Consider implementing a dead-letter queue or requeuing strategy for partial failures until the pipeline is fully implemented.
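A dead-letter setup mostly comes down to queue arguments. The helper and the queue/exchange names below are placeholders, not from the PR; the commented pika calls show roughly how the main queue would be declared so that rejected messages land in a dead-letter queue.

```python
def dead_letter_arguments(exchange: str, routing_key: str) -> dict:
    """Queue arguments that route rejected/expired messages to a dead-letter exchange."""
    return {
        "x-dead-letter-exchange": exchange,
        "x-dead-letter-routing-key": routing_key,
    }


# With pika, the declarations would look roughly like this
# (names here are illustrative placeholders):
#
# channel.exchange_declare(exchange="pdf-dlx", exchange_type="direct", durable=True)
# channel.queue_declare(queue="pdf-events.dlq", durable=True)
# channel.queue_bind(queue="pdf-events.dlq", exchange="pdf-dlx", routing_key="pdf-events")
# channel.queue_declare(
#     queue="pdf-events",
#     durable=True,
#     arguments=dead_letter_arguments("pdf-dlx", "pdf-events"),
# )
```

With this in place, the existing `basic_nack(..., requeue=False)` would move the message to `pdf-events.dlq` instead of discarding it.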
```python
except Exception as e:
    logger.error(f"Error processing message: {e}", exc_info=True)
    # NACK without requeue to avoid infinite loops
    # In production, consider implementing a dead-letter queue
    ch.basic_nack(delivery_tag=method.delivery_tag, requeue=False)
```
The consumer lacks retry logic or circuit breaker patterns. If the process_pdf_pipeline encounters transient failures (e.g., temporary database unavailability), the message will be discarded (NACK without requeue). Consider implementing exponential backoff retries for transient errors before giving up, or using a dead-letter exchange for failed messages that can be retried later.
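An in-process retry for transient failures could be sketched as follows. `TransientError` and `process_with_retry` are hypothetical; the pipeline would need to raise (or be wrapped to raise) a distinguishable exception for retryable failures.

```python
import time


class TransientError(Exception):
    """Marker for failures worth retrying (e.g. temporary database unavailability)."""


def process_with_retry(process, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Call process(); retry TransientError with exponential backoff.

    Re-raises after max_retries so the caller can still NACK or
    dead-letter the message.
    """
    for attempt in range(max_retries + 1):
        try:
            return process()
        except TransientError:
            if attempt == max_retries:
                raise
            sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```

The callback would call `process_with_retry(lambda: process_pdf_pipeline(pdf_path))` and only NACK once the retries are exhausted.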
```diff
@@ -0,0 +1,99 @@
+import json
```
Import of 'json' is not used.
Suggested change:
```diff
- import json
```
…' and enhance endpoint validation in MinIO client setup.
No description provided.