Skip to content

Separated guard and fallback logic into initial and final.#32

Merged
JPAmorin merged 4 commits intomainfrom
fix/GuardNodeStructure
Dec 15, 2025
Merged

Separated guard and fallback logic into initial and final.#32
JPAmorin merged 4 commits intomainfrom
fix/GuardNodeStructure

Conversation

@JPAmorin
Copy link
Copy Markdown
Collaborator

Separated guard and fallback logic into initial and final reducing coupling.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Dec 14, 2025

📝 Walkthrough

Summary by CodeRabbit

  • Refactor
    • Reworked safety pipeline into two explicit guard stages with clearer forward flow and distinct error/continue paths.
  • New Features
    • Added a final PII-detection guard to flag sensitive data in outputs and surface clear error notices.
    • Deterministic fallback nodes now stop processing and record standardized error messages for malicious or risky content.
  • Chores
    • Added configurable list of PII entity types to tune detection.

✏️ Tip: You can customize this high-level summary in your review settings.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main architectural change: separating guard and fallback logic into initial and final phases, which aligns with the file-level changes.
Description check ✅ Passed The description is directly related to the changeset, explaining the purpose of separating guard/fallback logic into initial/final phases to reduce coupling.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/GuardNodeStructure

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 28a971f and 6cbb41a.

📒 Files selected for processing (3)
  • RAGManager/app/agents/graph.py (4 hunks)
  • RAGManager/app/agents/nodes/__init__.py (1 hunks)
  • RAGManager/app/agents/nodes/guard_inicial.py (5 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
RAGManager/app/agents/nodes/__init__.py (6)
RAGManager/app/agents/nodes/fallback_final.py (1)
  • fallback_final (10-30)
RAGManager/app/agents/nodes/fallback_inicial.py (1)
  • fallback_inicial (10-30)
RAGManager/app/agents/nodes/guard_final.py (1)
  • guard_final (23-74)
RAGManager/app/agents/nodes/guard_inicial.py (1)
  • guard_inicial (24-77)
RAGManager/app/agents/nodes/parafraseo.py (1)
  • parafraseo (10-43)
RAGManager/app/agents/nodes/retriever.py (1)
  • retriever (6-32)
RAGManager/app/agents/nodes/guard_inicial.py (1)
RAGManager/app/agents/state.py (1)
  • AgentState (6-50)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@JPAmorin JPAmorin marked this pull request as draft December 14, 2025 21:27
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
RAGManager/app/agents/graph.py (1)

5-19: Import symbols do not exist in codebase—code will fail with ImportError at runtime.

The imports on lines 5–19 reference symbols that don't exist:

  • guard_inicial and guard_final are imported, but only guard is defined in app.agents.nodes.guard
  • route_after_guard_inicial and route_after_guard_final are imported, but only route_after_guard and route_after_fallback_final are defined in app.agents.routing

Either create the missing node and router functions or correct the imports to use the existing symbols (guard, route_after_guard, route_after_fallback_final).

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6a5b4ad and dd8ee5f.

📒 Files selected for processing (1)
  • RAGManager/app/agents/graph.py (4 hunks)
🔇 Additional comments (1)
RAGManager/app/agents/graph.py (1)

27-37: and

Comment thread RAGManager/app/agents/graph.py
Comment thread RAGManager/app/agents/graph.py
@JPAmorin JPAmorin self-assigned this Dec 14, 2025
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dd8ee5f and 28a971f.

📒 Files selected for processing (8)
  • RAGManager/app/agents/nodes/__init__.py (1 hunks)
  • RAGManager/app/agents/nodes/fallback_final.py (1 hunks)
  • RAGManager/app/agents/nodes/fallback_inicial.py (2 hunks)
  • RAGManager/app/agents/nodes/guard_final.py (1 hunks)
  • RAGManager/app/agents/nodes/guard_inicial.py (5 hunks)
  • RAGManager/app/agents/routing.py (2 hunks)
  • RAGManager/app/core/config.py (1 hunks)
  • RAGManager/pyproject.toml (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
RAGManager/app/agents/routing.py (1)
RAGManager/app/agents/state.py (1)
  • AgentState (6-49)
RAGManager/app/agents/nodes/guard_final.py (1)
RAGManager/app/agents/state.py (1)
  • AgentState (6-49)
RAGManager/app/agents/nodes/__init__.py (4)
RAGManager/app/agents/nodes/guard_final.py (1)
  • guard_final (23-74)
RAGManager/app/agents/nodes/guard_inicial.py (1)
  • guard_inicial (24-75)
RAGManager/app/agents/nodes/parafraseo.py (1)
  • parafraseo (6-32)
RAGManager/app/agents/nodes/retriever.py (1)
  • retriever (6-32)
RAGManager/app/agents/nodes/guard_inicial.py (1)
RAGManager/app/agents/state.py (1)
  • AgentState (6-49)
RAGManager/app/agents/nodes/fallback_final.py (1)
RAGManager/app/agents/state.py (1)
  • AgentState (6-49)
🔇 Additional comments (8)
RAGManager/app/agents/nodes/fallback_inicial.py (1)

1-1: Good security practice with logging.

The decision to explicitly not log the malicious prompt content is appropriate and helps prevent potential log poisoning, sensitive data exposure, and security audit issues.

Also applies to: 28-28

RAGManager/app/agents/nodes/fallback_final.py (1)

1-30: LGTM! Clean simplification of fallback behavior.

The refactored fallback_final node correctly implements the new design where risky content detection stops processing. The security-conscious logging (not logging response content) and clear error message align well with the separation of concerns in the guard/fallback flow.

RAGManager/app/agents/nodes/__init__.py (1)

8-9: LGTM! Public API correctly updated.

The module exports are properly updated to replace the single guard with guard_inicial and guard_final, aligning with the architectural changes.

RAGManager/app/agents/nodes/guard_inicial.py (1)

1-75: LGTM! Consistent renaming with preserved logic.

The renaming from guard to guard_inicial is applied consistently throughout the file. The jailbreak detection logic, error handling, and security-conscious logging are all preserved correctly.

RAGManager/app/agents/routing.py (1)

6-37: LGTM! Routing functions correctly renamed.

The routing functions are consistently renamed from route_after_guard to route_after_guard_inicial and from route_after_fallback_final to route_after_guard_final. The logic and return values are preserved correctly.

RAGManager/app/agents/nodes/guard_final.py (2)

38-45: LGTM! Empty response handling is correct.

The guard correctly treats empty responses as safe, which is appropriate since there's no content to leak PII.


15-20: No changes needed. The DetectPII initialization is correctly configured with the appropriate parameters (pii_entities and on_fail). DetectPII does not support threshold or device parameters—these are specific to other validators like DetectJailbreak. The current implementation aligns with the guardrails-ai DetectPII 0.6.2 API specification.

RAGManager/pyproject.toml (1)

20-22: Dependency versions are valid and free from known vulnerabilities.

All three specified versions exist on PyPI and contain no known CVEs:

  • guardrails-ai 0.6.2: Free from CVE-2024-6961 and CVE-2024-45858 (both fixed in earlier releases).
  • presidio-analyzer 2.2.360: No known vulnerabilities.
  • presidio-anonymizer 2.2.360: No known vulnerabilities.

Note: guardrails-ai 0.7.1 is available as a newer patch if you wish to consider upgrading for potential improvements.

Comment on lines +10 to +30
def fallback_inicial(state: AgentState) -> AgentState:
"""
Fallback Inicial node - Stops processing when malicious content is detected.

This node:
1. Sets error message indicating that the user's intentions break the chatbot's rules
2. Stops the flow by routing to END

Args:
state: Agent state containing the prompt flagged as malicious

Returns:
Updated state with error_message set, ready to route to END
"""
updated_state = state.copy()

# Set error message for malicious content
updated_state["error_message"] = "The user's intentions break the chatbot's rules."
logger.warning("Malicious content detected. Stopping processing. Prompt content not logged for security.")

return updated_state
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Consider enhancing docstring with integration context.

The function implementation is clean and focused. To improve maintainability, consider adding a brief note in the docstring about:

  • Which node (guard_inicial) calls this fallback
  • Which routing function (route_after_guard_inicial) handles the error_message

This would help future maintainers understand the flow without searching the codebase.

Example enhancement:

 def fallback_inicial(state: AgentState) -> AgentState:
     """
     Fallback Inicial node - Stops processing when malicious content is detected.
+    
+    Called by: guard_inicial when malicious content is detected
+    Routing: route_after_guard_inicial routes to END when error_message is set

     This node:
     1. Sets error message indicating that the user's intentions break the chatbot's rules
     2. Stops the flow by routing to END
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def fallback_inicial(state: AgentState) -> AgentState:
"""
Fallback Inicial node - Stops processing when malicious content is detected.
This node:
1. Sets error message indicating that the user's intentions break the chatbot's rules
2. Stops the flow by routing to END
Args:
state: Agent state containing the prompt flagged as malicious
Returns:
Updated state with error_message set, ready to route to END
"""
updated_state = state.copy()
# Set error message for malicious content
updated_state["error_message"] = "The user's intentions break the chatbot's rules."
logger.warning("Malicious content detected. Stopping processing. Prompt content not logged for security.")
return updated_state
def fallback_inicial(state: AgentState) -> AgentState:
"""
Fallback Inicial node - Stops processing when malicious content is detected.
Called by: guard_inicial when malicious content is detected
Routing: route_after_guard_inicial routes to END when error_message is set
This node:
1. Sets error message indicating that the user's intentions break the chatbot's rules
2. Stops the flow by routing to END
Args:
state: Agent state containing the prompt flagged as malicious
Returns:
Updated state with error_message set, ready to route to END
"""
updated_state = state.copy()
# Set error message for malicious content
updated_state["error_message"] = "The user's intentions break the chatbot's rules."
logger.warning("Malicious content detected. Stopping processing. Prompt content not logged for security.")
return updated_state
🤖 Prompt for AI Agents
In RAGManager/app/agents/nodes/fallback_inicial.py around lines 10 to 30, the
docstring is missing integration context; update it to include a brief note
stating that this fallback is invoked by the guard_inicial node and that routing
of the updated state (with error_message) is handled by
route_after_guard_inicial, so maintainers can understand the control flow
without searching the codebase; keep the existing implementation unchanged and
add a short, one- or two-sentence mention of the caller and routing function in
the docstring.

Comment on lines +26 to +27
# Set error message for malicious content
updated_state["error_message"] = "The user's intentions break the chatbot's rules."
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify routing mechanism and error message integration

# First, examine the file in question
echo "=== Content of fallback_inicial.py ==="
cat -n RAGManager/app/agents/nodes/fallback_inicial.py

Repository: ucudal/reto-xmas-2025-goland-ia-backend

Length of output: 1280


🏁 Script executed:

#!/bin/bash
# Search for routing logic that checks error_message
echo "=== Searching for routing logic that checks error_message ==="
rg -n -C3 'error_message' --type=py | head -100

Repository: ucudal/reto-xmas-2025-goland-ia-backend

Length of output: 7845


🏁 Script executed:

#!/bin/bash
# Search for route_after_guard_inicial implementation
echo "=== Searching for route_after_guard_inicial ==="
rg -n 'route_after_guard_inicial' --type=py -A 20 | head -150

Repository: ucudal/reto-xmas-2025-goland-ia-backend

Length of output: 4308


🏁 Script executed:

#!/bin/bash
# Search for guard_inicial references and how fallback_inicial integrates
echo "=== Searching for guard_inicial integration ==="
rg -n 'guard_inicial|fallback_inicial' --type=py -B 2 -A 2 | head -150

Repository: ucudal/reto-xmas-2025-goland-ia-backend

Length of output: 6408


🏁 Script executed:

#!/bin/bash
# Find the graph/workflow definition to understand routing
echo "=== Searching for graph or workflow setup ==="
fd -e py | xargs rg -l 'add_node.*fallback_inicial|fallback_inicial.*add_node' | head -5

Repository: ucudal/reto-xmas-2025-goland-ia-backend

Length of output: 159


The error message is overly generic and overwrites more specific violation context.

The routing mechanism is explicit and correct—route_after_guard_inicial checks the is_malicious flag (set by guard_inicial) to decide whether to route to fallback_inicial, which then sets the error message before routing to END.

However, fallback_inicial sets a generic message that overwrites the more specific message from guard_inicial ("Jailbreak attempt detected. Your request contains content that violates security policies."). Consider whether fallback_inicial should:

  • Preserve the error message set by guard_inicial instead of replacing it
  • Pass through the original message or add context about the detection type
🤖 Prompt for AI Agents
In RAGManager/app/agents/nodes/fallback_inicial.py around lines 26-27,
fallback_inicial currently unconditionally overwrites any existing error_message
with a generic string; change it to preserve the more specific message from
guard_inicial by only setting the generic message when no error_message exists
(or append minimal context instead of replacing), e.g., check for an existing
updated_state["error_message"] and leave it unchanged if present, otherwise set
the fallback message (or concatenate a short indicator like " (fallback
detected)" to the existing message).

Comment on lines +66 to +72
except Exception as e:
# If validation fails due to error, log it but don't block the request
# This is a safety measure - if Guardrails fails, we allow the request
# but log the error for monitoring
logger.error(f"Error during PII detection: {e}")
updated_state["is_risky"] = False
updated_state["error_message"] = None
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Reconsider fail-safe behavior for PII detection.

The exception handler allows requests to proceed when validation fails due to an error (sets is_risky = False). While this prevents blocking legitimate requests when Guardrails fails, it creates a security risk: if PII detection fails silently, sensitive information could be leaked to users.

Consider:

  1. Failing closed (blocking the request) when PII detection errors occur, especially in production
  2. Adding metrics/alerting to detect when PII validation is failing
  3. Using a circuit breaker pattern to degrade gracefully during persistent failures

This contrasts with guard_inicial where failing open (allowing jailbreak attempts through) is more appropriate since false positives are more problematic than false negatives for user input validation.

     except Exception as e:
         # If validation fails due to error, log it but don't block the request
-        # This is a safety measure - if Guardrails fails, we allow the request
+        # SECURITY: Consider failing closed for PII detection in production
         # but log the error for monitoring
         logger.error(f"Error during PII detection: {e}")
-        updated_state["is_risky"] = False
-        updated_state["error_message"] = None
+        # For production: fail closed to prevent PII leakage
+        updated_state["is_risky"] = True
+        updated_state["error_message"] = (
+            "Unable to validate response safety. Please try again later."
+        )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
except Exception as e:
# If validation fails due to error, log it but don't block the request
# This is a safety measure - if Guardrails fails, we allow the request
# but log the error for monitoring
logger.error(f"Error during PII detection: {e}")
updated_state["is_risky"] = False
updated_state["error_message"] = None
except Exception as e:
# If validation fails due to error, log it but don't block the request
# SECURITY: Consider failing closed for PII detection in production
# but log the error for monitoring
logger.error(f"Error during PII detection: {e}")
# For production: fail closed to prevent PII leakage
updated_state["is_risky"] = True
updated_state["error_message"] = (
"Unable to validate response safety. Please try again later."
)

Comment on lines +50 to +62
guardrails_pii_entities: list[str] = Field(
default=[
"EMAIL_ADDRESS",
"PHONE_NUMBER",
"CREDIT_CARD",
"SSN",
"US_PASSPORT",
"US_DRIVER_LICENSE",
"IBAN_CODE",
"IP_ADDRESS",
],
description="List of PII entity types to detect using DetectPII validator.",
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

guardrails-ai DetectPII validator version 0.6.2 valid entity types

💡 Result:

Here are the valid PII entity types supported by the Guardrails DetectPII/GuardrailsPII validator (as exposed in the validator hub / repo documentation). [1][2]

  • CREDIT_CARD
  • CRYPTO
  • DATE_TIME
  • EMAIL_ADDRESS
  • IBAN_CODE
  • IP_ADDRESS
  • NRP
  • LOCATION
  • PERSON
  • PHONE_NUMBER
  • MEDICAL_LICENSE
  • URL
  • US_BANK_NUMBER
  • US_DRIVER_LICENSE
  • US_ITIN
  • US_PASSPORT
  • US_SSN
  • UK_NHS
  • ES_NIF
  • ES_NIE
  • IT_FISCAL_CODE
  • IT_DRIVER_LICENSE
  • IT_VAT_CODE
  • IT_PASSPORT
  • IT_IDENTITY_CARD
  • PL_PESEL
  • SG_NRIC_FIN
  • SG_UEN
  • AU_ABN
  • AU_ACN
  • AU_TFN
  • AU_MEDICARE
  • IN_PAN
  • IN_AADHAAR
  • IN_VEHICLE_REGISTRATION
  • IN_VOTER
  • IN_PASSPORT
  • FI_PERSONAL_IDENTITY_CODE

Sources: Guardrails PII validator docs and DetectPII README.
[1] Guardrails PII validator (guardrails-ai/guardrails_pii).
[2] Guardrails DetectPII (guardrails-ai/detect_pii).


🌐 Web query:

presidio-analyzer 2.2.360 supported PII entity types

💡 Result:

Here are the PII entity types that Presidio Analyzer supports (as listed in the Presidio docs). Global entities, then country/region-specific entities:

Global:
CREDIT_CARD, CRYPTO, DATE_TIME, EMAIL_ADDRESS, IBAN_CODE, IP_ADDRESS, NRP, LOCATION, PERSON, PHONE_NUMBER, MEDICAL_LICENSE, URL. [1]

USA:
US_BANK_NUMBER, US_DRIVER_LICENSE, US_ITIN, US_PASSPORT, US_SSN. [1]

UK:
UK_NHS, UK_NINO. [1]

Spain:
ES_NIF, ES_NIE. [1]

Italy:
IT_FISCAL_CODE, IT_DRIVER_LICENSE, IT_VAT_CODE, IT_PASSPORT, IT_IDENTITY_CARD. [1]

Poland:
PL_PESEL. [1]

Singapore:
SG_NRIC_FIN, SG_UEN. [1]

Australia:
AU_ABN, AU_ACN, AU_TFN, AU_MEDICARE. [1]

India:
IN_PAN, IN_AADHAAR, IN_VEHICLE_REGISTRATION, IN_VOTER, IN_PASSPORT, IN_GSTIN. [1]

Finland:
FI_PERSONAL_IDENTITY_CODE. [1]

Korea:
KR_RRN. [1]

Thai:
TH_TNIN. [1]

For how to retrieve the runtime list programmatically (AnalyzerEngine.get_supported_entities), see the Analyzer API reference. [2]

Sources:
[1] Presidio — Supported entities documentation. [2] Presidio Analyzer Python API (get_supported_entities).


Change SSN to US_SSN in the default PII entity types list.

The entity type SSN is not recognized by guardrails-ai DetectPII validator and presidio-analyzer. The correct entity type is US_SSN. All other current entity types in the list (EMAIL_ADDRESS, PHONE_NUMBER, CREDIT_CARD, US_PASSPORT, US_DRIVER_LICENSE, IBAN_CODE, IP_ADDRESS) are valid.

Additionally, consider including PERSON, LOCATION, DATE_TIME, US_BANK_NUMBER, MEDICAL_LICENSE, and CRYPTO in the default list, as these are all valid and commonly used PII entity types supported by presidio-analyzer.

🤖 Prompt for AI Agents
In RAGManager/app/core/config.py around lines 50 to 62, the default PII entity
list contains "SSN" which is not recognized by the DetectPII validator; replace
"SSN" with "US_SSN" and update the default list to optionally include additional
commonly used presidio-supported entities such as "PERSON", "LOCATION",
"DATE_TIME", "US_BANK_NUMBER", "MEDICAL_LICENSE", and "CRYPTO" if you want
broader detection; ensure the list items remain string literals and the Field
description remains unchanged.

@JPAmorin
Copy link
Copy Markdown
Collaborator Author

@JuanFKurucz I've divided guard and fallback logic into two due to different things being accounted for when processing. First check is for malicious intent, therefore guard_inicial checks for Jailbreak attempts (functionality can be modified or expanded if fit) and guard_final checks for PII information (again, functionality can be modified or expanded if necessary). When fallback nodes are called I used static responses instead of LLM calls, with LLM call's cost in mind. If we want the LLM calls I can revert this change.

@JPAmorin JPAmorin marked this pull request as ready for review December 15, 2025 22:23
@github-actions
Copy link
Copy Markdown

🔍 PR Validation Results

Check Status
Build ✅ success
Trivy Check Security tab

View detailed results

@JPAmorin JPAmorin requested review from brunoalbin23 and removed request for Germanadjemian and JuanFKurucz December 15, 2025 22:28
@JPAmorin JPAmorin merged commit 1dbcbe5 into main Dec 15, 2025
4 checks passed
@JPAmorin JPAmorin deleted the fix/GuardNodeStructure branch December 15, 2025 22:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants