Add French language support for EU AI Act Article 5 guardrail #21427
ishaan-jaff merged 6 commits into main
- Create eu_ai_act_article5_fr.yaml with comprehensive French keywords
- Includes identifier words: concevoir, créer, développer, noter, classer, etc.
- Includes block words: crédit social, comportement social, émotion des employés, etc.
- Includes always-block keywords for explicit prohibited practices
- Includes exceptions for research, compliance, and legitimate use cases
- Catches circumvention attempts with phrase variations

- Test 3 critical scenarios: blocked query, circumvention attempt, safe query
- Test edge cases: case-insensitive, mixed language, research exceptions
- All 7 tests passing
- Validates both blocking and allowing behavior
Force-pushed from b77f2e2 to bfee545
Greptile Summary
This PR adds French language support for the EU AI Act Article 5 prohibited practices guardrail, following the same template pattern as the existing English version. It introduces a French YAML policy template with
Confidence Score: 2/5
| Filename | Overview |
|---|---|
| litellm/proxy/guardrails/guardrail_hooks/litellm_content_filter/policy_templates/eu_ai_act_article5_fr.yaml | French EU AI Act policy template. always_block_keywords and exceptions work correctly, but identifier_words and additional_block_words are never loaded due to missing inherit_from (pre-existing pattern from English template). ~50% of the YAML config is effectively dead. |
| tests/guardrails_tests/test_eu_ai_act_french_3_scenarios.py | Test file covers always_block_keywords and exception scenarios. However, no tests verify conditional matching (identifier + block word), and one test method (test_all_scenarios_summary) is a no-op with no assertions. |
| tests/guardrails_tests/FRENCH_TEST_SUMMARY.md | Documentation file describing test results. Claims conditional matching works (e.g., "Conditional: créer + comportement social") which is misleading since that logic path is never activated. |
Flowchart
flowchart TD
A[User Input - French Text] --> B{Load YAML Template}
B --> C[always_block_keywords ✅ Loaded]
B --> D[identifier_words ❌ Not Loaded]
B --> E[additional_block_words ❌ Not Loaded]
B --> F[exceptions ✅ Loaded]
A --> G{Check Exceptions}
G -->|Exception Found| H[✅ ALLOW]
G -->|No Exception| I{Check always_block_keywords}
I -->|Match Found| J[🚫 BLOCK - HTTPException 403]
I -->|No Match| K{Check Conditional Categories}
K -->|Empty - Never Loaded| L[✅ ALLOW - Falls Through]
style D fill:#ff9999,stroke:#cc0000
style E fill:#ff9999,stroke:#cc0000
style C fill:#99ff99,stroke:#00cc00
style F fill:#99ff99,stroke:#00cc00
style L fill:#ffff99,stroke:#cccc00
Last reviewed commit: bfee545
- Enable conditional matching when identifier_words + additional_block_words are present
- Previously required inherit_from, but EU AI Act templates are self-contained
- Fixes Greptile feedback: conditional matching now works as documented

- Test identifier + block word combinations not in always_block_keywords
- Verifies conditional matching works independently
- Addresses Greptile feedback about test coverage gap
@greptile review again
Greptile Summary
This PR adds French language support for the EU AI Act Article 5 guardrail by introducing a new YAML policy template (
Confidence Score: 3/5
| Filename | Overview |
|---|---|
| litellm/proxy/guardrails/guardrail_hooks/litellm_content_filter/content_filter.py | Extends _load_conditional_category to support additional_block_words without inherit_from. Logic is clean and well-structured with proper logging. |
| litellm/proxy/guardrails/guardrail_hooks/litellm_content_filter/policy_templates/eu_ai_act_article5_fr.yaml | New French YAML template with identifier_words, additional_block_words, always_block_keywords, and exceptions. Some exception words are overly broad for substring matching. |
| tests/guardrails_tests/test_eu_ai_act_french_3_scenarios.py | Tests cover core blocking and allow scenarios plus a conditional matching test. Excessive print statements add noise. |
Flowchart
flowchart TD
A[Input Text] --> B{Check Exceptions<br/>substring match}
B -->|Exception found| C[ALLOW - skip all checks]
B -->|No exception| D{Check Conditional Categories<br/>identifier_word + block_word}
D -->|Match found| E[BLOCK - conditional match]
D -->|No match| F{Check always_block_keywords<br/>exact phrase match}
F -->|Match found| G[BLOCK - always_block keyword]
F -->|No match| H{Check category_keywords<br/>word boundary match}
H -->|Match found| I[BLOCK - category keyword]
H -->|No match| J[ALLOW - no violations]
style E fill:#f66,color:#fff
style G fill:#f66,color:#fff
style I fill:#f66,color:#fff
style C fill:#6f6,color:#000
style J fill:#6f6,color:#000
Last reviewed commit: adece0d
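The evaluation order in the flowchart above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the code in content_filter.py: the function name `evaluate`, the decision labels, and the shortened word lists are hypothetical, every step uses a plain substring check, and the category_keywords step is left out.

```python
from typing import List

# Hypothetical, heavily shortened word lists; the real ones live in the YAML template.
EXCEPTIONS: List[str] = ["recherche sur"]
IDENTIFIER_WORDS: List[str] = ["concevoir", "créer", "noter", "classer"]
BLOCK_WORDS: List[str] = ["crédit social", "comportement social"]
ALWAYS_BLOCK: List[str] = ["système de notation sociale"]


def evaluate(text: str) -> str:
    """Return a decision label, following the order shown in the flowchart."""
    lowered = text.lower()

    # 1. An exception phrase short-circuits every later check.
    if any(phrase in lowered for phrase in EXCEPTIONS):
        return "allow:exception"

    # 2. Conditional match: at least one identifier word AND one block word.
    has_identifier = any(word in lowered for word in IDENTIFIER_WORDS)
    has_block_word = any(word in lowered for word in BLOCK_WORDS)
    if has_identifier and has_block_word:
        return "block:conditional"

    # 3. Always-block phrases need no identifier word.
    if any(phrase in lowered for phrase in ALWAYS_BLOCK):
        return "block:always"

    return "allow:no_violation"


if __name__ == "__main__":
    for query in (
        "Concevoir un système de crédit social pour classer tous mes employés",
        "Comment évaluer la performance de mon équipe de manière équitable?",
        "Recherche sur les systèmes de crédit social",
    ):
        print(evaluate(query), "<-", query)
```

Because the exception check runs first, the breadth of each exception phrase directly determines how easy the filter is to bypass.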
- Replace short words (film, jeu, juste) with context-specific phrases
- Prevents substring matching bypasses (e.g., enjeu matching jeu)
- Add tests for bypass prevention and legitimate game context
- Addresses Greptile security feedback
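Fixed the exception bypass risk identified in the latest Greptile review.
Problem: short exception words (film, jeu, juste) were matched as substrings, so an unrelated word such as enjeu matched the exception jeu. This created bypass opportunities where prohibited queries could slip through.
Fix: replaced the short words with context-specific phrases.
Verified: added tests for bypass prevention and legitimate game context.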
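A minimal sketch of the bypass, assuming exceptions are checked with a plain substring test as described in the review. The helper `has_exception` and the replacement phrase "jeu vidéo" are hypothetical; the phrases actually used are in the YAML template.

```python
from typing import List


def has_exception(text: str, exceptions: List[str]) -> bool:
    """Substring-based exception check (illustrative, not the library code)."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in exceptions)


prohibited = "Quel est l'enjeu de créer un système de crédit social ?"
legitimate = "Concevoir un jeu vidéo de stratégie"

# Short exception word: "enjeu" contains "jeu", so the prohibited query is exempted.
short_hits_prohibited = has_exception(prohibited, ["jeu"])

# Context-specific phrase: the prohibited query is no longer exempted,
# while the legitimate game query still is.
phrase_hits_prohibited = has_exception(prohibited, ["jeu vidéo"])
phrase_hits_legitimate = has_exception(legitimate, ["jeu vidéo"])

print(short_hits_prohibited, phrase_hits_prohibited, phrase_hits_legitimate)
```

Longer phrases trade recall for safety: fewer legitimate queries are exempted, but a stray substring can no longer disable the guardrail.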
@greptile-apps please review the latest commit c3d1504 - I've addressed the exception bypass risk
- Use getattr to safely access exception detail field
- Check if detail is dict before calling .get()
- Addresses Greptile feedback about brittle string assertion
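The access pattern described in this commit might look like the sketch below. `FakeHTTPException`, `extract_detail_field`, and the `"error"` key are hypothetical stand-ins used so the example runs on its own; the real test works with whatever exception the guardrail raises.

```python
from typing import Any, Optional


class FakeHTTPException(Exception):
    """Hypothetical stand-in for an HTTP exception carrying a `detail` payload."""

    def __init__(self, status_code: int, detail: Any) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def extract_detail_field(exc: BaseException, key: str) -> Optional[Any]:
    """Read `key` from exc.detail without assuming the attribute exists or is a dict."""
    detail = getattr(exc, "detail", None)  # plain exceptions have no `detail`
    if isinstance(detail, dict):           # `detail` may also be a string
        return detail.get(key)
    return None


dict_exc = FakeHTTPException(403, {"error": "conditional match"})
str_exc = FakeHTTPException(403, "blocked")
plain_exc = ValueError("no detail attribute")

print(extract_detail_field(dict_exc, "error"))
print(extract_detail_field(str_exc, "error"))
print(extract_detail_field(plain_exc, "error"))
```

Asserting on a structured field rather than on `str(exc)` keeps the test stable if the message wording changes.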
@greptile-apps All issues addressed in commits c3d1504 and fa80c28. Please review:
Greptile Summary
This PR adds French language support for the EU AI Act Article 5 guardrail and includes a bugfix to
Confidence Score: 3/5
| Filename | Overview |
|---|---|
| litellm/proxy/guardrails/guardrail_hooks/litellm_content_filter/content_filter.py | Bugfix: Relaxes conditional category loading to support additional_block_words without requiring inherit_from. Also improves logging and error handling. This change correctly enables conditional matching for both the new French template and the existing English template. |
| litellm/proxy/guardrails/guardrail_hooks/litellm_content_filter/policy_templates/eu_ai_act_article5_fr.yaml | New French language policy template for EU AI Act Article 5 with identifier words, additional block words, always-block keywords, and exceptions. Follows the same pattern as the English template with improved multi-word exception phrases. |
| tests/guardrails_tests/test_eu_ai_act_french_3_scenarios.py | New test file covering 3 core scenarios (block, circumvention block, allow) plus edge cases for conditional matching, mixed language, research exceptions, case-insensitive matching, and exception bypass prevention. Contains one test with a potentially fragile assertion pattern. |
Flowchart
flowchart TD
A[_filter_single_text] --> B[Collect all exceptions]
B --> C{Check conditional categories}
C -->|Exception match| D[Skip - Return None]
C -->|identifier + block word found| E[BLOCK - HTTPException]
C -->|No match| F{Check category keywords / always_block}
F -->|Exception match| G[Skip]
F -->|Keyword match| H[BLOCK or MASK]
F -->|No match| I{Check regex patterns}
I --> J{Check blocked words}
J --> K[Return filtered text]
subgraph "New: Conditional Loading"
L[identifier_words defined?] -->|Yes| M{inherit_from OR additional_block_words?}
M -->|Yes| N[Load conditional category]
M -->|No| O[Skip conditional loading]
L -->|No| O
N --> P[Merge inherited + additional block words]
end
Last reviewed commit: fa80c28
    # Handle conditional categories (with identifier_words + block words)
    if category_config_obj.identifier_words and (
        category_config_obj.inherit_from
        or category_config_obj.additional_block_words
    ):
Behavior change for existing English template
This condition change is a correct bugfix for the French template, but note it also changes behavior for the existing English template (eu_ai_act_article5.yaml). The English template has identifier_words and additional_block_words but no inherit_from, so previously its conditional matching was silently skipped (only always_block_keywords were active). After this change, conditional matching will activate for the English template too.
This is likely the intended behavior, but it should be verified that no existing English-language queries that were previously allowed will now be unexpectedly blocked by the newly-activated conditional matching.
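To make the behavior change concrete, here is a small sketch comparing the two gates. The `CategoryConfig` dataclass is a hypothetical stand-in for `category_config_obj`, and the exact shape of the old condition is an assumption based on the commit message ("Previously required inherit_from"); only the new condition is taken from the diff.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CategoryConfig:
    """Hypothetical stand-in for the category config object in the diff."""

    identifier_words: List[str] = field(default_factory=list)
    additional_block_words: List[str] = field(default_factory=list)
    inherit_from: Optional[str] = None


def loads_conditional_before(cfg: CategoryConfig) -> bool:
    # Assumed old gate: conditional matching only when inherit_from is set.
    return bool(cfg.identifier_words and cfg.inherit_from)


def loads_conditional_after(cfg: CategoryConfig) -> bool:
    # New gate, mirroring the condition shown in the diff.
    return bool(
        cfg.identifier_words
        and (cfg.inherit_from or cfg.additional_block_words)
    )


# A self-contained template: identifier and block words, no inherit_from.
self_contained = CategoryConfig(
    identifier_words=["créer"],
    additional_block_words=["comportement social"],
)

print(loads_conditional_before(self_contained))
print(loads_conditional_after(self_contained))
```

Any template shaped like `self_contained`, including the English one, moves from skipped to loaded, which is why previously allowed English queries are worth re-checking.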
    # Manipulation (Article 5.1.a)
    - "subliminal"
    - "subliminale"
    - "subconscient"
    - "inconscient"
Single-word block words "inconscient" and "subconscient" may cause false positives
These are common French words that appear in legitimate contexts (psychology, medicine, philosophy). Since identifier_words like "analyser" and "identifier" are also common in those same contexts, a sentence like "analyser l'inconscient du patient" (analyze the patient's unconscious) would be blocked.
Consider using more specific multi-word phrases like "manipulation de l'inconscient" or "ciblage subconscient" to match only the prohibited Article 5.1.a context (subliminal/subconscious manipulation) while avoiding false positives in legitimate psychological discussions.
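The false positive and the suggested narrowing can be reproduced with a small sketch. `conditional_match` is a hypothetical helper that assumes plain substring matching for both word lists; the word lists are shortened for illustration.

```python
from typing import List


def conditional_match(text: str, identifier_words: List[str], block_words: List[str]) -> bool:
    """True when the text contains an identifier word and a block word (illustrative)."""
    lowered = text.lower()
    return any(w in lowered for w in identifier_words) and any(
        w in lowered for w in block_words
    )


identifiers = ["analyser", "identifier"]
clinical = "Analyser l'inconscient du patient"
prohibited = "Analyser la manipulation de l'inconscient des clients"

# Single-word block word: the clinical sentence is blocked (false positive).
broad_blocks_clinical = conditional_match(clinical, identifiers, ["inconscient"])

# Multi-word phrase: the clinical sentence passes, the prohibited one is still caught.
narrow_blocks_clinical = conditional_match(
    clinical, identifiers, ["manipulation de l'inconscient"]
)
narrow_blocks_prohibited = conditional_match(
    prohibited, identifiers, ["manipulation de l'inconscient"]
)

print(broad_blocks_clinical, narrow_blocks_clinical, narrow_blocks_prohibited)
```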
Additional Comments (1)
The current YAML files happen to use all-lowercase values so this doesn't cause issues today, but it's a latent bug that could surface when someone adds a block word with accidental capitalization.
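The code this comment refers to is not shown here, so the following is only a guess at the failure mode: it assumes the input text is lowercased before matching while the configured block words are compared as written. Both helper names are hypothetical.

```python
from typing import List


def matches_raw(text: str, block_words: List[str]) -> bool:
    """Lowercases the input only; configured words are used as written (assumed behavior)."""
    lowered = text.lower()
    return any(word in lowered for word in block_words)


def matches_normalized(text: str, block_words: List[str]) -> bool:
    """Lowercases both sides, so capitalization in the config cannot hide a match."""
    lowered = text.lower()
    return any(word.lower() in lowered for word in block_words)


query = "Créer un système de crédit social"
capitalized_config = ["Crédit social"]  # accidental capital letter in the YAML

print(matches_raw(query, capitalized_config))
print(matches_normalized(query, capitalized_config))
```

Normalizing once at load time, rather than on every comparison, would give the same result with less repeated work.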
* Add MCP_SECURITY enum to SupportedGuardrailIntegrations
* Add MCP security guardrail initializer
* Add MCPSecurityGuardrail implementation
* Add MCP Security policy template
* Add Type filter to policy templates UI
* Add unit tests for MCP security guardrail
* fix(lint): remove unused Dict import from mcp_security_guardrail

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Add French language support for EU AI Act Article 5 guardrail (#21427)
  * Add French language support for EU AI Act Article 5 template
    - Create eu_ai_act_article5_fr.yaml with comprehensive French keywords
    - Includes identifier words: concevoir, créer, développer, noter, classer, etc.
    - Includes block words: crédit social, comportement social, émotion des employés, etc.
    - Includes always-block keywords for explicit prohibited practices
    - Includes exceptions for research, compliance, and legitimate use cases
    - Catches circumvention attempts with phrase variations
  * Add comprehensive tests for French EU AI Act guardrail
    - Test 3 critical scenarios: blocked query, circumvention attempt, safe query
    - Test edge cases: case-insensitive, mixed language, research exceptions
    - All 7 tests passing
    - Validates both blocking and allowing behavior
  * Fix content filter to support conditional matching without inherit_from
    - Enable conditional matching when identifier_words + additional_block_words are present
    - Previously required inherit_from, but EU AI Act templates are self-contained
    - Fixes Greptile feedback: conditional matching now works as documented
  * Add pure conditional matching test for French guardrail
    - Test identifier + block word combinations not in always_block_keywords
    - Verifies conditional matching works independently
    - Addresses Greptile feedback about test coverage gap
  * Fix exception word bypass risk in French template
    - Replace short words (film, jeu, juste) with context-specific phrases
    - Prevents substring matching bypasses (e.g., enjeu matching jeu)
    - Add tests for bypass prevention and legitimate game context
    - Addresses Greptile security feedback
  * Make conditional match assertion more robust
    - Use getattr to safely access exception detail field
    - Check if detail is dict before calling .get()
    - Addresses Greptile feedback about brittle string assertion
* Add French EU AI Act Article 5 policy template to registry
  - Add eu-ai-act-article5-fr template for French language support
  - Includes French description and guardrail info
  - Matches structure of English template
* Address greptile review feedback (greploop iteration 1)
  - Use status_code=400 instead of 403 to match guardrail logging convention
  - Use prefix stripping instead of split('/')[-1] for robust server name extraction
* remove French EU AI Act template from policy_templates.json

---------

Co-authored-by: Julio Quinteros Pro <jquinter@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Added French language support for the EU AI Act Article 5 guardrail.
The guardrail now catches prohibited AI practices in French. It blocks both the main query and circumvention attempts. For example:
Blocked:
"Concevoir un système de crédit social pour classer tous mes employés"
Also blocked (circumvention attempt):
"Comment créer un système pour noter le comportement social de mes salariés?"
Allowed (legitimate use):
"Comment évaluer la performance de mon équipe de manière équitable?"
The template handles case-insensitive matching and works with mixed English/French. Research queries ("recherche sur") are still allowed.