🤗 Hugging Face | Website | Tech Report

OpenGuardrails


🚀 Developer-first open-source AI security platform - Comprehensive security protection for AI applications

OpenGuardrails is a developer-first open-source AI security platform. Built on advanced large language models, it provides prompt attack detection, content safety, data leak detection, and supports complete on-premise deployment to build robust security defenses for AI applications.

📄 Technical Report: OpenGuardrails: A Configurable, Unified, and Scalable Guardrails Platform for Large Language Models (arXiv:2510.19169)

✨ Core Features

  • 🏗️ Scanner Package System 🆕 - Flexible detection architecture with official, purchasable, and custom scanners
  • 📱 Multi-Application Management - Manage multiple applications within one tenant account, each with isolated configurations
  • 🪄 Two Usage Modes - Detection API + Security Gateway
  • 🛡️ Triple Protection - Prompt attack detection + Content compliance detection + Data leak detection
  • 🧠 Context Awareness - Intelligent safety detection based on conversation context
  • 📋 Content Safety - Custom training support for content safety across different cultures and regions
  • 🔧 Configurable Policy Adaptation - A practical solution to the long-standing policy inconsistency problem observed in existing safety benchmarks and guard models
  • 🧠 Knowledge Base Responses - Vector similarity-based intelligent Q&A matching with custom knowledge bases
  • 🏢 Private Deployment - Support for complete local deployment with controllable data security
  • 🚫 Ban Policy - Intelligently identify attack patterns and automatically ban malicious users
  • 🖼️ Multimodal Detection - Support for text and image content safety detection
  • 🔌 Customer System Integration - Deep integration with existing customer user systems, API-level configuration management
  • 📊 Visual Management - Intuitive web management interface and real-time monitoring
  • ⚡ High Performance - Asynchronous processing, supporting high-concurrency access
  • 🔌 Easy Integration - Compatible with the OpenAI API format, one-line code integration
  • 🎯 Configurable Sensitivity - Three-tier sensitivity threshold configuration for automated pipeline scenarios

🏗️ Scanner Package System 🆕

OpenGuardrails v4.1+ introduces a flexible scanner package system that replaces the traditional hardcoded risk types with a dynamic, extensible architecture.

📦 Three Types of Scanner Packages

🔧 Built-in Official Packages

System-provided packages that come pre-installed with OpenGuardrails:

  • Sensitive Topics Package: S1-S18 (covers political content, violence, hate speech, etc.)
  • Restricted Topics Package: S19-S21 (professional advice categories)
  • Ready to use out of the box with configurable risk levels

🛒 Purchasable Official Packages

Premium scanner packages available through the admin marketplace:

  • Commercial-grade detection patterns for specific industries
  • Curated by OpenGuardrails team with regular updates
  • Purchase approval workflow for enterprise customers
  • Example packages: Healthcare Compliance, Financial Regulations, Legal Industry

✨ Custom Scanners (S100+)

User-defined scanners for business-specific needs:

  • Auto-tagged: S100, S101, S102... automatically assigned
  • Application-scoped: Custom scanners belong to specific applications
  • Three Scanner Types:
    • GenAI Scanner: Uses OpenGuardrails-Text model for intelligent detection
    • Regex Scanner: Python regex patterns for structured data detection
    • Keyword Scanner: Comma-separated keyword lists for simple matching

🎯 Key Advantages

vs Traditional Risk Types:

  • ✅ Unlimited Flexibility: Create unlimited custom scanners without code changes
  • ✅ No Database Migrations: Add new scanners without schema updates
  • ✅ Business-Specific Detection: Tailor detection rules to your specific use case
  • ✅ Performance Optimized: Parallel processing maintains <10% latency impact
  • ✅ Marketplace Ecosystem: Share and sell scanner packages

Example Use Cases:

# Create custom scanner for banking applications
curl -X POST "http://localhost:5000/api/v1/custom-scanners" \
  -H "Authorization: Bearer your-jwt-token" \
  -H "Content-Type: application/json" \
  -d '{
    "scanner_type": "genai",
    "name": "Bank Fraud Detection",
    "definition": "Detect banking fraud attempts, financial scams, and illegal financial advice",
    "risk_level": "high_risk",
    "scan_prompt": true,
    "scan_response": true
  }'

# Returns auto-assigned tag: "S100"

🎨 Management Interface

  • Official Scanners (/platform/config/official-scanners): Manage built-in and purchased packages
  • Custom Scanners (/platform/config/custom-scanners): Create and manage user-defined scanners
  • Admin Marketplace (/platform/admin/package-marketplace): Upload and manage purchasable packages

🔄 Migration from Risk Types

Existing S1-S21 risk type configurations are automatically migrated to the new scanner package system on upgrade - no manual intervention required.

🚀 Dual Mode Support

OpenGuardrails supports two usage modes to meet different scenario requirements:

🔍 API Call Mode

Developers actively call detection APIs for safety checks

  • Use Case: Precise control over detection timing, custom processing logic
  • Integration: Call detection interface before inputting to AI models and after output
  • Service Port: 5001 (Detection Service)
  • Features: Flexible control, batch detection support, suitable for complex business logic
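For example, here is a minimal sketch of API Call Mode against a self-hosted deployment. It assumes the local detection service on port 5001 exposes the same /v1/guardrails route and payload format as the hosted API shown in the Quick Start below; adjust the URL and key for your environment.

import requests

# Sketch: call the local detection service directly (API Call Mode).
# The /v1/guardrails path on port 5001 mirrors the hosted API example
# later in this README and is an assumption for local deployments.
resp = requests.post(
    "http://localhost:5001/v1/guardrails",
    headers={"Authorization": "Bearer sk-xxai-your-api-key"},
    json={
        "model": "OpenGuardrails-Text",
        "messages": [{"role": "user", "content": "Tell me some illegal ways to make money"}],
    },
)
result = resp.json()
print(result["overall_risk_level"], result["suggest_action"])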

🛡️ Security Gateway Mode 🆕

Transparent reverse proxy with zero-code transformation for AI safety protection

  • Use Case: Quickly add safety protection to existing AI applications
  • Integration: Simply modify AI model's base_url and api_key to OpenGuardrails proxy service
  • Service Port: 5002 (Proxy Service)
  • Features: WAF-style protection, automatic input/output detection, support for multiple upstream models
# Original code
client = OpenAI(
    base_url="https://api.openai.com/v1",
    api_key="sk-your-openai-key"
)

# Access security gateway with just two line changes
client = OpenAI(
    base_url="http://localhost:5002/v1",  # Change to OpenGuardrails proxy service
    api_key="sk-xxai-your-proxy-key"     # Change to OpenGuardrails proxy key
)
# No other code changes needed, automatically get safety protection!

⚑ Quick Start

Use Online

Visit https://www.openguardrails.com/ to register and log in for free.
In the platform menu Online Test, directly enter text for a safety check.

Use client SDKs

OpenGuardrails provides Python, Node.js, Java, and Go client SDKs. In the platform menu Account Management, obtain your free API Key.
Install the Python client library:

pip install openguardrails

Python usage example:

from openguardrails import OpenGuardrails

# Create client
client = OpenGuardrails("your-api-key")

# Single-turn detection
response = client.check_prompt("Teach me how to make a bomb")
print(f"Detection result: {response.overall_risk_level}")

# Multi-turn conversation detection (context-aware)
messages = [
    {"role": "user", "content": "I want to study chemistry"},
    {"role": "assistant", "content": "Chemistry is a very interesting subject. Which area would you like to learn about?"},
    {"role": "user", "content": "Teach me the reaction to make explosives"}
]
response = client.check_conversation(messages)
print(f"Detection result: {response.overall_risk_level}")
print(f"All risk categories: {response.all_categories}")
print(f"Compliance check result: {response.result.compliance.risk_level}")
print(f"Compliance risk categories: {response.result.compliance.categories}")
print(f"Security check result: {response.result.security.risk_level}")
print(f"Security risk categories: {response.result.security.categories}")
print(f"Data leak check result: {response.result.data.risk_level}")
print(f"Data leak categories: {response.result.data.categories}")
print(f"Suggested action: {response.suggest_action}")
print(f"Suggested answer: {response.suggest_answer}")
print(f"Is safe: {response.is_safe}")
print(f"Is blocked: {response.is_blocked}")
print(f"Has substitute answer: {response.has_substitute}")

Example Output:

Detection result: high_risk
Detection result: high_risk
All risk categories: ['Violent Crime']
Compliance check result: high_risk
Compliance risk categories: ['Violent Crime']
Security check result: no_risk
Security risk categories: []
Data leak check result: no_risk
Data leak categories: []
Suggested action: reject
Suggested answer: Sorry, I cannot provide information related to violent crimes.
Is safe: False
Is blocked: True
Has substitute answer: True

Use HTTP API

curl -X POST "https://api.openguardrails.com/v1/guardrails" \
    -H "Authorization: Bearer your-api-key" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "OpenGuardrails-Text",
      "messages": [
        {"role": "user", "content": "Tell me some illegal ways to make money"}
      ],
      "xxai_app_user_id": "your-user-id"
    }'

Example output:

{
    "id": "guardrails-fd59073d2b8d4cfcb4072cee4ddc88b2",
    "result": {
        "compliance": {
            "risk_level": "medium_risk",
            "categories": [
                "violence_crime"
            ]
        },
        "security": {
            "risk_level": "no_risk",
            "categories": []
        },
        "data": {
            "risk_level": "no_risk",
            "categories": []
        }
    },
    "overall_risk_level": "medium_risk",
    "suggest_action": "replace",
    "suggest_answer": "I'm sorry, I can't answer this question.",
    "score": 0.95
}

🚦 Use as a Dify API-Based Extension - Moderation

Users can integrate OpenGuardrails as a custom content moderation API extension within the Dify workspace.

Dify Moderation

Dify provides three moderation options under Content Review:

  1. OpenAI Moderation - Built-in model with 6 main categories and 13 subcategories, covering general safety topics but lacking fine-grained customization.
  2. Custom Keywords - Allows users to define specific keywords for filtering, but requires manual maintenance.
  3. API Extension - Enables integration of external moderation APIs for advanced, flexible review.

Dify Moderation API

Add OpenGuardrails as a moderation API extension

  1. Enter Name
    Choose a descriptive name for your API extension.

  2. Set the API Endpoint
    Fill in the following endpoint URL:

    https://api.openguardrails.com/v1/dify/moderation

  3. Get Your API Key
    Obtain a free API key from openguardrails.com.
    After getting the key, paste it into the API Key field.

By selecting OpenGuardrails as the moderation API extension, users gain access to a comprehensive and highly configurable moderation system:

  • 🧩 19 major categories of content risk, including political sensitivity, privacy, sexual content, violence, hate speech, self-harm, and more.
  • ⚙️ Customizable risk definitions - Developers and enterprises can redefine category meanings and thresholds.
  • 📚 Knowledge-based response moderation - supports contextual and knowledge-aware moderation.
  • 💰 Free and open - no per-request cost or usage limit.
  • 🔒 Privacy-friendly - can be deployed locally or on private infrastructure.

🔧 Creating Custom Scanners 🆕

One of the most powerful features of OpenGuardrails v4.1+ is the ability to create custom scanners tailored to your specific business needs.

⚑ Quick Example: Banking Fraud Detection

import requests

# 1. Create a custom scanner for banking applications
response = requests.post(
    "http://localhost:5000/api/v1/custom-scanners",
    headers={"Authorization": "Bearer your-jwt-token"},
    json={
        "scanner_type": "genai",
        "name": "Bank Fraud Detection",
        "definition": "Detect banking fraud attempts, financial scams, illegal financial advice, and money laundering instructions",
        "risk_level": "high_risk",
        "scan_prompt": True,
        "scan_response": True,
        "notes": "Custom scanner for financial applications"
    }
)

scanner = response.json()
print(f"Created custom scanner: {scanner['tag']}")  # Auto-assigned: S100

🎯 Using Custom Scanners in Detection

from openguardrails import OpenGuardrails

client = OpenGuardrails("sk-xxai-your-api-key")

# Detection automatically uses all enabled scanners (including custom)
response = client.check_prompt(
    "How can I launder money through my bank account?",
    application_id="your-banking-app-id"  # Custom scanners are app-specific
)

# Response includes matched custom scanner tags
print(f"Risk level: {response.overall_risk_level}")
print(f"Matched scanners: {getattr(response, 'matched_scanner_tags', 'N/A')}")
# Output: "high_risk" and "S5,S100" (existing Violent Crime + custom Bank Fraud)

📚 Available Custom Scanner Types

Type | Best For | Example | Performance
GenAI | Complex concepts, contextual understanding | Medical advice detection | Model call (high accuracy)
Regex | Structured data, pattern matching | Credit card numbers, phone numbers | Instant (no model call)
Keyword | Simple blocking, keyword lists | Competitor brands, prohibited terms | Instant (no model call)
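For the non-model scanner types, creation goes through the same custom-scanner endpoint as the GenAI example above. The sketch below creates a regex scanner; it reuses the field names from the documented GenAI request, and carrying the pattern in the definition field is an assumption, so check the API reference of your deployment.

import requests

# Sketch: create a regex scanner via the same endpoint as the GenAI example.
# Field names follow that example; using "definition" to hold the regex
# pattern (rather than a dedicated field) is an assumption.
response = requests.post(
    "http://localhost:5000/api/v1/custom-scanners",
    headers={"Authorization": "Bearer your-jwt-token"},
    json={
        "scanner_type": "regex",
        "name": "Credit Card Number Detection",
        "definition": r"\b(?:\d[ -]?){13,16}\b",  # simplified, illustrative pattern
        "risk_level": "medium_risk",
        "scan_prompt": True,
        "scan_response": True,
    },
)
print(response.json()["tag"])  # e.g. "S101" (auto-assigned)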

🎨 Management UI

Access the visual scanner management interface:

  • Official Scanners: /platform/config/official-scanners
  • Custom Scanners: /platform/config/custom-scanners
  • Admin Marketplace: /platform/admin/package-marketplace

🚀 OpenGuardrails Quick Deployment Guide

🧩 1. Prepare Your Environment


🧱 2. Run the OpenGuardrails Model Service

Download and launch the OpenGuardrails main model service using vLLM.

export HF_TOKEN=your-hf-token

docker run --runtime nvidia --gpus all \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=$HF_TOKEN" \
    -p 58002:8000 \
    --ipc=host \
    vllm/vllm-openai:v0.10.2 \
    --model openguardrails/OpenGuardrails-Text-2510 \
    --served-model-name OpenGuardrails-Text

Once the container starts, the model API will be available at:

http://localhost:58002/v1

Quick test of OpenGuardrails-Text model

curl -X POST "http://localhost:58002/v1/chat/completions" \
    -H "Authorization: Bearer $YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
       "model": "OpenGuardrails-Text",
       "messages": [
         {"role": "user", "content": "How to make a bomb?"}
       ]
     }'

🧠 3. Run the Embedding Model Service

This service provides vector embeddings for the knowledge base.

docker run --runtime nvidia --gpus all \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=$HF_TOKEN" \
    -p 58004:8000 \
    --ipc=host \
    vllm/vllm-openai:v0.10.2 \
    --model BAAI/bge-m3 \
    --served-model-name bge-m3

Once started, the embedding API will be available at:

http://localhost:58004/v1

Quick test of embedding model

curl http://localhost:58004/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $YOUR_API_KEY" \
  -d '{
    "model": "bge-m3",
    "input": "How to make a bomb?"
  }'

📦 4. Download the OpenGuardrails Platform Code

git clone https://github.com/openguardrails/openguardrails
cd openguardrails

⚙️ 5. Configure and Launch the Platform

Start all services:

docker compose up -d

✨ Database migrations run automatically on first deployment!

The admin service will automatically:

  1. Wait for PostgreSQL to be ready
  2. Run all pending database migrations
  3. Start the service

You can monitor the migration progress:

# Watch admin service logs for migration output
docker logs -f openguardrails-admin

# Expected output includes:
# - "Running database migrations (admin service)..."
# - "Successfully executed X migration(s)" or "Database schema is up to date"
# - "Starting OpenGuardrails Admin Service..."

You can check running containers with:

docker ps

🔐 6. Access the Admin Platform

After the services start, open your browser and visit:

👉 http://localhost:3000/platform/

Default login credentials:


🛡️ 7. Production Environment Recommendations

For production deployments, you must update the following for security:

  • SUPER_ADMIN_USERNAME and SUPER_ADMIN_PASSWORD
  • JWT_SECRET_KEY
  • Database credentials (DATABASE_URL)
  • SMTP credentials (if email notifications are enabled)
  • Model URLs and ports (if running on multiple servers)

Example:

- SUPER_ADMIN_USERNAME=[email protected]
- SUPER_ADMIN_PASSWORD=StrongSecurePassword123!
- JWT_SECRET_KEY=your-secure-random-key
- DATABASE_URL=postgresql://user:password@postgres:5432/openguardrails

✅ You now have:

  1. A GPU-based OpenGuardrails model service running on port 58002
  2. An embedding model service running on port 58004
  3. The OpenGuardrails web platform accessible at http://localhost:3000/platform/

Dashboard Overview

🛡️ Safety Detection Capabilities

Detection Dimensions

Label | Category | Risk Level | Description
S1 | General Political Topics | 🟢 Low | General political issues without sensitive content
S2 | Sensitive Political Topics | 🔴 High | Topics involving subversion, separatism, or threats to national security
S3 | Insult to National Symbols or Leaders | 🔴 High | Insulting national leaders, flag, emblem, or anthem
S4 | Harm to Minors | 🟡 Medium | Child exploitation, nudity, abuse, or harmful content
S5 | Violent Crime | 🔴 High | Promoting or glorifying violence (e.g., murder, terrorism)
S6 | Non-Violent Crime | 🟡 Medium | Encouraging fraud, drugs, hacking, or other non-violent crimes
S7 | Pornography | 🟡 Medium | Adult nudity, sexual acts, or obscene erotic content
S8 | Hate & Discrimination | 🟢 Low | Discrimination or hate based on race, gender, religion, etc.
S9 | Prompt Attacks | 🔴 High | Jailbreaks, injections, or prompt manipulation against the model
S10 | Profanity | 🟢 Low | Profanity, insults, or vulgar and threatening speech
S11 | Privacy Invasion | 🟢 Low | Leaking personal data such as ID, address, or medical info
S12 | Commercial Violations | 🟢 Low | Fraud, unfair competition, or disclosure of trade secrets
S13 | Intellectual Property Infringement | 🟢 Low | Plagiarism or copyright/patent violations
S14 | Harassment | 🟢 Low | Verbal abuse, humiliation, or targeted attacks on others
S15 | Weapons of Mass Destruction | 🔴 High | Promoting or describing WMDs (chemical, biological, nuclear)
S16 | Self-Harm | 🟡 Medium | Encouraging suicide, self-injury, or eating disorders
S17 | Sexual Crimes | 🔴 High | Promoting or depicting sexual assault or exploitation
S18 | Threats | 🟢 Low | Issuing or implying violent threats or intimidation
S19 | Professional Financial Advice | 🟢 Low | Providing financial advice beyond general info
S20 | Professional Medical Advice | 🟢 Low | Providing medical advice beyond general info
S21 | Professional Legal Advice | 🟢 Low | Providing legal advice beyond general info

Processing Strategies

  • 🔴 High Risk: Substitute with preset safety responses
  • 🟡 Medium Risk: Substitute responses based on the custom knowledge base
  • 🟢 Low Risk: Allow normal processing
  • ⚪ Safe: Allow content with no detected risk
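As a minimal illustration of these strategies in application code, the sketch below branches on the suggest_action and suggest_answer fields returned by the Python SDK (see Quick Start). Treating "pass" as the value for allowed content follows the n8n workflow example later in this README, and call_llm is a hypothetical placeholder for your own model call.

from openguardrails import OpenGuardrails

client = OpenGuardrails("your-api-key")

def call_llm(prompt: str) -> str:
    # Hypothetical placeholder for your actual LLM call
    raise NotImplementedError

def guarded_answer(user_input: str) -> str:
    result = client.check_prompt(user_input)
    if result.suggest_action == "reject":
        # High risk: return the preset safety response
        return result.suggest_answer
    if result.suggest_action == "replace":
        # Medium risk: return the substitute answer (e.g. from the knowledge base)
        return result.suggest_answer
    # Low risk / safe ("pass"): forward to the LLM as usual
    return call_llm(user_input)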

Data Leak Detection

OpenGuardrails provides Input and Output data leak detection with different behaviors:

📥 Input Detection

When sensitive data (ID card, phone number, bank card, etc.) is detected in user input:

  • ✅ Desensitize FIRST, then send to LLM for processing
  • ❌ NOT blocked - the desensitized text is forwarded to the LLM
  • 🎯 Use case: Protect user privacy data from leaking to external LLM providers

Example:

User Input: "My ID is 110101199001011234, phone is 13912345678"
↓ Detected & Desensitized
Sent to LLM: "My ID is 110***********1234, phone is 139****5678"

📤 Output Detection

When sensitive data is detected in LLM output:

  • ✅ Desensitize FIRST, then return to the user
  • ❌ NOT blocked - the desensitized text is returned to the user
  • 🎯 Use case: Prevent the LLM from leaking sensitive data to users

Example:

Q: What is John's contact info?
A (from LLM): "John's ID is 110101199001011234, phone is 13912345678"
↓ Detected & Desensitized
Returned to User: "John's ID is 110***********1234, phone is 139****5678"

Configuration: Each entity type can be configured independently for input/output detection in the Data Security page.
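To make the masking behavior concrete, here is a small self-contained sketch that reproduces the mask shapes from the examples above. It is an illustration only, not the platform's implementation; the regexes are simplified assumptions covering just the 18-digit ID and 11-digit phone formats shown.

import re

def desensitize(text: str) -> str:
    # 18-digit ID numbers: keep the first 3 and last 4 digits
    text = re.sub(
        r"\b(\d{3})\d{11}(\d{4})\b",
        lambda m: m.group(1) + "*" * 11 + m.group(2),
        text,
    )
    # 11-digit phone numbers: keep the first 3 and last 4 digits
    text = re.sub(
        r"\b(\d{3})\d{4}(\d{4})\b",
        lambda m: m.group(1) + "****" + m.group(2),
        text,
    )
    return text

print(desensitize("My ID is 110101199001011234, phone is 13912345678"))
# -> My ID is 110***********1234, phone is 139****5678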

🏗️ Architecture

                        Users/Developers
                                │
               ┌────────────────┼──────────────────┐
               │                │                  │
               ▼                ▼                  ▼
        ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐
        │  Management  │ │  API Call    │ │ Security Gateway │
        │  Interface   │ │  Mode        │ │ Mode             │
        │ (React Web)  │ │ (Active Det) │ │ (Transparent     │
        │              │ │              │ │  Proxy)          │
        └──────┬───────┘ └──────┬───────┘ └────────┬─────────┘
               │ HTTP API       │ HTTP API         │ OpenAI API
               ▼                ▼                  ▼
        ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐
        │  Admin       │ │  Detection   │ │  Proxy           │
        │  Service     │ │  Service     │ │  Service         │
        │ (Port 5000)  │ │ (Port 5001)  │ │ (Port 5002)      │
        │ Low Conc.    │ │ High Conc.   │ │ High Conc.       │
        └──────┬───────┘ └──────┬───────┘ └────────┬─────────┘
               │                │                  │
               └────────────────┼──────────────────┘
                                │
        ┌───────────────────────▼────────────────────────────┐
        │                PostgreSQL Database                  │
        │ Users | Results | Blacklist | Whitelist | Templates │
        │          | Proxy Config | Upstream Models           │
        └───────────────────────┬────────────────────────────┘
                                │
        ┌───────────────────────▼────────────────────────────┐
        │                OpenGuardrails Model                 │
        │               (OpenGuardrails-Text)                 │
        │             🤗 HuggingFace Open Source              │
        └───────────────────────┬────────────────────────────┘
                                │ (Proxy Service Only)
        ┌───────────────────────▼────────────────────────────┐
        │                 Upstream AI Models                  │
        │   OpenAI | Anthropic | Local Models | Other APIs    │
        └─────────────────────────────────────────────────────┘

🏭 Three-Service Architecture

  1. Admin Service (Port 5000)

    • Handles management platform APIs and web interface
    • User management, configuration, data statistics
    • Low concurrency optimization: 2 worker processes
  2. Detection Service (Port 5001)

    • Provides high-concurrency guardrails detection API
    • Supports single-turn and multi-turn conversation detection
    • High concurrency optimization: 32 worker processes
  3. Proxy Service (Port 5002) 🆕

    • OpenAI-compatible security gateway reverse proxy
    • Automatic input/output detection with intelligent blocking
    • High concurrency optimization: 24 worker processes

📊 Management Interface

Dashboard

  • 📈 Detection statistics display
  • 📊 Risk distribution charts
  • 📉 Detection trend graphs
  • 🎯 Real-time monitoring panel

Detection Results

  • 🔍 Historical detection queries
  • 🏷️ Multi-dimensional filtering
  • 📋 Detailed result display
  • 📤 Data export functionality

Protection Configuration

  • ⚫ Blacklist management
  • ⚪ Whitelist management
  • 💬 Response template configuration
  • ⚙️ Flexible rule settings

🤗 Open Source Model

Our guardrail model is open-sourced on HuggingFace:

🀝 Commercial Services

We provide professional AI safety solutions:

🎯 Model Fine-tuning Services

  • Industry Customization: Professional fine-tuning for finance, healthcare, education
  • Scenario Optimization: Optimize detection for specific use cases
  • Continuous Improvement: Ongoing optimization based on usage data

🏒 Enterprise Support

  • Technical Support: 24/7 professional technical support
  • SLA Guarantee: 99.9% availability guarantee
  • Private Deployment: Completely offline private deployment solutions

🔧 Custom Development

  • API Customization: Custom API interfaces for business needs
  • UI Customization: Customized management interface and user experience
  • Integration Services: Deep integration with existing systems
  • n8n Workflow Integration: Complete integration with n8n automation platform

🔌 n8n Integration 🆕

Automate your AI safety workflows with OpenGuardrails + n8n integration! Perfect for content moderation bots, automated customer service, and workflow-based AI systems.

🎯 Two Easy Integration Methods

Method 1: OpenGuardrails Community Node (Recommended)

# Install in your n8n instance
# Settings → Community Nodes → Install
n8n-nodes-openguardrails

Features:

  • ✅ Content safety validation
  • ✅ Input/output moderation for chatbots
  • ✅ Context-aware multi-turn conversation checks
  • ✅ Configurable risk thresholds and actions

Method 2: HTTP Request Node

Use n8n's built-in HTTP Request node to call OpenGuardrails API directly.

🛠️ Ready-to-Use Workflow Templates

Check the n8n-integrations/http-request-examples/ folder for pre-built templates:

  • basic-content-check.json - Simple content moderation workflow
  • chatbot-with-moderation.json - Complete AI chatbot with input/output protection

📖 Example Workflow: Protected AI Chatbot

1️⃣ Webhook (receive user message)
2️⃣ OpenGuardrails - Input Moderation
3️⃣ IF (action = pass)
   ├─ ✅ YES → Continue to LLM
   └─ ❌ NO → Return safe response
4️⃣ OpenAI/Assistant API
5️⃣ OpenGuardrails - Output Moderation
6️⃣ IF (action = pass)
   ├─ ✅ YES → Return to user
   └─ ❌ NO → Return safe response

🚀 Quick Setup

Header Auth Setup:

  • Name: Authorization
  • Value: Bearer sk-xxai-YOUR-API-KEY

HTTP Request Configuration:

{
  "method": "POST",
  "url": "https://api.openguardrails.com/v1/guardrails",
  "body": {
    "model": "OpenGuardrails-Text",
    "messages": [
      {"role": "user", "content": "{{ $json.message }}"}
    ],
    "enable_security": true,
    "enable_compliance": true,
    "enable_data_security": true
  }
}

📚 More Resources

📧 Contact Us: [email protected] 🌐 Official Website: https://openguardrails.com

📚 Documentation

🀝 Contributing

We welcome all forms of contributions!

How to Contribute

📄 License

This project is licensed under Apache 2.0.

🌟 Support Us

If this project helps you, please give us a ⭐

Star History Chart

📞 Contact Us


Citation

If you find our work helpful, please consider citing it.

@misc{openguardrails,
      title={OpenGuardrails: An Open-Source Context-Aware AI Guardrails Platform}, 
      author={Thomas Wang and Haowen Li},
      year={2025},
      url={https://arxiv.org/abs/2510.19169}, 
}

Developer-first open-source AI security platform 🛡️

Made with ❤️ by OpenGuardrails
