docs/llm-gateway/features/security-guardrails.md
Lines changed: 29 additions & 13 deletions
@@ -56,27 +56,43 @@ Adversarial categories only support `block` and `monitor` - redact/anonymize fal
 
 ## Per-Category Risk Level
 
-Each data-risk category has a configurable **Risk Level** that controls how aggressively it fires. Each individual detection carries an internal severity (High / Medium / Low); the Risk Level you set is the *threshold* that decides which severities are caught.
+Each data-risk category has a configurable **Risk Level** that controls how wide a net the category casts. Raise it to catch more sub-categories; lower it to limit detections to only the most unambiguous values.
 
-:::caution Risk Level is inverted vs. detection severity
-Risk Level describes how much risk the category represents to you, not the minimum severity to catch. Setting Risk Level to **High** means you treat this category as high-risk and want the widest net - including low-severity matches. Setting it to **Low** means you only want the strongest, most confident matches.
-:::
+| Risk Level | What fires |
+|------------|------------|
+|**Low**| Only the most unambiguous sub-categories - clearly sensitive values like unique identifiers or structured credentials. |
+|**High**| Everything Low catches, plus weaker, contextual sub-categories like names, emails, and amounts. |
 
-| Risk Level | Detections caught | Use when |
-|------------|-------------------|----------|
-|**High**| High + Medium + Low severity | Category is critical - widest net, highest recall, more false positives |
-|**Medium**| High + Medium severity | Balanced - drops the noisiest tier |
-|**Low**| High severity only | You only want alerts on strong, high-confidence matches |
+### Which sub-categories fire at which Risk Level
 
-So if PII is set to Risk Level **High**, a low-severity detection like a bare person name (e.g. `NAME`) will still fire. Drop PII to Risk Level **Low** and only unambiguous matches (e.g. a full SSN) will fire.
+| Category | Fires at Risk = Low | Also fires at Risk = High |
+|---|---|---|
+|**PII**| SSN, Passport, Driver's License, National ID | Name, Email, Phone, Date of Birth, Home Address, Employee ID |
+|**PHI**| Medical Appointment, Medical Record Number, Prescription | Medical Facility, Medical Condition, Medical Treatment |
+|**PFI**| Bank Account, Bank Identification Code, PAN Card, Tax Information | Financial Amount, Invoice, Payment Processor, Transaction ID, Customer ID |
+|**PCI**| Credit/Debit Card | — |
+|**Auth & Secrets**| Access Token, API Key, AWS Credentials, Password | Username, Username or Alias |
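
Reviewer sketch (not part of the diff): the table added above reads as a per-category lookup in which every sub-category has a lowest Risk Level at which it fires, and **High** includes everything in **Low**. A minimal Python model under that reading follows. `RISK_ORDER`, `SUB_CATEGORIES`, and `active_sub_categories` are hypothetical names chosen for illustration; they are not the gateway's API or configuration schema.

```python
from typing import Dict, List

# Assumption: only two Risk Levels exist after this change (the diff drops Medium).
RISK_ORDER: Dict[str, int] = {"low": 0, "high": 1}

# Transcribed from the table in the diff: for each category, the sub-categories
# grouped by the lowest Risk Level at which they fire.
SUB_CATEGORIES: Dict[str, Dict[str, List[str]]] = {
    "PII": {
        "low": ["SSN", "Passport", "Driver's License", "National ID"],
        "high": ["Name", "Email", "Phone", "Date of Birth",
                 "Home Address", "Employee ID"],
    },
    "PHI": {
        "low": ["Medical Appointment", "Medical Record Number", "Prescription"],
        "high": ["Medical Facility", "Medical Condition", "Medical Treatment"],
    },
    "PFI": {
        "low": ["Bank Account", "Bank Identification Code", "PAN Card",
                "Tax Information"],
        "high": ["Financial Amount", "Invoice", "Payment Processor",
                 "Transaction ID", "Customer ID"],
    },
    "PCI": {"low": ["Credit/Debit Card"], "high": []},
    "Auth & Secrets": {
        "low": ["Access Token", "API Key", "AWS Credentials", "Password"],
        "high": ["Username", "Username or Alias"],
    },
}


def active_sub_categories(category: str, risk_level: str) -> List[str]:
    """Return the sub-categories that fire for a category at a Risk Level."""
    level = RISK_ORDER[risk_level.lower()]
    active: List[str] = []
    for tier, names in SUB_CATEGORIES[category].items():
        # A tier is active when the configured level is at least that tier.
        if RISK_ORDER[tier] <= level:
            active.extend(names)
    return active
```

Under this model, `active_sub_categories("PCI", "Low")` and `active_sub_categories("PCI", "High")` return the same list, which matches the empty right-hand cell in the PCI row.
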
 
-### Category-level Risk Level
+### Examples (PII)
 
-Pick the Risk Level for the whole category. For example, PII at **High** catches emails, phone numbers, and names even when context is weak; PII at **Low** only fires on clearly sensitive values.
+How the same input is evaluated at different Risk Levels:
+
+| Request body | Risk = Low | Risk = High |
+|---|---|---|
+|`My name is Praneeth Bedapudi and I live in Bengaluru`| Allowed | Detected (`NAME`, `HOME ADDRESS`) |
+|`Reach me at praneeth@example.com or +91-98765-43210`| Allowed | Detected (`EMAIL ADDRESS`, `PHONE NUMBER`) |
+|`My passport number is M1234567`| Detected (`PASSPORT NUMBER`) | Detected (`PASSPORT NUMBER`) |
+And a mixed example across categories, both at **Risk = Low**:
+
+| Request body | PII | PFI |
+|---|---|---|
+|`Praneeth Bedapudi, PAN ABCDE1234F`| Allowed (Name only fires at Risk = High) | Detected (`PAN CARD`) |
 
 ### Sub-category Risk Level
 
-Within a category, each sub-category (for example the individual PII types like email address, phone number, person name) can be pinned to its own Risk Level. Sub-category settings narrow the category-level setting - a sub-category won't fire at a severity the parent category has excluded.
+Within a category, each sub-category can be pinned to its own Risk Level to override the category-level setting.
 
 Raise the Risk Level to catch more. Lower it to cut noise on categories you only want alerts on when confidence is very high.
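
Reviewer sketch (not part of the diff): the new sub-category wording says a pinned level overrides the category-level setting, where the removed text said it could only narrow it. One possible reading is sketched below, assuming a pinned level simply replaces the category level when deciding whether that sub-category fires. The function names and the `min_level` parameter (the lowest Risk Level at which a sub-category fires) are hypothetical and not taken from the gateway.

```python
from typing import Optional

# Assumption: two Risk Levels, with High including everything Low catches.
RISK_ORDER = {"low": 0, "high": 1}


def effective_risk_level(category_level: str,
                         sub_category_level: Optional[str] = None) -> str:
    """Use the sub-category pin when present, else the category setting."""
    return (sub_category_level or category_level).lower()


def fires(min_level: str, category_level: str,
          sub_category_level: Optional[str] = None) -> bool:
    """True if a sub-category that starts firing at min_level is active."""
    effective = effective_risk_level(category_level, sub_category_level)
    return RISK_ORDER[min_level.lower()] <= RISK_ORDER[effective]


# Email only fires at High. With PII at Low it is silent unless pinned to High.
email_default = fires("high", category_level="Low")
email_pinned = fires("high", category_level="Low", sub_category_level="High")
```

Here `email_default` is `False` and `email_pinned` is `True`. Whether the product behaves this way when a pin is lower than the category level is not stated in the diff and would be worth confirming in review.
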