Commit ba8fc82

add risk categories
Signed-off-by: Praneeth Bedapudi <praneeth@bpraneeth.com>
1 parent 7dd9360 commit ba8fc82

1 file changed

Lines changed: 29 additions & 13 deletions

docs/llm-gateway/features/security-guardrails.md

@@ -56,27 +56,43 @@ Adversarial categories only support `block` and `monitor` - redact/anonymize fal
 
 ## Per-Category Risk Level
 
-Each data-risk category has a configurable **Risk Level** that controls how aggressively it fires. Each individual detection carries an internal severity (High / Medium / Low); the Risk Level you set is the *threshold* that decides which severities are caught.
+Each data-risk category has a configurable **Risk Level** that controls how wide a net the category casts. Raise it to catch more sub-categories; lower it to limit detections to only the most unambiguous values.
 
-:::caution Risk Level is inverted vs. detection severity
-Risk Level describes how much risk the category represents to you, not the minimum severity to catch. Setting Risk Level to **High** means you treat this category as high-risk and want the widest net - including low-severity matches. Setting it to **Low** means you only want the strongest, most confident matches.
-:::
+| Risk Level | What fires |
+|------------|------------|
+| **Low** | Only the most unambiguous sub-categories - clearly sensitive values like unique identifiers or structured credentials. |
+| **High** | Everything Low catches, plus weaker, contextual sub-categories like names, emails, and amounts. |
 
-| Risk Level | Detections caught | Use when |
-|------------|-------------------|----------|
-| **High** | High + Medium + Low severity | Category is critical - widest net, highest recall, more false positives |
-| **Medium** | High + Medium severity | Balanced - drops the noisiest tier |
-| **Low** | High severity only | You only want alerts on strong, high-confidence matches |
+### Which sub-categories fire at which Risk Level
 
-So if PII is set to Risk Level **High**, a low-severity detection like a bare person name (e.g. `NAME`) will still fire. Drop PII to Risk Level **Low** and only unambiguous matches (e.g. a full SSN) will fire.
+| Category | Fires at Risk = Low | Also fires at Risk = High |
+|---|---|---|
+| **PII** | SSN, Passport, Driver's License, National ID | Name, Email, Phone, Date of Birth, Home Address, Employee ID |
+| **PHI** | Medical Appointment, Medical Record Number, Prescription | Medical Facility, Medical Condition, Medical Treatment |
+| **PFI** | Bank Account, Bank Identification Code, PAN Card, Tax Information | Financial Amount, Invoice, Payment Processor, Transaction ID, Customer ID |
+| **PCI** | Credit/Debit Card ||
+| **Auth & Secrets** | Access Token, API Key, AWS Credentials, Password | Username, Username or Alias |
 
-### Category-level Risk Level
+### Examples (PII)
 
-Pick the Risk Level for the whole category. For example, PII at **High** catches emails, phone numbers, and names even when context is weak; PII at **Low** only fires on clearly sensitive values.
+How the same input is evaluated at different Risk Levels:
+
+| Request body | Risk = Low | Risk = High |
+|---|---|---|
+| `My name is Praneeth Bedapudi and I live in Bengaluru` | Allowed | Detected (`NAME`, `HOME ADDRESS`) |
+| `Reach me at praneeth@example.com or +91-98765-43210` | Allowed | Detected (`EMAIL ADDRESS`, `PHONE NUMBER`) |
+| `My passport number is M1234567` | Detected (`PASSPORT NUMBER`) | Detected (`PASSPORT NUMBER`) |
+| `SSN 123-45-6789` | Detected (`SOCIAL SECURITY NUMBER`) | Detected (`SOCIAL SECURITY NUMBER`) |
+
+And a mixed example across categories, both at **Risk = Low**:
+
+| Request body | PII | PFI |
+|---|---|---|
+| `Praneeth Bedapudi, PAN ABCDE1234F` | Allowed (Name only fires at Risk = High) | Detected (`PAN CARD`) |
 
 ### Sub-category Risk Level
 
-Within a category, each sub-category (for example the individual PII types like email address, phone number, person name) can be pinned to its own Risk Level. Sub-category settings narrow the category-level setting - a sub-category won't fire at a severity the parent category has excluded.
+Within a category, each sub-category can be pinned to its own Risk Level to override the category-level setting.
 
 Raise the Risk Level to catch more. Lower it to cut noise on categories you only want alerts on when confidence is very high.
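The two-tier behaviour described in the added tables can be sketched as a small lookup. This is only an illustration of the documented semantics, not the gateway's implementation: the names `LOW_TIER`, `HIGH_ONLY_TIER`, `fires`, and the `overrides` argument are hypothetical, the tiers are transcribed from the PII and PFI rows only, and the handling of sub-category overrides is an assumption based on the one-line description in the docs.

```python
# Hypothetical tiers transcribed from the sub-category table (PII and PFI rows only).
# These names are illustrative and are not the gateway's real data model.
LOW_TIER = {
    "PII": {"SSN", "Passport", "Driver's License", "National ID"},
    "PFI": {"Bank Account", "Bank Identification Code", "PAN Card", "Tax Information"},
}
HIGH_ONLY_TIER = {
    "PII": {"Name", "Email", "Phone", "Date of Birth", "Home Address", "Employee ID"},
    "PFI": {"Financial Amount", "Invoice", "Payment Processor", "Transaction ID", "Customer ID"},
}


def fires(category, sub_category, risk_level, overrides=None):
    """Return True if a detection of sub_category would fire.

    overrides maps a sub-category name to its own pinned Risk Level, which
    (by assumption) replaces the category-level setting for that sub-category.
    """
    effective = (overrides or {}).get(sub_category, risk_level)
    if sub_category in LOW_TIER.get(category, set()):
        return True  # Low-tier sub-categories fire at both Low and High
    if sub_category in HIGH_ONLY_TIER.get(category, set()):
        return effective == "High"  # contextual sub-categories need Risk = High
    return False  # sub-category not listed for this category


print(fires("PII", "Name", "Low"))       # False
print(fires("PII", "Name", "High"))      # True
print(fires("PFI", "PAN Card", "Low"))   # True
print(fires("PII", "Email", "Low", overrides={"Email": "High"}))  # True
```

The third call mirrors the mixed example in the diff: with both categories at Risk = Low, the PAN is detected under PFI while the name is allowed under PII.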
