Commit 853cdaf
feat: implement Phase 6 safety & content filtering system
- Content classifier with 10-category regex pattern banks + academic mitigation
- PII detector (email, phone, SSN, credit card w/ Luhn, IPv4, DOB, passport, DL)
- Prompt injection detector (10 attack patterns + Unicode + base64 + entropy)
- Domain guards (medical, financial, legal) with warn/block modes + chain
- Input filter (5-stage pipeline) + output filter (3-stage pipeline)
- Central GuardrailEngine with sync/async variants and singleton
- Pure ASGI safety middleware (screens 6 POST endpoints, SSE streaming)
- Safety API routes (/v1/safety/status, /check, /audit)
- SafetyConfig (15 knobs) integrated into Settings
- Safety audit log (JSONL, SHA-256 hashing, ring buffer, file rotation)
- Fixed ThreatLevel string comparison bug in injection detector
- Fixed audit log rotation filename collision within same second
- 165 new safety tests (all passing), 337 total tests passing
- Updated Development_Roadmap.md Phase 6 as COMPLETE
1 parent 811dcb2 commit 853cdaf
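For reference, the "credit card w/ Luhn" check named in the PII detector bullet can be sketched as below. This is a minimal illustration of the standard Luhn checksum, not the code from this commit; the function name `luhn_valid` is hypothetical, and the detector's actual candidate matching and length rules are not shown here.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum.

    Hypothetical helper for illustration; separators such as spaces
    or dashes are ignored.
    """
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False

    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0


if __name__ == "__main__":
    print(luhn_valid("4111 1111 1111 1111"))  # well-known test card number
    print(luhn_valid("4111 1111 1111 1112"))  # last digit altered
```

Running a checksum after a regex match is a common way to reduce false positives, since an arbitrary 16-digit string passes Luhn only about one time in ten.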
16 files changed: +4403 -10 lines changed
File tree
- docs
- tests
- versaai
- api
- routes
- safety