Commit b28e092
feat: add hallucination detection pipeline for output safety
Two-stage pipeline (R9) for detecting hallucinated content in LLM responses:
Stage 1 — Sentinel: Lightweight heuristic that determines if a response
needs detailed fact-checking via word overlap analysis. High-confidence
responses skip Stage 2 to save compute.
Stage 2 — Sentence-level detector: Splits response into sentences and
scores each against the user's prompt for factual consistency using a
cross-encoder model (vectara/hallucination_evaluation_model). Falls back
to heuristic scoring when ML model is unavailable.
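
The two stages above can be sketched with the heuristic path alone. This is a minimal illustration, not the code in hallucination_detector.rs: the function names (`tokens`, `overlap`, `split_sentences`, `flag_sentences`), the `sentinel_cutoff` parameter, and the use of plain word overlap for both stages are all assumptions made for the example. The cross-encoder model is not called here.

```rust
use std::collections::HashSet;

/// Lowercased alphanumeric tokens of a text (hypothetical helper).
fn tokens(text: &str) -> HashSet<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect()
}

/// Fraction of distinct `candidate` tokens that also occur in `reference`.
/// Returns 1.0 for an empty candidate, since there is nothing to check.
fn overlap(candidate: &str, reference: &str) -> f64 {
    let cand = tokens(candidate);
    if cand.is_empty() {
        return 1.0;
    }
    let refs = tokens(reference);
    let shared = cand.iter().filter(|t| refs.contains(*t)).count();
    shared as f64 / cand.len() as f64
}

/// Naive sentence splitter on '.', '!' and '?' (assumed; the real splitter may differ).
fn split_sentences(text: &str) -> Vec<&str> {
    text.split(|c: char| c == '.' || c == '!' || c == '?')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .collect()
}

/// Stage 1 (sentinel): skip responses whose overall overlap with the prompt
/// is at least `sentinel_cutoff`.
/// Stage 2: return every sentence whose overlap score falls below `threshold`.
fn flag_sentences<'a>(
    prompt: &str,
    response: &'a str,
    sentinel_cutoff: f64,
    threshold: f64,
) -> Vec<&'a str> {
    if overlap(response, prompt) >= sentinel_cutoff {
        return Vec::new(); // high confidence: Stage 2 is skipped
    }
    split_sentences(response)
        .into_iter()
        .filter(|s| overlap(s, prompt) < threshold)
        .collect()
}
```

Word overlap is only a crude proxy for factual consistency. In this sketch it plays the role of the heuristic fallback, and a model-based scorer would replace the `overlap` call in Stage 2.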
Changes:
- Add hallucination_detector.rs with HallucinationDetector struct
- Extend OutputSafetyConfig with hallucination_enabled, hallucination_model,
hallucination_threshold, hallucination_min_response_length fields
- Integrate into OutputAnalyzer via analyze_output_with_prompt()
- Add hallucination config section to config.example.yaml
- Add unit tests for the detector, threshold behaviour, skip-short-response
logic, sentence splitting, and result-to-findings conversion, plus
integration tests

1 parent 41b7f2b commit b28e092
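
The four configuration fields listed in the changes could be modelled roughly as follows. This is a hedged sketch: the struct name `OutputSafetyConfigSketch`, the field types, the default values for the threshold and minimum length, and the `should_check` helper are assumptions for illustration. Only the field names and the model identifier come from the commit message.

```rust
/// Hypothetical stand-in for the hallucination-related part of OutputSafetyConfig.
#[derive(Debug, Clone, PartialEq)]
struct OutputSafetyConfigSketch {
    /// Master switch for the hallucination pipeline.
    hallucination_enabled: bool,
    /// Model identifier used by Stage 2.
    hallucination_model: String,
    /// Score below which a sentence is reported (assumed range 0.0..=1.0).
    hallucination_threshold: f32,
    /// Responses shorter than this many characters are skipped (assumed unit).
    hallucination_min_response_length: usize,
}

impl Default for OutputSafetyConfigSketch {
    fn default() -> Self {
        Self {
            hallucination_enabled: false, // assumed default
            hallucination_model: "vectara/hallucination_evaluation_model".to_string(),
            hallucination_threshold: 0.5, // assumed default
            hallucination_min_response_length: 50, // assumed default
        }
    }
}

/// Skip-short-response logic: run detection only when the feature is enabled
/// and the response is at least the configured minimum length.
fn should_check(cfg: &OutputSafetyConfigSketch, response: &str) -> bool {
    cfg.hallucination_enabled
        && response.chars().count() >= cfg.hallucination_min_response_length
}
```

Counting characters rather than bytes keeps the length check consistent for non-ASCII responses. Whether the real implementation measures characters, bytes, or tokens is not stated in the commit message.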
7 files changed
Lines changed: 1285 additions & 9 deletions
File tree
- crates
- llmtrace-core/src
- llmtrace-proxy/src
- llmtrace-security/src