Commit e3a67dd
fix(agents): resilient fenced tool-call extraction + builder fail-loudly (#1430)
Roughly 40% of the time, the Agent Builder told users "✅ Agent Created!"
without writing any agent to disk. The model wraps its real
`create_agent` call in a ` ```json ` fence, and the base parser skipped
all fenced JSON to avoid picking up documentation examples. As a result
the tool never ran and the model's hallucinated success text was
returned verbatim. The investigation also exposed four sites in the UI
that silently swap any unresolvable agent for ChatAgent, and per-agent
load errors that the registry swallowed.
After this fix, fenced tool calls are extracted when they are the only
candidate. An unfenced call wins when both kinds exist, and multiple
fenced calls with no unfenced call are treated as ambiguous
documentation, so the parser returns None. The builder reports success
only after `create_agent` has actually run and the agent file exists.
Unresolvable agents surface a user-friendly error instead of falling
back to ChatAgent. Registry load failures are recorded and exposed.
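
The candidate-selection rule for fenced and unfenced tool calls can be
sketched roughly as follows. This is a minimal illustration, not the
parser from this commit: the function names, the `"tool"` key used to
recognise a call, and the choice to take the first unfenced candidate
when several exist are all assumptions made for the example.

```python
import json
import re
from typing import Optional

# Build the fence marker programmatically so this sketch can itself sit in a fence.
FENCE = "`" * 3
FENCE_RE = re.compile(FENCE + r"(?:json)?\s*\n(.*?)" + FENCE, re.DOTALL)


def _scan_tool_calls(text: str) -> list[dict]:
    """Return every JSON object in text that has a "tool" key (assumed call shape)."""
    decoder = json.JSONDecoder()
    found = []
    pos = text.find("{")
    while pos != -1:
        try:
            obj, end = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            # Not valid JSON starting here; try the next opening brace.
            pos = text.find("{", pos + 1)
            continue
        if isinstance(obj, dict) and "tool" in obj:
            found.append(obj)
        pos = text.find("{", end)
    return found


def extract_tool_call(response: str) -> Optional[dict]:
    """Pick one tool call from a model response, or None if ambiguous or absent."""
    fenced_bodies = FENCE_RE.findall(response)
    unfenced = _scan_tool_calls(FENCE_RE.sub("", response))
    if unfenced:
        # Unfenced wins when both kinds exist (first one taken: an assumption).
        return unfenced[0]
    fenced = [call for body in fenced_bodies for call in _scan_tool_calls(body)]
    if len(fenced) == 1:
        # A single fenced call is the only candidate, so treat it as real.
        return fenced[0]
    # Zero candidates, or several fenced ones that look like documentation.
    return None
```

Returning None for several fenced candidates keeps the original goal of
not executing documentation examples, while a lone fenced call is no
longer dropped.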
Cross-reference: #1394 (separate bug — cold-start pre-flight timeout for
large models, not fixed here).
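
For the registry change described above (load failures recorded
instead of swallowed), a minimal sketch of the pattern might look like
this. `AgentRegistry`, `load_errors`, and the injected `loader` are
hypothetical names for illustration and are not taken from the GAIA
code.

```python
from pathlib import Path
from typing import Callable


class AgentRegistry:
    """Hypothetical registry that keeps per-agent load errors."""

    def __init__(self, loader: Callable[[Path], object]):
        self._loader = loader
        self.agents: dict[str, object] = {}
        self.load_errors: dict[str, str] = {}

    def discover(self, root: Path) -> None:
        """Load every agent directory under root, recording failures."""
        for agent_dir in sorted(p for p in root.iterdir() if p.is_dir()):
            try:
                self.agents[agent_dir.name] = self._loader(agent_dir)
            except Exception as exc:
                # Record the failure and keep going so one broken agent
                # directory does not hide the others.
                self.load_errors[agent_dir.name] = f"{type(exc).__name__}: {exc}"
```

Callers can then expose `load_errors` next to the agent list, so a
broken agent directory shows up as an explicit error rather than a
missing entry.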
## Test plan
- [x] `pytest tests/unit/agents/test_tool_call_extraction.py` — 20 tests
including verbatim Zephyr regression fixture — **PASS**
- [x] `pytest tests/unit/agents/test_builder_fail_loudly.py` — error
result and fabricated-success paths — **PASS**
- [x] `pytest tests/unit/agents/test_builder_fenced_integration.py` —
fenced call fires create_agent end-to-end (mocked LLM) — **PASS**
- [x] `pytest tests/unit/agents/test_registry_load_errors.py` — broken
agent dir records error, discovery continues — **PASS**
- [x] `pytest tests/unit/chat/ui/test_agent_unavailable.py` — all 4
fallback sites return user-friendly error, not ChatAgent — **PASS**
- [x] `gaia eval agent --category tool_selection` vs
`qwen-3.5-35b-3b51ca92` baseline — **no regression** (Strix Halo class
hardware; `multi_step_plan` improved FAIL→PASS; `smart_discovery` timing
outlier unrelated to parser change)
- [x] Real-world builder loop ×5 with distinct names against
`Qwen3.5-35B-A3B-GGUF` — **all 5 agents created**, `🔧 Executing
operation / Tool: create_agent` confirmed in server logs for every run,
all 5 appear in `GET /api/agents` via hot-reload (`orion`, `nebula`,
`pulsar`, `quasar`, `vortex`)
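
The fail-loudly behaviour exercised by the builder tests above can be
sketched as a guard that runs before any success text is shown. This is
an illustrative outline under assumptions: `ToolResult`,
`builder_reply`, and the error wording are hypothetical and do not come
from the commit.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class ToolResult:
    """Hypothetical record of one executed tool call."""
    tool: str
    ok: bool
    error: Optional[str] = None


def builder_reply(tool_result: Optional[ToolResult], agent_file: Path, model_text: str) -> str:
    """Return the model's success text only if the agent was really created."""
    if tool_result is None or tool_result.tool != "create_agent":
        # The model claimed success without the tool ever running.
        return "Agent was not created: create_agent was never called."
    if not tool_result.ok:
        return f"Agent was not created: {tool_result.error}"
    if not agent_file.is_file():
        # The tool reported success but nothing landed on disk.
        return f"Agent was not created: {agent_file} is missing."
    return model_text
```

Checking both the tool result and the file on disk means fabricated
success text from the model cannot reach the user on its own.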
Closes #1428
1 parent 30d5654 commit e3a67dd
10 files changed
Lines changed: 857 additions & 93 deletions
File tree
- src/gaia
- agents
- base
- builder
- ui
- tests/unit
- agents
- chat/ui