- **Groves**: declarative multi-agent orchestration via GROVE.md manifests. Define topology, bootstrap, governance, confinement, schemas, and spawn contracts in a single file — then run it.
- **Bootstrap**: pre-fill task creation forms from file references or inline values.
- **Governance**: dual-layer rule enforcement — prompt guidance (LLM reads the rule) AND runtime blocking (code enforces the rule). Supports `shell_pattern_block` and `action_block` hard rules.
- **Confinement**: per-skill filesystem restrictions enforced at the `file_read`, `file_write`, and `execute_shell` action layer.
- **Schema validation**: JSON Schema (Draft 2020-12) validation on `file_write` actions matching glob path patterns.
- **Spawn contracts**: declared parent→child topology edges with auto-injection of skills, profile (with fallback), and constraints.
- **PathSecurity**: three-layer defense against path traversal and symlink attacks across all grove resolvers.
- Two example groves shipped: **livebench** (6-category benchmark) and **mmlu-pro** (14-subject multiple-choice benchmark).
- Children context enrichment: agents see their children's latest message preview and status directly in the prompt.
- Correction feedback injection: parent feedback is injected into child prompts with lifecycle tracking and root stall notification when children stop progressing.
### Fixed
- Per-model consensus queries now run in parallel instead of sequentially.
- Context window overflow from insufficient token safety margins across multiple providers.
- Correction feedback cleared by queued messages arriving during retry.
- Missing `:id` field in system stall messages crashing the Mailbox UI.
- Empty children signal not injected when no live children exist.
- GPT wait-stall pattern causing agents to idle indefinitely.
- LLM receive timeout too low for slow providers (increased to 300s).
### Changed
- Comprehensive Groves documentation added to README.
- DRY refactors: children tracking, skill metadata construction, single-model persist path, PerModelQuery extraction.
CHANGELOG.md: 30 additions & 0 deletions
@@ -5,6 +5,36 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.2.0] - 2026-03-10

### Added

- **Groves**: declarative multi-agent orchestration via GROVE.md manifests. Define topology, bootstrap, governance, confinement, schemas, and spawn contracts in a single file, then run it.
- **Bootstrap**: pre-fill task creation forms from file references or inline values.
- **Governance**: dual-layer rule enforcement: prompt guidance (LLM reads the rule) AND runtime blocking (code enforces the rule). Supports `shell_pattern_block` and `action_block` hard rules.
- **Confinement**: per-skill filesystem restrictions enforced at the `file_read`, `file_write`, and `execute_shell` action layer.
- **Schema validation**: JSON Schema (Draft 2020-12) validation on `file_write` actions matching glob path patterns.
- **Spawn contracts**: declared parent→child topology edges with auto-injection of skills, profile (with fallback), and constraints.
- **PathSecurity**: three-layer defense against path traversal and symlink attacks across all grove resolvers.
- Two example groves shipped: **livebench** (6-category benchmark) and **mmlu-pro** (14-subject multiple-choice benchmark).
- Children context enrichment: agents see their children's latest message preview and status directly in the prompt.
- Correction feedback injection: parent feedback is injected into child prompts with lifecycle tracking and root stall notification when children stop progressing.

### Fixed

- Per-model consensus queries now run in parallel instead of sequentially.
- Context window overflow from insufficient token safety margins across multiple providers.
- Correction feedback cleared by queued messages arriving during retry.
- Missing `:id` field in system stall messages crashing the Mailbox UI.
- Empty children signal not injected when no live children exist.
- GPT wait-stall pattern causing agents to idle indefinitely.
- LLM receive timeout too low for slow providers (increased to 300s).

### Changed

- Comprehensive Groves documentation added to README.

- [Creating Your Own Grove](#creating-your-own-grove)
- [Security](#security)
- [Tech Rundown](#tech-rundown)
- [Agent Architecture](#agent-architecture)

@@ -318,6 +328,255 @@ The prompt fields map directly to how the LLM sees its instructions. A few patte
**Approach Guidance** is your chance to nudge the methodology without mandating it. "Consider using blue-green deployment" is a suggestion; putting it in constraints makes it a rule.

## Groves

Skills tell a single agent _how to do something_. But non-trivial tasks need _trees_ of agents working together -- a coordinator dispatching workers, governance rules that apply to the whole hierarchy, schemas that keep shared data consistent, filesystem boundaries that keep agents in their lane. When all of that coordination lives in natural language inside skill files, agents forget rules under context pressure, misconfigure children, and produce malformed output. The failure rate compounds with tree depth.

Groves solve this by moving coordination logic out of prose and into a machine-readable manifest that Quoracle enforces mechanically. A grove declares the full agent tree: who spawns whom, what rules apply, what data contracts exist, and how to start the whole thing with a single click.

**The analogy:** If skills are Docker containers, groves are Docker Compose files. Individual containers are useful on their own. A compose file declares how they work together -- networking, volumes, startup order, shared config. You `docker-compose up` instead of manually starting each container with the right flags.

### What's in a Grove

A grove is a directory containing a manifest (`GROVE.md`), skills, governance rules, schemas, and bootstrap configuration:

```
~/.quoracle/groves/
  my-grove/
    GROVE.md                    # Manifest (required)
    skills/                     # Skills belonging to this grove
      coordinator/
        SKILL.md
      worker/
        SKILL.md
    governance/                 # Rules injected into agent prompts
      safety-rules.md
    schemas/                    # JSON Schema for data validation
      output.schema.json
    bootstrap/                  # Pre-fills the task creation form
      global-context.md
      task-description.md
      success-criteria.md
    scripts/                    # Supporting tooling
    README.md                   # Optional documentation
```

By default, Quoracle looks for groves in `~/.quoracle/groves/`. You can change this in **Settings > System**. Quoracle also ships with example groves in `priv/groves/` -- copy them to your groves directory to use them.

**Skill resolution:** When an agent requests a skill by name, Quoracle checks the active grove's `skills/` directory first, then falls back to the global `~/.quoracle/skills/` directory. Grove-local skills shadow global skills of the same name, so a grove can carry customized versions without affecting anything else.
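
To make the lookup order concrete, here is a minimal Python sketch. It is an illustration, not Quoracle's resolver: the function name `resolve_skill` is hypothetical, and it assumes every skill lives at `<skills dir>/<name>/SKILL.md`, as in the directory layout above.

```python
from pathlib import Path
from typing import Optional


def resolve_skill(name: str, grove_dir: Path, global_skills_dir: Path) -> Optional[Path]:
    """Return the SKILL.md path for `name`, preferring the grove-local copy."""
    candidates = [
        grove_dir / "skills" / name / "SKILL.md",  # grove-local copy shadows the global one
        global_skills_dir / name / "SKILL.md",     # global fallback
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None  # the skill exists in neither location
```

Because the grove-local path is checked first, a grove can override a global skill just by shipping a directory with the same name.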

### The GROVE.md Manifest

`GROVE.md` uses the same format as `SKILL.md`: YAML frontmatter between `---` delimiters. The frontmatter declares everything Quoracle needs to bootstrap and enforce the agent tree:

```yaml
---
name: my-grove
description: >
  Multi-agent research system. Coordinator dispatches
```

The `bootstrap` section pre-fills the task creation form when you select a grove from the dropdown on the dashboard. Instead of copy-pasting role descriptions, constraints, and context into a dozen form fields every time, you select the grove and the form fills itself.

Bootstrap supports two kinds of fields:

**File references** (read from the grove directory at selection time):

File paths are relative to the grove root. Path traversal attempts (`../`, absolute paths, symlinks escaping the grove) are rejected.
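
The rejection rule can be sketched in a few lines of Python. This is a hedged illustration, not the actual PathSecurity code (which isn't shown here); the function name `resolve_in_grove` is hypothetical, and it assumes that resolving symlinks and then checking containment is an acceptable way to express all three cases.

```python
from pathlib import Path


def resolve_in_grove(grove_root: Path, relative: str) -> Path:
    """Resolve `relative` against the grove root, rejecting anything that escapes it."""
    if Path(relative).is_absolute():
        raise ValueError(f"absolute paths are not allowed: {relative}")
    root = grove_root.resolve()
    # resolve() collapses ".." segments and follows symlinks, so both
    # traversal and symlink escapes show up as a path outside `root`.
    candidate = (root / relative).resolve()
    try:
        candidate.relative_to(root)
    except ValueError:
        raise ValueError(f"path escapes the grove: {relative}") from None
    return candidate
```

Checking the resolved path, rather than scanning the string for `../`, means a symlink inside the grove that points elsewhere is caught by the same test.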

### Governance

Governance rules are the things you _really_ don't want an agent to forget under context pressure. Instead of inlining "CRITICAL: Never run destructive commands" into every skill and hoping the LLM retains it, you declare it once in the manifest and Quoracle enforces it at two layers: the prompt (so the model knows the rule) _and_ the runtime (so it can't violate it even if it tries).

**Hard rules** come in two types:

`shell_pattern_block` -- rejects shell commands matching a regex pattern before they execute:

```yaml
- type: shell_pattern_block
  pattern: "rm -rf|dd if="
  message: "Destructive commands are forbidden."
  scope: all
```

`action_block` -- blocks specific action types entirely:

```yaml
- type: action_block
  actions: [execute_shell, call_mcp]
  message: "Workers may not run shell commands or MCP tools."
  scope: [worker]
```

`scope` controls which skills the rule applies to. Use a list of skill names, or `all` for every agent in the grove.
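
Here is a rough Python sketch of how the two hard-rule types and `scope` could be evaluated at runtime. It mirrors the two YAML rules above but is not Quoracle's implementation: `check_action` and `in_scope` are hypothetical names, and the choices to match the pattern anywhere in the command and to evaluate rules in declaration order are assumptions.

```python
import re
from typing import Optional

# The two example rules from above, as already-parsed data.
RULES = [
    {"type": "shell_pattern_block", "pattern": r"rm -rf|dd if=",
     "message": "Destructive commands are forbidden.", "scope": "all"},
    {"type": "action_block", "actions": ["execute_shell", "call_mcp"],
     "message": "Workers may not run shell commands or MCP tools.",
     "scope": ["worker"]},
]


def in_scope(rule: dict, skill: str) -> bool:
    """A rule applies to every skill (`all`) or to an explicit list of skill names."""
    return rule["scope"] == "all" or skill in rule["scope"]


def check_action(skill: str, action: str, command: str = "") -> Optional[str]:
    """Return the blocking rule's message, or None if the action may proceed."""
    for rule in RULES:
        if not in_scope(rule, skill):
            continue
        if rule["type"] == "action_block" and action in rule["actions"]:
            return rule["message"]
        if (rule["type"] == "shell_pattern_block"
                and action == "execute_shell"
                and re.search(rule["pattern"], command)):  # assumption: match anywhere
            return rule["message"]
    return None
```

In this sketch a coordinator running `ls -la` passes, a coordinator running `rm -rf /tmp/x` is stopped by the pattern rule, and a worker is stopped on any `execute_shell` regardless of the command.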

**Injections** are governance documents that get auto-injected into agent system prompts:

```yaml
injections:
  - source: governance/safety-rules.md
    inject_into: [coordinator, worker]
    priority: high
```

`priority: high` places the content before skill content in the prompt. `normal` (default) places it after. Either way, delivery is guaranteed -- no manual inlining, no version drift, one source of truth.
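
The ordering can be pictured with a small Python sketch. It assumes the injection files have already been read into a `content` field (the manifest itself uses `source` paths), and `assemble_prompt` is a hypothetical name, not a Quoracle function.

```python
from typing import Dict, List


def assemble_prompt(skill: str, skill_content: str, injections: List[Dict]) -> str:
    """Place high-priority injections before the skill content, normal ones after."""
    high: List[str] = []
    normal: List[str] = []
    for injection in injections:
        if skill not in injection["inject_into"]:
            continue  # this injection does not target the skill
        # `normal` is the default when no priority is given.
        bucket = high if injection.get("priority", "normal") == "high" else normal
        bucket.append(injection["content"])
    return "\n\n".join(high + [skill_content] + normal)
```

A skill that no injection targets simply gets its own content back unchanged.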

### Filesystem Confinement

The `confinement` section declares which paths each skill is allowed to read and write. Quoracle enforces this at the action layer -- `file_read`, `file_write`, and `execute_shell` all check confinement before proceeding.

```yaml
confinement:
  coordinator:
    paths:                    # Read + write
      - ~/.quoracle/projects/runs/**
    read_only_paths:          # Read only
      - ~/.quoracle/projects/data/**
  worker:
    read_only_paths:
      - ~/.quoracle/projects/data/**
```

Patterns support `*` (single directory segment) and `**` (any depth). Tilde (`~`) is expanded at parse time. A coordinator that tries to write outside its declared paths gets a confinement violation error. A worker that tries to write _anything_ (only `read_only_paths` declared) gets blocked.

Skills not listed in `confinement` are unrestricted. This is intentional -- confinement is opt-in per skill, and unlisted skills get a log warning rather than a hard failure.
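
The matching rules above can be sketched in Python as follows. This is an illustrative model only: `glob_to_regex` and `check_access` are hypothetical names, the glob translation is one plausible reading of `*` and `**`, and the sketch skips the path normalization and log warning a real implementation would need.

```python
import os
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a confinement glob: `*` stays within one segment, `**` spans any depth."""
    pattern = os.path.expanduser(pattern)  # expand a leading ~
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")      # any depth, may cross "/"
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")   # single directory segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


def check_access(skill: str, action: str, path: str, confinement: dict) -> bool:
    """Return True if `skill` may perform `action` ("read" or "write") on `path`."""
    rules = confinement.get(skill)
    if rules is None:
        return True  # unlisted skills are unrestricted (confinement is opt-in)
    path = os.path.expanduser(path)
    writable = any(glob_to_regex(p).match(path) for p in rules.get("paths", []))
    if action == "write":
        return writable
    readable = any(glob_to_regex(p).match(path) for p in rules.get("read_only_paths", []))
    return writable or readable
```

With the example manifest loaded as a dict, the coordinator can write under `runs/` and read under `data/`, while the worker can only read under `data/`.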

### Schema Validation

The `schemas` section declares JSON Schema files that Quoracle validates file contents against before writing. If an agent tries to write a file with a missing required field or a wrong type, the write is rejected with field-level error messages that the agent can act on.

```yaml
schemas:
  - name: output.json
    definition: schemas/output.schema.json
    validate_on: file_write
    path_pattern: "runs/*/output.json"
```

`path_pattern` is a glob relative to the grove's `workspace`. Only files matching the pattern are validated -- everything else passes through. When multiple schemas match (unlikely but possible), the most specific pattern wins.

Schema files use standard JSON Schema (Draft 2020-12). Put them in the grove's `schemas/` directory.
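
To show the shape of the write-time check, here is a deliberately simplified Python sketch. It is a stand-in, not a JSON Schema validator: it only checks `required` and top-level `type` keywords, it uses the first matching pattern rather than the most specific one, and the names `matches`, `validate`, and `check_write` are hypothetical. A real setup would use a full Draft 2020-12 validator.

```python
import json
from fnmatch import fnmatchcase
from typing import Dict, List

# Minimal mapping from JSON Schema type names to Python types.
TYPES = {"string": str, "number": (int, float), "integer": int,
         "boolean": bool, "object": dict, "array": list}


def matches(pattern: str, rel_path: str) -> bool:
    """Segment-wise glob match, so `*` never crosses a "/"."""
    pats, segs = pattern.split("/"), rel_path.split("/")
    return len(pats) == len(segs) and all(
        fnmatchcase(seg, pat) for pat, seg in zip(pats, segs))


def validate(doc: Dict, schema: Dict) -> List[str]:
    """Return field-level error messages (empty list means the document passes)."""
    errors = []
    for field in schema.get("required", []):
        if field not in doc:
            errors.append(f"{field}: required field is missing")
    for field, spec in schema.get("properties", {}).items():
        expected = TYPES.get(spec.get("type"))
        if field in doc and expected and not isinstance(doc[field], expected):
            errors.append(f"{field}: expected {spec['type']}")
    return errors


def check_write(rel_path: str, content: str, schemas: List[Dict]) -> List[str]:
    """Validate only writes whose path matches a schema's path_pattern."""
    for entry in schemas:
        if matches(entry["path_pattern"], rel_path):
            return validate(json.loads(content), entry["schema"])
    return []  # no pattern matched: the write passes through unvalidated
```

The returned list is what makes the rejection actionable: each entry names the offending field, so the agent can fix it and retry the write.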

### Spawn Topology

The `topology` section declares the expected agent tree structure -- who spawns whom, and what gets auto-injected when they do.

```yaml
topology:
  root: coordinator
  edges:
    - parent: coordinator
      child: worker
      auto_inject:
        skills: [worker]
        profile: my-profile
```

When a coordinator spawns a child matching the `worker` skill, Quoracle automatically injects the declared skills and profile. The parent agent still decides _when_ to spawn, but doesn't have to remember _how_ to configure the child correctly.

**Auto-inject fields:**

- `skills` -- merged with any skills the parent already specified (union, no duplicates)
- `profile` -- used as a fallback if the parent didn't specify one
- `constraints` -- a file path (optionally with `#section-name` anchor) whose content gets merged with downstream constraints

Edges declare _valid_ relationships, not mandatory ones. An agent is free to not spawn a child if it doesn't need to. The topology is a declaration of what's expected, not an execution plan.
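
The merge semantics for `skills` and `profile` can be sketched in Python like this. It is a hypothetical model of the behavior described above, not Quoracle's code: `apply_auto_inject` and the shape of the spawn request dict are assumptions, and `constraints` merging is left out.

```python
from typing import Dict


def apply_auto_inject(spawn_request: Dict, edge: Dict) -> Dict:
    """Merge an edge's auto_inject settings into the parent's spawn request."""
    auto = edge.get("auto_inject", {})
    merged = dict(spawn_request)

    # skills: union with what the parent specified, keeping order, no duplicates
    skills = list(spawn_request.get("skills", []))
    for skill in auto.get("skills", []):
        if skill not in skills:
            skills.append(skill)
    merged["skills"] = skills

    # profile: only a fallback, the parent's explicit choice wins
    if not merged.get("profile"):
        merged["profile"] = auto.get("profile")
    return merged
```

So a parent that asks for `skills: [research]` with no profile ends up spawning a child with `[research, worker]` and `my-profile`, while a parent that already chose a profile keeps it.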

### Shipped Groves

Quoracle ships with two groves in `priv/groves/` that demonstrate the full feature set. Both are LLM benchmarks -- they make good examples because benchmarks naturally need everything groves offer: coordinated agent trees, strict governance (no cheating), filesystem confinement, and schema-validated output.

**mmlu-pro** -- 12,032 multiple-choice questions across 14 academic subjects. A coordinator dispatches one answerer per question, collects results, and scores everything via a shell script. Governance blocks internet access and external knowledge sources for answerers. Schema validation ensures well-formed reports.

**livebench** -- ~1,150 questions per release across 6 categories (math, reasoning, coding, language, data analysis, instruction following). Same coordinator/worker pattern with category-specific Python scoring scripts.

To try them:

```bash
mkdir -p ~/.quoracle/groves
cp -r priv/groves/mmlu-pro ~/.quoracle/groves/
cp -r priv/groves/livebench ~/.quoracle/groves/
```

Each grove has a `README.md` with setup instructions (dataset preparation, Python dependencies for scoring, etc.) -- read it before running. Then select the grove from the dropdown when creating a new task and the bootstrap config will pre-fill the form.

Of course, groves aren't just for benchmarks. Any multi-agent workflow benefits -- research pipelines, code review systems, data processing chains, anything where a coordinator delegates to specialized workers and you want the coordination enforced rather than hoped for.

### Creating Your Own Grove

The simplest grove is a `GROVE.md` with a `bootstrap` section -- it just pre-fills the task form. Add sections as your system grows more complex:

1. **Start with bootstrap.** Define the role, skills, and prompt fragments your root agent needs. This alone saves you from manually filling out the task form every time.

2. **Add governance when you need rules.** If you find yourself inlining "NEVER do X" into every skill, move it to a governance injection. If you need mechanical enforcement (not just prompting), add hard rules.

3. **Add topology when you spawn children.** If your root agent spawns child agents, declare the edges so children get auto-injected with the right skills and profiles.

4. **Add confinement when you need boundaries.** If agents should only read/write specific directories, declare the paths. This is especially important for agents with `local_execution` or `file_write` capabilities.

5. **Add schemas when you share structured data.** If agents write JSON files that other agents or scripts need to parse, define a JSON Schema so malformed writes are caught at write time rather than downstream.

Groves are the unit of distribution. If you build something useful, the entire grove directory is self-contained and shareable.

## Security

Quoracle stores API keys and secrets encrypted at rest using AES-256-GCM via [Cloak](https://hexdocs.pm/cloak_ecto). Sensitive values in action parameters (like `{{SECRET:my_api_key}}`) are resolved at execution time and scrubbed from results before they're fed back to the LLMs.

- Every component receives its PubSub instance as an explicit parameter -- no global topics, no named processes, no process dictionary. This means the full test suite of 5900+ tests runs with `async: true`.
+ Every component receives its PubSub instance as an explicit parameter -- no global topics, no named processes, no process dictionary. This means the full test suite of 6000+ tests runs with `async: true`.

## Configuration Reference

@@ -436,6 +695,7 @@ Things that work well:
- Capability-based action gating
- Persistent state with task restoration on restart
- Local/self-hosted model support (Ollama, vLLM, LM Studio, LlamaCpp, TGI)
- Grove-based task templates with JSON Schema validation for file writes and hard rule enforcement (shell pattern blocking, filesystem confinement per agent role)

@@ -13,4 +13,7 @@ Frontend assets for Phoenix web interface
- esbuild: ES2017 target, bundles JS to priv/static/assets/app.js (gitignored)
- Tailwind: Scans .ex/.heex, compiles CSS to priv/static/assets/app.css (gitignored)
- Heroicons: Embedded via Tailwind plugin
- Phoenix watchers: Auto-rebuild in dev

## JS Hooks

- **GrovePrefill**: Handles `grove_prefill` push_event from server. Populates or clears 13 form fields in NewTaskModal. On `{clear: true}`, clears all fields. Otherwise writes all payload values including empty strings (prevents stale carryover on grove A→B switching).