defCommunicateWithRequiredDataKeys heuristic creates unusable schemas for most keys #2

@obra

Context

Toil declares per-node required data keys via `outputs:` in workflow YAML, which become the `SERF_SUBMIT_RESULT_REQUIRED_DATA_KEYS` env var. Serf's `cmdutil.go:36` reads it and wraps the communicate tool via `agent.WithCommunicateRequiredDataKeys(...)`.

The bug

`agent/profile_overrides.go:defCommunicateWithRequiredDataKeys` (lines 186–276) generates a JSON schema for the `output.data` field by guessing each required key's type from its name:

```go
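// Condensed paraphrase of the branch logic; k is the required key name.
// Schema literals are written as map[string]any for illustration only.
switch {
case k == "components": propSchema = componentsSchema()
case k == "tasks": propSchema = tasksSchema()
case strings.HasSuffix(k, "_doc"), strings.HasSuffix(k, "_document"), strings.HasSuffix(k, "_markdown"):
	propSchema = map[string]any{"type": "string"}
case strings.HasSuffix(k, "_list"), strings.HasSuffix(k, "_ids"), strings.HasSuffix(k, "s") && !strings.HasSuffix(k, "_results"):
	propSchema = map[string]any{"type": "array", "items": map[string]any{"type": "string"}}
default: propSchema = map[string]any{"type": "object", "additionalProperties": false}
}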
```

The default branch, combined with OpenAI strict mode, forces the key to be an empty object. `additionalProperties: false` + no `properties` definition means the model can only emit `{}`.
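
To make the constraint concrete, here is a minimal sketch of why a closed object with no declared properties admits only `{}`. The `objectSchema` type and its `accepts` method are hypothetical stand-ins for the small slice of JSON Schema involved; they are not serf or OpenAI code.

```go
package main

import "fmt"

// objectSchema is a hypothetical, minimal stand-in for the subset of JSON
// Schema that matters here: an object with named properties that is either
// open or closed to undeclared names.
type objectSchema struct {
	Properties           map[string]struct{} // declared property names
	AdditionalProperties bool                // whether undeclared names are allowed
}

// accepts reports whether every key in value is permitted by the schema.
func (s objectSchema) accepts(value map[string]any) bool {
	for k := range value {
		if _, declared := s.Properties[k]; !declared && !s.AdditionalProperties {
			return false
		}
	}
	return true
}

func main() {
	// What the default branch emits: no properties, additionalProperties false.
	def := objectSchema{}

	fmt.Println(def.accepts(map[string]any{}))                    // true
	fmt.Println(def.accepts(map[string]any{"steps": []string{}})) // false
}
```

Any key at all is "additional" when nothing is declared, so the only value that passes is the empty object.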

Observed failures

Two runs were affected by this today in toil's `pivot-cedar-thistle` deliver:

  • `surgeon` (in `plan_and_build.yaml`): when I declared `outputs: [plan]`, `data.plan` was forced to `{}`. The agent correctly observed the constraint ("Because the output schema only accepts an empty `data.plan` object here, the full structured plan is included below") and shoved the real plan into the `message` field as markdown. The `plan_reviewer` then rejected the (empty) plan.

  • `e2e_tester` (in `integrate.yaml`): pre-existing, with `outputs: [story_results]`. Because `story_results` ends in `_results`, it hits the `_results` exception in the array heuristic, falls through to `default`, and is forced empty. This is silent data loss: `fern-signal-river` has `data.story_results = {}` despite the e2e_tester having content.

Only two serf nodes in toil declare `outputs:` today, and both are broken by this.

Why the heuristic doesn't work

Naming-based type inference guesses at a shape without enough information. Any output key whose name is not `components` or `tasks` and doesn't match one of the narrow suffix patterns (`_doc`, `_document`, `_markdown`, `_list`, `_ids`, plural-s) becomes `additionalProperties: false` with no properties, i.e., required to be empty. That's the opposite of what the caller wanted.
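
The branch order can be replayed on a few key names with a small sketch. `inferKind` is an illustrative re-implementation based only on the description in this issue, not the code in `agent/profile_overrides.go`. `design_doc`, `task_ids`, and `status` are hypothetical key names chosen to exercise the branches.

```go
package main

import (
	"fmt"
	"strings"
)

// inferKind mirrors the branch order described in this issue and returns a
// label for the schema each key would receive. Illustrative only.
func inferKind(k string) string {
	hasSuffix := func(suffixes ...string) bool {
		for _, s := range suffixes {
			if strings.HasSuffix(k, s) {
				return true
			}
		}
		return false
	}
	switch {
	case k == "components", k == "tasks":
		return "dedicated schema"
	case hasSuffix("_doc", "_document", "_markdown"):
		return "string"
	case hasSuffix("_list", "_ids"), hasSuffix("s") && !hasSuffix("_results"):
		return "array of strings"
	default:
		return "empty object"
	}
}

func main() {
	keys := []string{"plan", "story_results", "design_doc", "task_ids", "status"}
	for _, k := range keys {
		fmt.Printf("%s -> %s\n", k, inferKind(k))
	}
}
```

Under these rules both `plan` and `story_results` land in the empty-object branch, while a singular name such as `status` is treated as an array of strings simply because it ends in `s`.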

Options for a fix

  • Don't emit a schema for unknown keys. When the caller only declares a key name (no shape), the schema should permit any value. But OpenAI strict mode rejects `{type: object}` without `additionalProperties: false`.
  • Use `additionalProperties: true` for unknown-shape keys. Not accepted by OpenAI strict mode.
  • Let toil declare full schemas. Change `node.outputs` in toil to accept a JSON schema per key rather than just a name, and pass the schema through verbatim. Forces workflow authors to specify shape, which they should anyway.
  • Drop OpenAI-strict-mode requirement for this call. Accept that the model may emit non-strict JSON and validate post-hoc. Loses the compile-time guarantee.
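
For the last option, post-hoc validation could be as small as the following sketch. `missingOrEmpty` is a hypothetical helper, and treating null, `""`, `[]`, and `{}` as "empty" is an assumption made for illustration; serf may want a different rule.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// isEmpty reports whether a decoded JSON value carries no content.
func isEmpty(v any) bool {
	switch t := v.(type) {
	case nil:
		return true
	case string:
		return t == ""
	case []any:
		return len(t) == 0
	case map[string]any:
		return len(t) == 0
	}
	return false
}

// missingOrEmpty returns the required keys that are absent from the data
// object or present but empty, in the order they were required.
func missingOrEmpty(raw []byte, required []string) ([]string, error) {
	var data map[string]any
	if err := json.Unmarshal(raw, &data); err != nil {
		return nil, fmt.Errorf("data is not a JSON object: %w", err)
	}
	var bad []string
	for _, k := range required {
		if v, ok := data[k]; !ok || isEmpty(v) {
			bad = append(bad, k)
		}
	}
	return bad, nil
}

func main() {
	raw := []byte(`{"plan": {}, "story_results": {"passed": 3}}`)
	bad, err := missingOrEmpty(raw, []string{"plan", "story_results"})
	fmt.Println(bad, err)
}
```

A check like this would have flagged the empty `data.plan` at submit time instead of letting a downstream reviewer reject it.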

Probably the right answer is the third option (letting toil declare full schemas) combined with a fallback: accept either a key name (and emit a permissive `{}` schema that allows any JSON) or a key+schema pair. Toil's side would need a schema migration too.

Workaround in toil

Until this is fixed: don't declare `outputs: [key]` on serf nodes unless the key name matches a heuristic branch (`components`, `tasks`, `_doc`, `_document`, `_markdown`, `_list`, `_ids`, or a plural-s name not ending in `_results`). Reverted the two toil declarations in prime-radiant-inc/toil#3f79143. The validation warning about undeclared outputs is cosmetic; the runtime breakage was real.

Affected

Any caller that sets `SERF_SUBMIT_RESULT_REQUIRED_DATA_KEYS` with keys that don't hit one of the heuristic branches. In practice: toil's software-factory workflows.
