What variant of Codex are you using?
Codex CLI with local skills and hooks enabled.
What feature would you like to see?
I would like /goal to grow an optional long-run harness layer inspired by a userland Codex skill I have been building: Long Long Run (LLR).
LLR repo:
https://github.com/huahuadeliaoliao/long-long-run
Origin story / design notes:
https://github.com/huahuadeliaoliao/long-long-run/blob/main/docs/why-long-long-run.md
Codex 0.128.0's persisted /goal workflows are a strong foundation for long-running work. In practice, though, remembering that a goal exists is only part of the problem. The harder parts are:
- helping the user clarify the real goal before execution
- reducing intent noise when the user does not know the domain well enough to define the right target
- turning vague intent into a concrete, evidence-backed contract
- tracking which evidence still supports the current plan
- tolerating side questions or late constraints without losing the mainline
- avoiding premature stops when Codex already knows the next goal-covered action
- encouraging real domain exploration before simply validating an answer Codex already expects
LLR experiments with this through two modes:
- INC mode: Intent Noise Cancellation. Codex explores the repo/domain, surfaces assumptions, identifies risks, discovers expert framing, proposes hard acceptance criteria, and builds an evidence-backed contract before implementation.
- ACTIVE mode: the user explicitly authorizes Codex to pursue the confirmed contract as the mainline. Codex should continue through clear contract-covered next steps instead of stopping after every useful local update.
The feature I would like to see is not necessarily LLR copied into Codex directly. Rather, I would like /goal to support similar long-run harness semantics, either as built-in behavior or as extension points for skills/plugins.
A possible model:
- Add a calibration phase before execution
Before a goal becomes an active execution target, Codex could have an explicit calibration mode similar to LLR's INC mode.
In this phase, Codex would:
- explore the repo, task context, and domain
- clarify the user's real objective
- infer hidden requirements
- surface assumptions and risks
- discover current expert framing when the domain matters
- propose hard acceptance criteria
- ask the user to confirm the contract before execution
This is useful because users often cannot describe the correct goal until Codex has helped explore the space.
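The calibration phase above could be sketched as a small goal lifecycle: a goal starts in calibration and only becomes an active execution target once the user confirms the contract. The type and field names below are illustrative, not an actual /goal API.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class GoalPhase(Enum):
    CALIBRATION = auto()  # INC-style exploration and contract building
    ACTIVE = auto()       # user-authorized pursuit of the confirmed contract
    DONE = auto()

@dataclass
class GoalContract:
    objective: str
    assumptions: list[str] = field(default_factory=list)
    risks: list[str] = field(default_factory=list)
    acceptance_criteria: list[str] = field(default_factory=list)
    confirmed_by_user: bool = False

def try_activate(phase: GoalPhase, contract: GoalContract) -> GoalPhase:
    """Leave calibration only once the user has confirmed the contract."""
    if phase is GoalPhase.CALIBRATION and contract.confirmed_by_user:
        return GoalPhase.ACTIVE
    return phase
```

The point of the gate is that execution never starts from an unconfirmed contract, which is the INC-mode guarantee described above.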
- Track current effective evidence
A goal could maintain a lightweight record of current evidence, not just checkpoints.
Possible fields could include:
- user signals
- verified facts
- assumptions
- risks
- open decisions
- next action
- completion signal
The important distinction is:
- checkpoint = what happened
- evidence chain = what still matters and why it should guide the next action
LLR's evidence-chain design has been useful because long-running tasks often produce many artifacts, but later review needs to know which facts still support the mainline and which earlier assumptions were overturned.
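The evidence record with the fields listed above could look like the following sketch (hypothetical shape; field names are illustrative). The key operation is overturning an assumption: the disproven assumption is dropped so later review sees only facts that still support the mainline.

```python
from dataclasses import dataclass, field

@dataclass
class Evidence:
    user_signals: list[str] = field(default_factory=list)
    verified_facts: list[str] = field(default_factory=list)
    assumptions: list[str] = field(default_factory=list)
    risks: list[str] = field(default_factory=list)
    open_decisions: list[str] = field(default_factory=list)
    next_action: str = ""
    completion_signal: str = ""

    def overturn_assumption(self, assumption: str, fact: str) -> None:
        """Replace an overturned assumption with the verified fact,
        so the record reflects what still matters, not what happened."""
        if assumption in self.assumptions:
            self.assumptions.remove(assumption)
        self.verified_facts.append(fact)
```

Checkpoints would stay append-only history; this record is mutable by design, which is exactly the checkpoint-vs-evidence-chain distinction above.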
- Add side-thread tolerance
During an active goal, users often remember missing constraints or ask urgent side questions.
A goal workflow could help Codex distinguish whether the latest user message is:
- a side question
- a missing constraint
- a blocker
- a contract change
- a stop request
- a return-to-calibration request
For normal side questions, Codex should answer the user first, then resume the active goal if the main contract is unchanged.
This is one of the most useful parts of LLR in practice: the mainline keeps a compass even when the conversation temporarily walks onto a side path.
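The classification-and-routing idea above can be sketched as a small dispatcher (message kinds mirror the list above; the routing strings are illustrative, not an actual hook protocol):

```python
from enum import Enum, auto

class MessageKind(Enum):
    SIDE_QUESTION = auto()
    MISSING_CONSTRAINT = auto()
    BLOCKER = auto()
    CONTRACT_CHANGE = auto()
    STOP_REQUEST = auto()
    RECALIBRATE = auto()

def route(kind: MessageKind) -> str:
    """Answer-then-resume for side traffic; interrupt the mainline
    only when the contract itself is affected."""
    if kind is MessageKind.SIDE_QUESTION:
        return "answer, then resume active goal"
    if kind is MessageKind.MISSING_CONSTRAINT:
        return "fold constraint into contract, then resume"
    if kind in (MessageKind.CONTRACT_CHANGE, MessageKind.RECALIBRATE):
        return "return to calibration"
    if kind is MessageKind.STOP_REQUEST:
        return "stop"
    return "resolve blocker before resuming"
```

Only contract changes and recalibration requests abandon the mainline; everything else preserves it.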
- Add a premature-stop check
If Codex is about to stop while a goal is active, it could re-evaluate:
- Is the objective complete?
- Is the work blocked?
- Did the user ask to stop?
- Has the contract changed enough to require recalibration?
- Is there still a clear next action covered by the goal?
If the next action is clear and still covered by the active goal, Codex should continue instead of asking the user to type "continue".
This is the motivation behind LLR's hook-based stop guard. It is not meant to make Codex run forever. It is meant to prevent the common pattern where Codex states the next step itself, but still stops and waits for the user to approve continuation.
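The stop check above reduces to a pure predicate over goal state (hypothetical fields; a real stop hook would read actual goal/session state): continue only when nothing forces a stop and a goal-covered next action is already known.

```python
from dataclasses import dataclass

@dataclass
class GoalState:
    objective_complete: bool = False
    blocked: bool = False
    user_asked_to_stop: bool = False
    contract_changed: bool = False
    clear_next_action: str = ""  # empty = no goal-covered next step

def should_continue(state: GoalState) -> bool:
    """Answer the five questions above; continue only if all say 'go'."""
    if state.objective_complete or state.blocked:
        return False
    if state.user_asked_to_stop or state.contract_changed:
        return False
    return bool(state.clear_next_action)
```

Note that an empty `clear_next_action` also stops the run, so the guard cannot loop forever: it only removes the "state the next step, then wait for the user to type continue" pattern.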
- Encourage exploration before validation
In research-heavy or fast-moving domains, Codex search can sometimes behave like answer validation: it searches for high-confidence concepts it already knows instead of discovering the domain from task-level keywords.
LLR's INC guidance asks Codex to derive discovery keywords from the user's wording, project vocabulary, file names, data labels, metrics, failure symptoms, tools, quality bar, and ecosystem terms before presenting expert defaults.
This could also be useful around /goal, especially when a goal depends on current practice, benchmarks, standards, libraries, or domain conventions.
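A toy sketch of that keyword-derivation step: collect deduplicated search terms from task-level signal categories (the categories mirror the INC guidance; the function and category names are illustrative) instead of starting from concepts the model already holds with high confidence.

```python
def discovery_keywords(signals: dict[str, list[str]]) -> list[str]:
    """Gather deduplicated discovery terms from task-level signals:
    user wording, project vocabulary, file names, metrics, symptoms..."""
    seen: list[str] = []
    for category in ("user_wording", "project_vocabulary", "file_names",
                     "metrics", "failure_symptoms", "ecosystem_terms"):
        for term in signals.get(category, []):
            if term not in seen:
                seen.append(term)
    return seen
```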
Expected outcome
I would love to know whether the Codex team sees these LLR patterns as:
- behaviors that could eventually belong inside /goal
- behaviors better handled by skills/plugins
- extension points that /goal could expose
- or simply a userland experiment that should remain outside core Codex
Thanks for adding persisted /goal workflows. LLR is my attempt to explore the surrounding harness that makes long-running goals easier to define, steer, review, and complete.
Additional information
No response