What variant of Codex are you using?
Codex CLI with local skills and hooks enabled.
What feature would you like to see?
I would like /goal to grow an optional long-run harness layer inspired by a userland Codex skill I have been building: Long Long Run (LLR).
LLR repo:
https://github.com/huahuadeliaoliao/long-long-run
Origin story / design notes:
https://github.com/huahuadeliaoliao/long-long-run/blob/main/docs/why-long-long-run.md
Codex 0.128.0's persisted /goal workflows are a strong foundation for long-running work. In practice, though, remembering that a goal exists is only part of the problem. The harder parts are:
- helping the user clarify the real goal before execution
- reducing intent noise when the user does not know the domain well enough to define the right target
- turning vague intent into a concrete, evidence-backed contract
- tracking which evidence still supports the current plan
- tolerating side questions or late constraints without losing the mainline
- avoiding premature stops when Codex already knows the next goal-covered action
- encouraging real domain exploration before simply validating an answer Codex already expects
LLR experiments with this through two modes:
- INC mode: Intent Noise Cancellation. Codex explores the repo/domain, surfaces assumptions, identifies risks, discovers expert framing, proposes hard acceptance criteria, and builds an evidence-backed contract before implementation.
- ACTIVE mode: the user explicitly authorizes Codex to pursue the confirmed contract as the mainline. Codex should continue through clear contract-covered next steps instead of stopping after every useful local update.
The feature I would like to see is not necessarily LLR copied into Codex directly. Rather, I would like /goal to support similar long-run harness semantics, either as built-in behavior or as extension points for skills/plugins.
A possible model:
- Add a calibration phase before execution
Before a goal becomes an active execution target, Codex could have an explicit calibration mode similar to LLR's INC mode.
In this phase, Codex would:
- explore the repo, task context, and domain
- clarify the user's real objective
- infer hidden requirements
- surface assumptions and risks
- discover current expert framing when the domain matters
- propose hard acceptance criteria
- ask the user to confirm the contract before execution
This is useful because users often cannot describe the correct goal until Codex has helped explore the space.
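The calibration phase above could be sketched as a small goal lifecycle: a goal starts in calibration and only becomes an active execution target once the user confirms the contract. The type and field names below are illustrative, not an actual /goal API.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class GoalPhase(Enum):
    CALIBRATION = auto()  # INC-style exploration and contract building
    ACTIVE = auto()       # user-authorized pursuit of the confirmed contract
    DONE = auto()

@dataclass
class GoalContract:
    objective: str
    assumptions: list[str] = field(default_factory=list)
    risks: list[str] = field(default_factory=list)
    acceptance_criteria: list[str] = field(default_factory=list)
    confirmed_by_user: bool = False

def try_activate(phase: GoalPhase, contract: GoalContract) -> GoalPhase:
    """Leave calibration only once the user has confirmed the contract."""
    if phase is GoalPhase.CALIBRATION and contract.confirmed_by_user:
        return GoalPhase.ACTIVE
    return phase
```

The point of the gate is that execution never starts from an unconfirmed contract, which is the INC-mode guarantee described above.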
- Track current effective evidence
A goal could maintain a lightweight record of current evidence, not just checkpoints.
Possible fields could include:
- user signals
- verified facts
- assumptions
- risks
- open decisions
- next action
- completion signal
The important distinction is:
- checkpoint = what happened
- evidence chain = what still matters and why it should guide the next action
LLR's evidence-chain design has been useful because long-running tasks often produce many artifacts, but later review needs to know which facts still support the mainline and which earlier assumptions were overturned.
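The evidence record with the fields listed above could look like the following sketch (hypothetical shape; field names are illustrative). The key operation is overturning an assumption: the disproven assumption is dropped so later review sees only facts that still support the mainline.

```python
from dataclasses import dataclass, field

@dataclass
class Evidence:
    user_signals: list[str] = field(default_factory=list)
    verified_facts: list[str] = field(default_factory=list)
    assumptions: list[str] = field(default_factory=list)
    risks: list[str] = field(default_factory=list)
    open_decisions: list[str] = field(default_factory=list)
    next_action: str = ""
    completion_signal: str = ""

    def overturn_assumption(self, assumption: str, fact: str) -> None:
        """Replace an overturned assumption with the verified fact,
        so the record reflects what still matters, not what happened."""
        if assumption in self.assumptions:
            self.assumptions.remove(assumption)
        self.verified_facts.append(fact)
```

Checkpoints would stay append-only history; this record is mutable by design, which is exactly the checkpoint-vs-evidence-chain distinction above.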
- Add side-thread tolerance
During an active goal, users often remember missing constraints or ask urgent side questions.
A goal workflow could help Codex distinguish whether the latest user message is:
- a side question
- a missing constraint
- a blocker
- a contract change
- a stop request
- a return-to-calibration request
For normal side questions, Codex should answer the user first, then resume the active goal if the main contract is unchanged.
This is one of the most useful parts of LLR in practice: the mainline keeps a compass even when the conversation temporarily walks onto a side path.
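The classification-and-routing idea above can be sketched as a small dispatcher (message kinds mirror the list above; the routing strings are illustrative, not an actual hook protocol):

```python
from enum import Enum, auto

class MessageKind(Enum):
    SIDE_QUESTION = auto()
    MISSING_CONSTRAINT = auto()
    BLOCKER = auto()
    CONTRACT_CHANGE = auto()
    STOP_REQUEST = auto()
    RECALIBRATE = auto()

def route(kind: MessageKind) -> str:
    """Answer-then-resume for side traffic; interrupt the mainline
    only when the contract itself is affected."""
    if kind is MessageKind.SIDE_QUESTION:
        return "answer, then resume active goal"
    if kind is MessageKind.MISSING_CONSTRAINT:
        return "fold constraint into contract, then resume"
    if kind in (MessageKind.CONTRACT_CHANGE, MessageKind.RECALIBRATE):
        return "return to calibration"
    if kind is MessageKind.STOP_REQUEST:
        return "stop"
    return "resolve blocker before resuming"
```

Only contract changes and recalibration requests abandon the mainline; everything else preserves it.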
- Add a premature-stop check
If Codex is about to stop while a goal is active, it could re-evaluate:
- Is the objective complete?
- Is the work blocked?
- Did the user ask to stop?
- Has the contract changed enough to require recalibration?
- Is there still a clear next action covered by the goal?
If the next action is clear and still covered by the active goal, Codex should continue instead of asking the user to type "continue".
This is the motivation behind LLR's hook-based stop guard. It is not meant to make Codex run forever. It is meant to prevent the common pattern where Codex states the next step itself, but still stops and waits for the user to approve continuation.
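The stop check above reduces to a pure predicate over goal state (hypothetical fields; a real stop hook would read actual goal/session state): continue only when nothing forces a stop and a goal-covered next action is already known.

```python
from dataclasses import dataclass

@dataclass
class GoalState:
    objective_complete: bool = False
    blocked: bool = False
    user_asked_to_stop: bool = False
    contract_changed: bool = False
    clear_next_action: str = ""  # empty = no goal-covered next step

def should_continue(state: GoalState) -> bool:
    """Answer the five questions above; continue only if all say 'go'."""
    if state.objective_complete or state.blocked:
        return False
    if state.user_asked_to_stop or state.contract_changed:
        return False
    return bool(state.clear_next_action)
```

Note that an empty `clear_next_action` also stops the run, so the guard cannot loop forever: it only removes the "state the next step, then wait for the user to type continue" pattern.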
- Encourage exploration before validation
In research-heavy or fast-moving domains, Codex search can sometimes behave like answer validation: it searches for high-confidence concepts it already knows instead of discovering the domain from task-level keywords.
LLR's INC guidance asks Codex to derive discovery keywords from the user's wording, project vocabulary, file names, data labels, metrics, failure symptoms, tools, quality bar, and ecosystem terms before presenting expert defaults.
This could also be useful around /goal, especially when a goal depends on current practice, benchmarks, standards, libraries, or domain conventions.
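A toy sketch of that keyword-derivation step: collect deduplicated search terms from task-level signal categories (the categories mirror the INC guidance; the function and category names are illustrative) instead of starting from concepts the model already holds with high confidence.

```python
def discovery_keywords(signals: dict[str, list[str]]) -> list[str]:
    """Gather deduplicated discovery terms from task-level signals:
    user wording, project vocabulary, file names, metrics, symptoms..."""
    seen: list[str] = []
    for category in ("user_wording", "project_vocabulary", "file_names",
                     "metrics", "failure_symptoms", "ecosystem_terms"):
        for term in signals.get(category, []):
            if term not in seen:
                seen.append(term)
    return seen
```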
Expected outcome
I would love to know whether the Codex team sees these LLR patterns as:
- behaviors that could eventually belong inside /goal
- behaviors better handled by skills/plugins
- extension points that /goal could expose
- or simply a userland experiment that should remain outside core Codex
Thanks for adding persisted /goal workflows. LLR is my attempt to explore the surrounding harness that makes long-running goals easier to define, steer, review, and complete.
Additional information
No response