Preflight Checklist
What happened?
When heartbeat is enabled and the heartbeat agent turn invokes exec (directly or via a skill that shells out), the heartbeat re-fires every few seconds in a tight self-reinforcing loop instead of respecting its configured interval (e.g. 30m).
Observed: ~20 heartbeat runs in ~30 minutes. The __heartbeat__ cron job shows lastStatus: "ok" for every run, but nextRunAtMs keeps being reset to "now".
Expected behavior
Heartbeat fires once per configured interval. exec calls made within the heartbeat turn should not cause the heartbeat to re-schedule itself immediately.
Steps to reproduce
- Enable heartbeat in moltis.toml ([heartbeat] enabled = true)
- Set a heartbeat prompt or HEARTBEAT.md that instructs the agent to run diagnostic shell commands (e.g. check service status, run a collection script)
- Observe the heartbeat cron job firing repeatedly (every few seconds to minutes) instead of at the configured interval
Root cause
A feedback loop between the exec completion callback and the heartbeat wake mechanism:
1. Heartbeat fires → agent starts LLM turn
2. Agent calls exec (any shell command)
3. exec completes → ExecCompletionFn fires (crates/gateway/src/server/prepare_core/post_state.rs:622-629)
4. Callback enqueues a system event and unconditionally calls cs.wake("exec-event")
5. CronService::wake() (crates/cron/src/service.rs:227-238) sets next_run_at_ms = now on the __heartbeat__ job
6. Current heartbeat run finishes → running_at_ms cleared
7. Timer loop sees next_run_at_ms <= now and running_at_ms == None → fires heartbeat again
8. Goto step 2
The running_at_ms guard in wake() prevents firing during a run, but the next_run_at_ms = now assignment persists and takes effect the instant the run completes.
Additionally, there is no moltis.toml config option to disable exec-completion-triggered heartbeat wakes.
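The loop described above can be reproduced in a toy model. The sketch below is not the real CronService: MiniCron, JobState, tick and count_runs are hypothetical names, and it assumes the next run is computed when a run starts and that the exec callback's wake lands before the run finishes. Under those assumptions, an unconditional wake turns a 30-minute interval into one run per timer pass.

```rust
/// Hypothetical, simplified stand-in for the __heartbeat__ job state.
#[derive(Debug, Default)]
struct JobState {
    next_run_at_ms: u64,
    running_at_ms: Option<u64>,
}

/// Hypothetical miniature scheduler; not the real CronService.
struct MiniCron {
    job: JobState,
    interval_ms: u64,
}

impl MiniCron {
    fn new(interval_ms: u64) -> Self {
        Self { job: JobState::default(), interval_ms }
    }

    /// Mirrors the described wake(): always pulls the next run forward to "now".
    fn wake(&mut self, now_ms: u64) {
        self.job.next_run_at_ms = now_ms;
    }

    /// One timer-loop pass; returns true if the heartbeat fired.
    fn tick(&mut self, now_ms: u64, turn_calls_exec: bool) -> bool {
        if self.job.running_at_ms.is_some() || self.job.next_run_at_ms > now_ms {
            return false;
        }
        self.job.running_at_ms = Some(now_ms);
        // Assumed normal rescheduling: the next run is one interval away.
        self.job.next_run_at_ms = now_ms + self.interval_ms;
        if turn_calls_exec {
            // The exec completion callback wakes the scheduler mid-run.
            self.wake(now_ms);
        }
        // The run finishes: the guard is cleared, but next_run_at_ms still says "now".
        self.job.running_at_ms = None;
        true
    }
}

/// Counts heartbeat runs over `ticks` timer passes spaced `step_ms` apart.
fn count_runs(interval_ms: u64, step_ms: u64, ticks: u64, turn_calls_exec: bool) -> u64 {
    let mut cron = MiniCron::new(interval_ms);
    (0..ticks)
        .filter(|i| cron.tick(i * step_ms, turn_calls_exec))
        .count() as u64
}
```

With a 30-minute interval and a timer pass every 5 seconds, count_runs(1_800_000, 5_000, 360, false) yields a single run over 30 minutes, while passing true makes every one of the 360 passes fire.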
Did this happen during a chat session?
Yes
Chat session context (if applicable)
Custom __heartbeat__ job with sessionTarget: { named: "heartbeat" } and an agentTurn payload that runs diagnostic commands via exec.
Error messages / logs
No errors — every run succeeds with lastStatus: "ok". The issue is the excessive frequency.
Is this a regression?
No — this is a long-standing architectural gap.
Moltis version
Built from source (current main)
Component
Core / Gateway, Cron scheduler
Install method
Built from source
Operating system
Debian 12 (bookworm) / Ubuntu host
Proposed fixes
Fix 1: Session-aware wake filter (recommended)
In post_state.rs, skip cs.wake() when the exec completion originates from the heartbeat session (cron:heartbeat). The exec callback already has access to the session context — check whether the session key matches the heartbeat session and, if so, only enqueue the event without calling wake().
// In ExecCompletionFn callback (post_state.rs:622-629):
let is_heartbeat_session = event.session_key.as_deref() == Some("cron:heartbeat");
tokio::spawn(async move {
    // Always record the event; only wake the scheduler for non-heartbeat sessions.
    eq.enqueue(summary, "exec-event".into()).await;
    if !is_heartbeat_session {
        cs.wake("exec-event").await;
    }
});
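One way to keep Fix 1 easy to unit-test is to pull the decision out of the callback into a pure function. This is only a sketch: should_wake_on_exec and HEARTBEAT_SESSION_KEY are hypothetical names, and it assumes the session key is available as an Option<&str> whose heartbeat value is "cron:heartbeat", as stated above.

```rust
/// Session key this issue attributes to heartbeat runs (assumed constant).
const HEARTBEAT_SESSION_KEY: &str = "cron:heartbeat";

/// Hypothetical helper: should an exec completion trigger a heartbeat wake?
/// Returns false only when the exec ran inside the heartbeat session itself.
fn should_wake_on_exec(session_key: Option<&str>) -> bool {
    session_key != Some(HEARTBEAT_SESSION_KEY)
}
```

Exec completions from any other session, or with no session key at all, would still wake the heartbeat as they do today.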
Fix 2: Heartbeat wake cooldown (debounce)
Add a configurable minimum cooldown between heartbeat wakes. Even if wake() is called, ignore it if the heartbeat completed less than N minutes ago. This is a broader defense that protects against any source of rapid re-waking.
// In HeartbeatConfig:
pub wake_cooldown_secs: Option<u64>, // e.g. 300 = 5 minutes

// In CronService::wake():
// Skip if the last run was less than the cooldown ago
// (wake_cooldown_ms is wake_cooldown_secs converted to milliseconds)
if let Some(cooldown_ms) = wake_cooldown_ms {
    if let Some(last_run) = job.state.last_run_at_ms {
        if now.saturating_sub(last_run) < cooldown_ms {
            return; // too soon
        }
    }
}
Both fixes can coexist. Fix 1 is the targeted solution; Fix 2 is a safety net.
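The cooldown check from Fix 2 can likewise be written as a standalone predicate. The sketch below uses a hypothetical wake_allowed function and assumes timestamps are in milliseconds while the proposed wake_cooldown_secs option is in seconds. A missing cooldown or a job that has never run leaves wakes unrestricted.

```rust
/// Hypothetical helper: is a heartbeat wake allowed under the cooldown?
/// `last_run_at_ms` is None if the job has never run;
/// `wake_cooldown_secs` is None if no cooldown is configured.
fn wake_allowed(
    now_ms: u64,
    last_run_at_ms: Option<u64>,
    wake_cooldown_secs: Option<u64>,
) -> bool {
    match (last_run_at_ms, wake_cooldown_secs) {
        (Some(last), Some(secs)) => {
            // Saturating arithmetic avoids underflow if the clock moved backwards
            // and overflow on very large cooldown values.
            now_ms.saturating_sub(last) >= secs.saturating_mul(1_000)
        }
        // No previous run or no cooldown configured: do not block the wake.
        _ => true,
    }
}
```

With a 300-second cooldown, a wake 100 seconds after the last run would be ignored, while one arriving at or after the 300-second mark would go through.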
Additional context
This was previously worked around by running heartbeat-loop.sh as a no-LLM background process that writes HEARTBEAT.md, then having the heartbeat LLM only read the file (no exec). However, any heartbeat prompt that triggers exec will reintroduce the loop.