bug: make check in formula steps exceeds default Bash timeout; override instructions unreliable #509

@rileywhite

Gas City version

0.13.5 (commit: 98a429e, built: 2026-04-08T22:44:10Z)

Environment

Ubuntu, bash, tmux provider. Claude Code agent runtime (Claude Opus / Sonnet).

Reproduction

  1. Create a formula with a step that instructs the agent to run make check:
[[steps]]
id = "quality-gates"
title = "Run quality gates"
description = """
Run make check with an explicit 10-minute timeout:
```
Bash(command="make check 2>&1", timeout=600000)
```
"""
  2. Dispatch to a pool agent via gc sling.

  3. Observe the agent's behavior when it reaches the quality gate step. Two failure modes:

    Mode A (default timeout): The agent runs make check in a normal Bash call without setting timeout=600000. The default 2-minute timeout kills the process with SIGTERM (exit code 143, i.e. 128 + 15). The agent retries, hits the same timeout, and crash-loops.

    Mode B (commented out): The agent echoes the tool invocation syntax as a bash comment (# Bash(command="make check 2>&1", timeout=600000)) rather than translating it into an actual tool call with the timeout parameter.

  4. make check for this repo runs go test ./..., which takes 3+ minutes and so reliably exceeds the 2-minute default.
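
To see the Mode A mechanics in isolation, here is a minimal Python sketch of a runner that sends SIGTERM to a child once it outlives its time limit. It is not Gas City or Claude Code code: `run_with_timeout` is a hypothetical helper, the sleeping child stands in for a slow make check, and the 0.5-second limit is chosen only to keep the demo short. It assumes a POSIX system such as the Ubuntu environment above.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s):
    """Run argv; if it outlives timeout_s, send SIGTERM and report how it ended."""
    proc = subprocess.Popen(argv)
    try:
        proc.wait(timeout=timeout_s)
        timed_out = False
    except subprocess.TimeoutExpired:
        proc.terminate()  # sends SIGTERM on POSIX
        proc.wait()       # reap the child so returncode is populated
        timed_out = True
    return timed_out, proc.returncode


# Stand-in for a slow "make check": a child that sleeps well past the limit.
slow_command = [sys.executable, "-c", "import time; time.sleep(30)"]
timed_out, code = run_with_timeout(slow_command, timeout_s=0.5)

# Python reports death by signal N as -N; a shell reports it as 128 + N.
shell_style = (128 + abs(code)) if code < 0 else code
print(timed_out, code, shell_style)  # on Linux: True -15 143
```

Retrying the same command under the same limit ends the same way every time, which is the crash-loop described in Mode A.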

Expected behavior

Formula step descriptions should be able to reliably instruct agents to run long-running commands with appropriate timeouts. make check is the standard quality gate per CONTRIBUTING.md and should be runnable from formula-driven workflows.

Actual behavior

There is no reliable way to override the Bash tool timeout from within a formula step description. The Bash(command=..., timeout=...) syntax in step descriptions is either:

  • Ignored (agent uses default 2-minute timeout)
  • Treated as literal text / commented out

This makes it impossible to create formula-driven workflows that follow the project's own contributing guidelines requiring make check before push.

Logs, screenshots, or traces

From polecat session logs, the agent's interpretation of the formula instruction:

# 1. Run quality gates — MUST use timeout: 600000 or SIGTERM will crash you
# Bash(command="make check 2>&1", timeout=600000)
# Bash(command="make check-docs 2>&1", timeout=600000)   # if docs changed

The tool invocation syntax is rendered as bash comments, not executed as tool calls.
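
One way to spot Mode B in session logs is to scan for lines where the Bash(...) syntax appears only inside a shell comment. The sketch below is illustrative: the regular expression and the `commented_tool_calls` helper are assumptions, not part of Gas City, and the sample log is abbreviated from the excerpt above.

```python
import re

# A shell comment whose body is tool-invocation syntax, for example:
#   # Bash(command="make check 2>&1", timeout=600000)
COMMENTED_TOOL_CALL = re.compile(r'^\s*#\s*Bash\(command=.*\)')


def commented_tool_calls(log_text):
    """Return the log lines where a Bash(...) invocation appears only as a comment."""
    return [line for line in log_text.splitlines() if COMMENTED_TOOL_CALL.match(line)]


sample_log = '''# 1. Run quality gates
# Bash(command="make check 2>&1", timeout=600000)
# Bash(command="make check-docs 2>&1", timeout=600000)   # if docs changed
'''

hits = commented_tool_calls(sample_log)
for line in hits:
    print("commented-out tool call:", line)
```

A check like this only detects the failure after the fact. It does not make the agent issue the real tool call.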

Additional context

The formula (mol-polecat-work) included increasingly emphatic warnings about the timeout:

  • "CRITICAL — Bash timeout"
  • "NEVER use default timeout"
  • "you will crash-loop"
  • Bold text, tables, repeated instructions

None of these reliably caused the agent to use the correct timeout. The fundamental issue is that formula step descriptions are natural language processed by an LLM, so there is no guarantee that a tool parameter such as timeout is carried through to the actual tool call.

Possible directions:

  • A formula-level mechanism to set default Bash timeout for all commands in a step (e.g., [step.tool_defaults] in the formula schema)
  • A project-level config that raises the default Bash timeout for agents working in this repo
  • Making make check faster (under 2 minutes) so the default timeout suffices
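
The first two directions amount to resolving the timeout in configuration rather than in prose. A minimal sketch of that precedence follows, assuming a hypothetical `tool_defaults` table on a step and a hypothetical project-level `bash_timeout_ms` key. Neither name exists in the current formula schema, and `effective_timeout_ms` is illustrative only.

```python
DEFAULT_BASH_TIMEOUT_MS = 120_000  # the 2-minute default described in this report


def effective_timeout_ms(step, project_config):
    """Pick the Bash timeout: step-level default, then project-level, then built-in."""
    # Hypothetical keys: "tool_defaults" and "bash_timeout_ms" are assumptions.
    step_value = step.get("tool_defaults", {}).get("bash_timeout_ms")
    if step_value is not None:
        return step_value
    project_value = project_config.get("bash_timeout_ms")
    if project_value is not None:
        return project_value
    return DEFAULT_BASH_TIMEOUT_MS


quality_gates = {"id": "quality-gates", "tool_defaults": {"bash_timeout_ms": 600_000}}
plain_step = {"id": "some-other-step"}

print(effective_timeout_ms(quality_gates, {}))                      # step-level value wins
print(effective_timeout_ms(plain_step, {"bash_timeout_ms": 300_000}))  # project-level fallback
print(effective_timeout_ms(plain_step, {}))                         # built-in default
```

With a lookup like this, the runtime rather than the LLM would supply the timeout, so the step description would no longer need to carry it.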

Labels

  • kind/bug (Broken behavior)
  • priority/p2 (Medium: real problem, workaround exists)
  • status/accepted (Confirmed and on our radar)
  • status/in-progress (Actively being worked)
