References

Compiled from the ## References sections across the docs, plus the README and ongoing research backlog entries.

Projects & Inspiration

Pi — The best open-source harness, huge source of inspiration. (README)
oh-my-pi — An awesome customization of Pi with so many goodies and tricks. (README)

Edit Formats & Benchmarks

The Harness Problem — Can Bölük, 2026. Benchmark of edit formats across 16 models showing hashline outperforms str_replace and patch. (edit)
oh-my-pi react-edit-benchmark — Benchmark code and per-run reports. (edit)
Diff-XYZ benchmark — JetBrains. No single edit format dominates across models and use cases. (edit)
EDIT-Bench — Only one model achieves over 60% pass@1 on realistic editing tasks. (edit)
Aider benchmarks — Format choice swung GPT-4 Turbo from 26% to 59%. (edit)
Cursor Instant Apply — Fine-tuned 70B model for edit application; full rewrite outperforms diffs for files under 400 lines. (edit)

OTP & Erlang

Erlang gen_statem — OTP state machine behaviour used by Opal.Agent. (agent-loop)
Elixir GenServer — Messaging model still used by sibling subsystems and APIs around the loop. (agent-loop)
Erlang/OTP Supervisor Principles — Supervision strategy used by session-local processes and tool tasks. (agent-loop)
Erlang Distribution Protocol — Official docs covering node naming, cookies, and EPMD. (erlang)
Erlang Distribution Security Guide — How to enable TLS for inter-node traffic. (erlang)

LLM Providers & Models

LLMDB — Model database powering auto-discovery of models, context windows, and capabilities. (providers)

Reasoning & Thinking

OpenAI Reasoning Guide — Official docs for reasoning.effort and reasoning.summary parameters on the Responses API. (reasoning)
Anthropic Extended Thinking — Official docs for budget-based and adaptive thinking modes, including output_config.effort levels. (reasoning)
opencode#6864 — Confirms the Copilot proxy does not return reasoning_content for Claude models. Other tools experience the same limitation. (reasoning)

User Interaction & Planning

Handle approvals and user input — Anthropic, 2025. Claude Agent SDK documentation for surfacing approval requests and clarifying questions. Informed the ask_user tool design and the planning approach. (user-input, planning)

Schema & Validation

Zod JSON Schema — Zod 4's built-in z.fromJSONSchema() for deriving Zod schemas (with full type inference) from JSON Schema at runtime. (rpc)

Auth & Standards

RFC 8628 — OAuth 2.0 Device Authorization Grant — GitHub device-code OAuth flow used by Opal. (installing)

Context Files & Agent Instructions

Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents? — Gloaguen et al., 2026. Finds that AGENTS.md context files tend to reduce task success rates while increasing inference cost by 20%+; recommends minimal requirements only. (arxiv)

CI & Distribution

GitHub Actions: macOS 13 runner image is closing down — macos-13 retired Dec 2025. Use macos-15-intel for x86_64 builds. Intel macOS support ends Fall 2027 when macos-15 image retires. (GitHub changelog)

TODO

Papers and resources to review and potentially integrate:

LCM: Lossless Context Management — "We introduce Lossless Context Management (LCM), a deterministic architecture for LLM memory that outperforms Claude Code on long-context tasks. When benchmarked using Opus 4.6, our LCM-augmented coding agent, Volt, achieves higher scores than Claude Code on the OOLONG long-context eval, including at every context length between 32K and 1M tokens."
