---
description: Guide for choosing the right persistent memory strategy in agentic workflows — cache-memory, repo-memory, and repo-memory with wiki. Covers deduplication, stateful baseline comparison (metrics/coverage), and stateful scanning ("alert on new X").
---
Consult this file when designing a workflow that needs to persist state across runs — deduplication, incremental processing, cross-run context, or knowledge accumulation.
⚠️ `repo-memory` does NOT mean "cache-memory". They are two distinct tools with different backends, tradeoffs, and use cases. `cache-memory` is almost always the right first choice.
| Need | Use |
|---|---|
| Skip already-processed items (deduplication) | cache-memory ✅ first choice |
| Round-robin processing across runs | cache-memory ✅ first choice |
| Store ephemeral run state, analysis notes, or intermediate results | cache-memory ✅ first choice |
| Track a numeric metric and compare current vs. baseline (runs at least every 7 days) | cache-memory ✅ first choice |
| Long-lived knowledge base visible in PRs and code reviews | repo-memory |
| Baselines that must survive cache expiry (e.g. security findings, dedup lists) | repo-memory |
| Human-readable wiki pages for knowledge accumulation | repo-memory with wiki: true |
| Persist notes/state inline on the triggering issue or PR | comment-memory |
Default to cache-memory unless you have a specific reason to use repo-memory.
Before writing new persistent files, check whether GitHub and Git already expose the state you need.
| Goal | Built-in source | Caching strategy |
|---|---|---|
| Skip stale files in docs/code scans | Git history (git log / last modified commit per file) | Cache either a single repo watermark SHA or per-file SHAs, then compare changed paths in newer commits |
| Avoid reopening known incidents | Issue/PR history (recent open + closed items by label/title prefix) | Cache only canonical identifiers (issue numbers, advisory IDs), not full issue payloads |
| Process incrementally across repo activity | PR merge history (merged_at, base branch) | Cache the last merged PR number or merge timestamp and fetch only newer merges |
| Keep nightly triage focused | Issue timeline (updated_at, comments) | Cache the last scan cursor (updated_at watermark) and only inspect newer updates |
| Reuse expensive relationship lookups | GitHub graph links (issue ↔ PR ↔ commit references) | Cache normalized link maps keyed by stable node IDs and refresh selectively |
- Prefer stable identifiers from GitHub graph data (`node_id`, issue/PR number, commit SHA) over mutable text fields.
- Persist watermarks (last seen timestamp, commit SHA, PR number) instead of full snapshots when possible.
- Use built-in history as the source of truth; use memory tools to store only incremental state needed to resume efficiently.
- If history queries are cheap and deterministic (for example, bounded to recent activity like the latest 20-100 items), recompute from GitHub/git instead of storing large derived datasets.
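The watermark approach above can be sketched in a few lines. This is an illustrative Python sketch, not gh-aw code: `select_new` and the `updated_at` item shape are assumptions standing in for whatever listing call your workflow actually makes. Same-format ISO-8601 UTC timestamps compare correctly as strings, which is what makes a plain string watermark sufficient.

```python
def select_new(items, watermark):
    """Return items newer than the stored watermark, plus the advanced watermark.

    `items` is a list of dicts with an ISO-8601 UTC `updated_at` field
    (hypothetical shape); `watermark` is the last `updated_at` seen on a
    previous run, or None on the very first run.
    """
    fresh = [i for i in items if watermark is None or i["updated_at"] > watermark]
    # Advance the watermark to the newest timestamp seen this run.
    new_watermark = max((i["updated_at"] for i in items), default=watermark)
    return fresh, new_watermark

items = [
    {"number": 1, "updated_at": "2026-05-01T09:00:00Z"},
    {"number": 2, "updated_at": "2026-05-02T09:00:00Z"},
]
fresh, wm = select_new(items, "2026-05-01T12:00:00Z")
```

On the next run, only `wm` needs to be read back from memory; the full item list is recomputed from GitHub.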
Uses the GitHub Actions cache (`actions/cache`) to persist a local filesystem directory populated by the `@modelcontextprotocol/server-memory` MCP server. The directory lives at `/tmp/gh-aw/cache-memory/`.
- Deduplication: Track which items (issues, PRs, URLs, IDs) have already been processed
- Round-robin / incremental processing: Remember where you left off across scheduled runs
- Ephemeral structured state: JSON blobs, processing queues, intermediate analysis results
- Metric baseline comparison: Store a coverage %, score, or count and compare on the next run (see Stateful Analysis / Baseline Comparison below)
- Visual regression baselines: Store screenshots between PR runs (see `visual-regression.md`)
- Tool call caching: Avoid redundant expensive API calls across runs
```yaml
tools:
  cache-memory: true
```

Advanced — custom key:

```yaml
tools:
  cache-memory:
    key: dedup-${{ github.event.schedule }}-${{ github.run_id }}
    retention-days: 30
    allowed-extensions: [".json"]
```

Multiple named caches:

```yaml
tools:
  cache-memory:
    - id: processed
      key: processed-items-${{ github.run_id }}
    - id: results
      key: results-${{ github.run_id }}
      retention-days: 14
```

- Single cache: `/tmp/gh-aw/cache-memory/`
- Multiple caches: `/tmp/gh-aw/cache-memory/{id}/`
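The read-filter-write shape of dedup state in this directory can be sketched as plain file I/O. In a real workflow the agent does this with its file tools; the Python below is only an illustration, and `filter_unprocessed` is a hypothetical helper name.

```python
import json
from pathlib import Path

# Dedup state file inside the single-cache directory (sketch).
STATE = Path("/tmp/gh-aw/cache-memory/processed.json")

def filter_unprocessed(issue_numbers):
    """Drop issue numbers already recorded in the cache, then record the rest."""
    seen = set(json.loads(STATE.read_text())) if STATE.exists() else set()
    fresh = [n for n in issue_numbers if n not in seen]
    STATE.parent.mkdir(parents=True, exist_ok=True)
    STATE.write_text(json.dumps(sorted(seen | set(fresh))))
    return fresh
```

A cache miss simply means `STATE` does not exist, so every item looks fresh: the worst case is one redundant run, not an error.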
The following pattern lets a scheduled workflow skip items it has already processed:
```markdown
---
on:
  schedule:
    - cron: "0 9 * * *"
permissions:
  issues: read
engine: copilot
tools:
  github:
    toolsets: [issues]
  cache-memory: true
safe-outputs:
  create-issue:
    title-prefix: "[daily-digest] "
    close-older-issues: true
    labels: [automation]
timeout-minutes: 15
---

Fetch the 20 most recently updated open issues.

Load `/tmp/gh-aw/cache-memory/processed.json` if it exists; it contains an array of
issue numbers already included in past digests.
Skip any issue whose number already appears in that array.

Summarize the remaining (new) issues. If there are none, use the `noop` safe output.

Before finishing, write the updated full list of processed issue numbers back to
`/tmp/gh-aw/cache-memory/processed.json` using a filesystem-safe timestamp:
`YYYY-MM-DD-HH-MM-SS` (no colons, no `T`, no `Z`).
```

## Stateful Analysis / Baseline Comparison (cache-memory)

Use cache-memory to persist a baseline metric between runs and detect regressions. This pattern is well-suited for any "compare current vs. previous" scenario — test coverage, build duration, benchmark scores, audit counts — where runs happen at least once every 7 days (the default cache retention).
### When to use this pattern
- Tracking a numeric metric (coverage %, build time, test count, score) across scheduled or PR runs
- Alerting when a metric regresses by more than an acceptable threshold
- Any "tell me when X drops by more than Y" workflow where losing the baseline for a cycle is tolerable (the next run simply re-establishes it)
### When to use repo-memory instead
If a lost baseline would cause serious side-effects — e.g. a security-finding baseline where "cache miss" floods the repo with duplicate issues — use repo-memory. See Stateful Scanning Pattern (repo-memory) below.
### Worked example: coverage delta on every PR
```markdown
---
description: Post a PR comment when test coverage drops by more than 1 percentage point
on:
  pull_request:
    types: [opened, synchronize]
permissions:
  pull-requests: read
  contents: read
engine: copilot
tools:
  github:
    toolsets: [pull_requests]
  cache-memory: true
safe-outputs:
  add-pr-comment:
    max: 1
timeout-minutes: 15
---

Run the test suite and collect the overall line-coverage percentage as a
single float (e.g. `82.5`).

Load `/tmp/gh-aw/cache-memory/coverage-baseline.json` if it exists.
The file stores: `{ "coverage": 82.5, "updated": "2026-05-01-09-00-00" }`.

**First run** (file missing): write the current coverage to the file and use
the `noop` safe output — no comment is needed yet.

**Subsequent runs** (baseline found): compute `delta = current − baseline`.

- If `delta >= −1.0` (coverage held or improved), use the `noop` safe output.
- If `delta < −1.0` (coverage fell by more than 1 pp), post an `add-pr-comment`
  that includes:
  - Baseline coverage, current coverage, and delta (e.g. "82.5% → 79.3% (−3.2 pp)")
  - Which files lost the most coverage

Regardless of the outcome, overwrite `/tmp/gh-aw/cache-memory/coverage-baseline.json`
with the current coverage and a filesystem-safe timestamp `YYYY-MM-DD-HH-MM-SS`
(no colons, no `T`, no `Z`).
```

### Key design decisions
- `cache-memory` not `repo-memory` — coverage deltas are short-lived quality gates; a cache miss just means "no comparison this run" and the baseline is silently refreshed — no false-positive flood
- First-run handling — treat a missing baseline as "no data yet": write it and skip the comparison; the second run is the first real gate
- Threshold guard — ignore sub-1 pp fluctuations to reduce noise; tune the threshold to your team's standards
- Filename safety — use `YYYY-MM-DD-HH-MM-SS` (no colons) in any timestamped filenames written to cache-memory; see "Filename safety" below
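The decision logic the prompt describes reduces to a small pure function. This is an illustrative sketch, not gh-aw machinery: `coverage_verdict` is a hypothetical name, and the two return values stand in for the `noop` and `add-pr-comment` safe outputs.

```python
def coverage_verdict(current, baseline, threshold_pp=1.0):
    """Return ("noop" | "comment", delta) following the rules above.

    `baseline` is None on the first run (no cached file yet).
    """
    if baseline is None:
        # First run: just establish the baseline, nothing to compare against.
        return "noop", None
    delta = current - baseline
    # Comment only when coverage fell by more than the threshold.
    action = "comment" if delta < -threshold_pp else "noop"
    return action, delta
```

Note the asymmetry: improvements and small dips are both silent, so the only visible output is a genuine regression.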
| ✅ Pros | ❌ Cons |
|---|---|
| Zero repository noise — no commits, no PRs | Evicted when cache expires (default 7 days; use retention-days to extend up to 90) |
| Fast: no Git operations required | Not human-readable in GitHub UI |
| Works with Copilot, Claude, and custom engines | Data loss if cache is invalidated or expires |
| Supports multiple isolated caches per workflow | Files are uploaded as GitHub Actions artifacts — no colons in filenames |
| Scoped to workflow by default | |
## Filename safety

Cache-memory files are uploaded as GitHub Actions artifacts. Artifact filenames must not contain colons (an NTFS limitation on Windows-hosted runners).
```text
# ✅ GOOD — filesystem-safe timestamp
/tmp/gh-aw/cache-memory/state-2026-02-12-11-20-45.json

# ❌ BAD — colon in timestamp breaks artifact upload
/tmp/gh-aw/cache-memory/state-2026-02-12T11:20:45Z.json
```

When instructing the agent to write timestamped files, say explicitly:

> "Use filesystem-safe timestamp format `YYYY-MM-DD-HH-MM-SS` (no colons, no `T`, no `Z`)."
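If a helper script (rather than the agent) produces these filenames, the format is a one-line `strftime`. The function name here is illustrative.

```python
from datetime import datetime, timezone

def safe_timestamp(now=None):
    """Filesystem-safe UTC timestamp: no colons, no 'T', no 'Z'."""
    now = now or datetime.now(timezone.utc)
    return now.strftime("%Y-%m-%d-%H-%M-%S")

# e.g. f"state-{safe_timestamp()}.json" -> state-2026-02-12-11-20-45.json
```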
Uses a dedicated Git branch (default: `memory/agent-notes`) to store files that persist indefinitely until explicitly deleted. The directory lives at `/tmp/gh-aw/repo-memory/`.
- The knowledge needs to survive cache expiration
- You want the memory to be visible in the repository (auditable via Git history)
- The workflow accumulates a knowledge base that grows over time (e.g., architecture notes, known issues)
- You need changes to appear in diffs and be reviewable
```yaml
tools:
  repo-memory:
    branch-name: memory/agent-notes  # Optional: custom branch name
    target-repo: owner/other-repo    # Optional: store in another repo
    allowed-extensions: [".json", ".md"]
    max-file-size: 10240             # bytes
    max-file-count: 100
```

The compiler automatically creates a separate `push_repo_memory` job with `contents: write` permission. The main agent job retains read-only permissions.
| ✅ Pros | ❌ Cons |
|---|---|
| Persists indefinitely (no expiry) | Produces Git commits — repository noise |
| Auditable: Git history shows every change | |
| Survives cache invalidation | Slower: requires Git clone + push |
| Human-readable via GitHub branch UI | Not available for Copilot engine (requires GitHub tools) |
| Can target a different repository | More complex setup |
A variant of repo-memory that stores files in the GitHub Wiki (a separate Git repository at `<repo>.wiki.git`) instead of a branch.
- You want structured, human-readable documentation pages
- The knowledge is intended for human consumption (wikis are browsable)
- You're building a living knowledge base or FAQ
```yaml
tools:
  repo-memory:
    wiki: true
    allowed-extensions: [".md"]
```

The compiler automatically creates a separate `push_repo_memory` job with `contents: write` permission. The main agent job retains read-only permissions.
Files follow GitHub Wiki Markdown conventions: use `[[Page Name]]` syntax for internal links, and name files with hyphens instead of spaces.
| ✅ Pros | ❌ Cons |
|---|---|
| Browsable in the GitHub Wiki UI | Produces Git commits to wiki repo |
| Great for human-readable knowledge bases | |
| Standard Markdown with wiki link syntax | Restricted to .md files in practice |
| Separate from main repo history | Less suitable for structured JSON state |
Uses a dedicated `<gh-aw-comment-memory>` XML block in an issue or PR comment as persistent memory. The agent edits plain markdown files under `/tmp/gh-aw/comment-memory/`; the safe-output processor syncs the changes back to the managed comment.
- Persist workflow notes or statuses visible inline on the triggering issue or PR
- State tied to the lifecycle of a specific issue or PR
- Structured running track records (status tables, checklists, summaries) the team can read without leaving the issue
Do NOT use comment-memory for high-volume ephemeral state (use cache-memory), long-lived knowledge bases (use repo-memory), or data that must survive across issues/PRs.
```yaml
tools:
  comment-memory: true  # enable with defaults
```

Advanced:

```yaml
tools:
  comment-memory:
    memory-id: status        # Optional: identifier in XML marker (default: "default")
    target: triggering       # Optional: "triggering" (default), "*", or explicit number
    target-repo: owner/other # Optional: cross-repository
    max: 1                   # Optional: max updates per run (default: 1)
    footer: false            # Optional: omit AI-generated footer (default: true)
```

- Pre-agent setup: Reads `<gh-aw-comment-memory id="<memory-id>">` from the target comment and writes the content to `/tmp/gh-aw/comment-memory/<memory_id>.md`.
- Agent: Edits the markdown file directly — no explicit safe-output tool call needed.
- Post-agent: The safe-output processor reads the edited file and upserts the managed comment, replacing only the XML-fenced block.

Multiple memory IDs are supported in a single comment; each maps to a separate `*.md` file.
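The upsert step can be pictured as a regex replace scoped to one memory id's block. This is a sketch of the idea only; the real safe-output processor's implementation may differ, and `upsert_memory_block` is a hypothetical name.

```python
import re

def upsert_memory_block(comment_body, memory_id, content):
    """Replace the <gh-aw-comment-memory> block for one id, or append it."""
    block = (
        f'<gh-aw-comment-memory id="{memory_id}">\n'
        f"{content}\n"
        f"</gh-aw-comment-memory>"
    )
    pattern = re.compile(
        rf'<gh-aw-comment-memory id="{re.escape(memory_id)}">.*?</gh-aw-comment-memory>',
        re.DOTALL,
    )
    if pattern.search(comment_body):
        # Only the matched block is rewritten; the rest of the comment is untouched.
        return pattern.sub(lambda _: block, comment_body)
    return comment_body + "\n" + block
```

Because the pattern is keyed on `id`, several memory blocks can coexist in one comment without clobbering each other.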
| ✅ Pros | ❌ Cons |
|---|---|
| Visible in GitHub UI inline on the issue/PR | Requires issues:write or pull-requests:write |
| No separate branch or cache | One comment block per memory-id per target |
| Agent edits plain markdown — no tool call needed | Not suited for large structured data |
| Tied to issue/PR lifecycle | Not available without a triggering issue or PR |
## Stateful Scanning Pattern (repo-memory)

Use `repo-memory` to persist a baseline JSON file between scheduled runs so that the workflow only alerts on new findings — vulnerability scans, dependency audits, licence checks, or any "track changes over time" scenario.
```markdown
---
description: Nightly npm vulnerability scan — alerts only on new advisories
on:
  schedule:
    - cron: "0 2 * * *"
permissions:
  issues: write
  contents: read
engine: claude
tools:
  repo-memory:
    allowed-extensions: [".json"]
network:
  allowed:
    - registry.npmjs.org
safe-outputs:
  create-issue:
    title-prefix: "[vuln] "
    labels: [security, automated]
    max: 5
timeout-minutes: 20
---

Load `/tmp/gh-aw/repo-memory/default/vuln-baseline.json`.
If missing, treat the baseline as `[]` (first run).

Run `npm audit --json`. Collect each advisory's id, severity, title, and URL.

Diff against the baseline:

- **New** (in current, not in baseline) → open a `create-issue` per finding (max 5).
- **Resolved** (in baseline, not in current) → log only.
- If no new findings, use the `noop` safe output.

Write the current advisory IDs to `/tmp/gh-aw/repo-memory/default/vuln-baseline.json` as a JSON array.
```

### Key design decisions

- `repo-memory` for baselines, not `cache-memory` — caches expire after 7 days; a lost baseline makes every known finding appear "new" on the next run, flooding the repo with duplicate issues
- First-run handling — treat a missing baseline file as `[]` and write it at the end of the first run, giving subsequent runs a clean starting point
- `max:` flood guard — caps issues opened per run; use `max: 5` for nightly scans, `max: 1` for secret alerts, `max: 10` for weekly audits
- Engine restriction — `repo-memory` requires Claude or a custom engine; it is not available for the Copilot engine
- Baseline schema — store only stable identifiers (advisory ID strings), not mutable fields like severity, to avoid false "new" alerts when metadata changes
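The diff step is a pair of set differences over advisory IDs. This sketch uses hypothetical GHSA-style IDs for illustration; note it compares identifiers only, matching the "stable identifiers" design decision above.

```python
def diff_findings(current_ids, baseline_ids):
    """Split advisory IDs into (new, resolved), following the rules above."""
    current, baseline = set(current_ids), set(baseline_ids)
    new = sorted(current - baseline)       # alert on these
    resolved = sorted(baseline - current)  # log only
    return new, resolved

new, resolved = diff_findings(
    ["GHSA-aaaa", "GHSA-bbbb"],   # from this run's npm audit
    ["GHSA-bbbb", "GHSA-cccc"],   # from the repo-memory baseline
)
```

Findings present in both sets are silently ignored, which is exactly why losing the baseline is costly: everything degrades to "new".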
| Feature | cache-memory | repo-memory | repo-memory + wiki | comment-memory |
|---|---|---|---|---|
| First choice | ✅ Yes | No | No | No |
| Storage backend | GitHub Actions cache | Git branch | GitHub Wiki | Issue/PR comment |
| Persistence | Up to 90 days | Indefinite | Indefinite | Issue/PR lifetime |
| Compiler adds `contents: write` | No | Yes (push job) | Yes (push job) | No |
| Repository noise | None | Git commits | Wiki commits | Comment updates |
| Human-readable in GitHub | No | Via branch UI | Via Wiki UI | ✅ Inline on issue/PR |
| Structured data (JSON) | ✅ Ideal | Possible | Not recommended | Not recommended |
| Filename restrictions | No colons in names | None | Hyphens for spaces | None |
| Engine compatibility | Copilot, Claude, custom | Claude, custom | Claude, custom | Claude, custom |
- ❌ Do not invent `repo-memory` as a synonym for `cache-memory` — they are different tools
- ❌ Do not use `repo-memory` for ephemeral per-run state — use `cache-memory`
- ❌ Do not use `cache-memory` when you need indefinite persistence — use `repo-memory`
- ❌ Do not include colons in cache-memory filenames — artifact upload will fail