---
description: Guide for choosing the right persistent memory strategy in agentic workflows — cache-memory, repo-memory, and repo-memory with wiki. Covers deduplication, stateful baseline comparison (metrics/coverage), and stateful scanning ("alert on new X").
---

# Persistent Memory in Agentic Workflows

Consult this file when designing a workflow that needs to persist state across runs — deduplication, incremental processing, cross-run context, or knowledge accumulation.

⚠️ repo-memory does NOT mean "cache-memory". They are two distinct tools with different backends, tradeoffs, and use cases. cache-memory is almost always the right first choice.


## Quick Decision Guide

| Need | Use |
| --- | --- |
| Skip already-processed items (deduplication) | cache-memory ✅ first choice |
| Round-robin processing across runs | cache-memory ✅ first choice |
| Store ephemeral run state, analysis notes, or intermediate results | cache-memory ✅ first choice |
| Track a numeric metric and compare current vs. baseline (runs at least every 7 days) | cache-memory ✅ first choice |
| Long-lived knowledge base visible in PRs and code reviews | repo-memory |
| Baselines that must survive cache expiry (e.g. security findings, dedup lists) | repo-memory |
| Human-readable wiki pages for knowledge accumulation | repo-memory with wiki: true |
| Persist notes/state inline on the triggering issue or PR | comment-memory |

Default to cache-memory unless you have a specific reason to use repo-memory.


## Built-in Memory via GitHub Graph and Git History

Before writing new persistent files, check whether GitHub and Git already expose the state you need.

### Practical strategies

| Goal | Built-in source | Caching strategy |
| --- | --- | --- |
| Skip stale files in docs/code scans | Git history (git log / last modified commit per file) | Cache either a single repo watermark SHA or per-file SHAs, then compare changed paths in newer commits |
| Avoid reopening known incidents | Issue/PR history (recent open + closed items by label/title prefix) | Cache only canonical identifiers (issue numbers, advisory IDs), not full issue payloads |
| Process incrementally across repo activity | PR merge history (merged_at, base branch) | Cache the last merged PR number or merge timestamp and fetch only newer merges |
| Keep nightly triage focused | Issue timeline (updated_at, comments) | Cache the last scan cursor (updated_at watermark) and only inspect newer updates |
| Reuse expensive relationship lookups | GitHub graph links (issue ↔ PR ↔ commit references) | Cache normalized link maps keyed by stable node IDs and refresh selectively |

### Design guidance

- Prefer stable identifiers from GitHub graph data (node_id, issue/PR number, commit SHA) over mutable text fields.
- Persist watermarks (last seen timestamp, commit SHA, PR number) instead of full snapshots when possible.
- Use built-in history as the source of truth; use memory tools to store only incremental state needed to resume efficiently.
- If history queries are cheap and deterministic (for example, bounded to recent activity like the latest 20-100 items), recompute from GitHub/git instead of storing large derived datasets.
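As a concrete sketch of the watermark idea (the `watermark.json` file name and helper names here are illustrative, not part of gh-aw), a "last merged PR" watermark can be loaded, compared, and saved like this:

```python
import json
from pathlib import Path

# Hypothetical watermark file inside the cache-memory directory.
STATE = Path("/tmp/gh-aw/cache-memory/watermark.json")

def load_watermark(path: Path = STATE) -> int:
    """Return the last processed PR number, or 0 on a cold cache."""
    try:
        return json.loads(path.read_text())["last_pr"]
    except (FileNotFoundError, KeyError, json.JSONDecodeError):
        return 0

def select_new(merged_pr_numbers: list[int], watermark: int) -> list[int]:
    """Keep only PRs newer than the watermark, oldest first."""
    return sorted(n for n in merged_pr_numbers if n > watermark)

def save_watermark(last_pr: int, path: Path = STATE) -> None:
    """Persist the new high-water mark for the next run."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"last_pr": last_pr}))
```

A cold cache simply behaves like a first run: the watermark defaults to 0 and every recent PR counts as new.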

## cache-memory — First Choice

Uses GitHub Actions cache (actions/cache) to persist a local filesystem directory populated by the @modelcontextprotocol/server-memory MCP server. The directory lives at /tmp/gh-aw/cache-memory/.

### When to use

- Deduplication: Track which items (issues, PRs, URLs, IDs) have already been processed
- Round-robin / incremental processing: Remember where you left off across scheduled runs
- Ephemeral structured state: JSON blobs, processing queues, intermediate analysis results
- Metric baseline comparison: Store a coverage %, score, or count and compare on the next run (see Stateful Analysis / Baseline Comparison below)
- Visual regression baselines: Store screenshots between PR runs (see visual-regression.md)
- Tool call caching: Avoid redundant expensive API calls across runs

### Configuration

```yaml
tools:
  cache-memory: true
```

Advanced — custom key:

```yaml
tools:
  cache-memory:
    key: dedup-${{ github.event.schedule }}-${{ github.run_id }}
    retention-days: 30
    allowed-extensions: [".json"]
```

Multiple named caches:

```yaml
tools:
  cache-memory:
    - id: processed
      key: processed-items-${{ github.run_id }}
    - id: results
      key: results-${{ github.run_id }}
      retention-days: 14
```

### Storage path

- Single cache: `/tmp/gh-aw/cache-memory/`
- Multiple caches: `/tmp/gh-aw/cache-memory/{id}/`

### Deduplication example (scheduled workflow)

The following pattern lets a scheduled workflow skip items it has already processed:

```markdown
---
on:
  schedule:
    - cron: "0 9 * * *"
permissions:
  issues: read
engine: copilot
tools:
  github:
    toolsets: [issues]
  cache-memory: true
safe-outputs:
  create-issue:
    title-prefix: "[daily-digest] "
    close-older-issues: true
    labels: [automation]
timeout-minutes: 15
---

Fetch the 20 most recently updated open issues.

Load `/tmp/gh-aw/cache-memory/processed.json` if it exists; it contains an array of
issue numbers already included in past digests.

Skip any issue whose number already appears in that array.

Summarize the remaining (new) issues. If there are none, use the `noop` safe output.

Before finishing, write the updated full list of processed issue numbers back to
`/tmp/gh-aw/cache-memory/processed.json`. If you record when the digest ran, use a
filesystem-safe timestamp: `YYYY-MM-DD-HH-MM-SS` (no colons, no `T`, no `Z`).
```
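The skip-and-update logic that prompt describes can be sketched in Python (the helper names are illustrative; the `processed.json` path matches the workflow above):

```python
import json
from pathlib import Path

PROCESSED = Path("/tmp/gh-aw/cache-memory/processed.json")

def load_processed(path: Path = PROCESSED) -> set[int]:
    """Issue numbers already included in past digests (empty on a cold cache)."""
    try:
        return set(json.loads(path.read_text()))
    except (FileNotFoundError, json.JSONDecodeError):
        return set()

def split_new(issue_numbers: list[int], seen: set[int]) -> list[int]:
    """Issues not yet covered by a digest, in the order they were fetched."""
    return [n for n in issue_numbers if n not in seen]

def save_processed(seen: set[int], new: list[int], path: Path = PROCESSED) -> None:
    """Write back the union of old and newly digested issue numbers."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(sorted(seen | set(new))))
```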

### Stateful Analysis / Baseline Comparison

Use cache-memory to persist a baseline metric between runs and detect regressions. This pattern is well-suited for any "compare current vs. previous" scenario — test coverage, build duration, benchmark scores, audit counts — where runs happen at least once every 7 days (the default cache retention).

#### When to use this pattern

- Tracking a numeric metric (coverage %, build time, test count, score) across scheduled or PR runs
- Alerting when a metric regresses by more than an acceptable threshold
- Any "tell me when X drops by more than Y" workflow where losing the baseline for a cycle is tolerable (the next run simply re-establishes it)

#### When to use repo-memory instead

If a lost baseline would cause serious side-effects — e.g. a security-finding baseline where "cache miss" floods the repo with duplicate issues — use repo-memory. See Stateful Scanning Pattern (repo-memory) below.

#### Worked example: coverage delta on every PR

```markdown
---
description: Post a PR comment when test coverage drops by more than 1 percentage point
on:
  pull_request:
    types: [opened, synchronize]
permissions:
  pull-requests: read
  contents: read
engine: copilot
tools:
  github:
    toolsets: [pull_requests]
  cache-memory: true
safe-outputs:
  add-pr-comment:
    max: 1
timeout-minutes: 15
---

Run the test suite and collect the overall line-coverage percentage as a
single float (e.g. `82.5`).

Load `/tmp/gh-aw/cache-memory/coverage-baseline.json` if it exists.
The file stores: `{ "coverage": 82.5, "updated": "2026-05-01-09-00-00" }`.

**First run** (file missing): write the current coverage to the file and use
the `noop` safe output — no comment is needed yet.

**Subsequent runs** (baseline found): compute `delta = current − baseline`.

- If `delta >= −1.0` (coverage held or improved), use the `noop` safe output.
- If `delta < −1.0` (coverage fell by more than 1 pp), post an `add-pr-comment`
  that includes:
  - Baseline coverage, current coverage, and delta (e.g. "82.5% → 79.3% (−3.2 pp)")
  - Which files lost the most coverage

Regardless of the outcome, overwrite `/tmp/gh-aw/cache-memory/coverage-baseline.json`
with the current coverage and a filesystem-safe timestamp `YYYY-MM-DD-HH-MM-SS`
(no colons, no `T`, no `Z`).
```

#### Key design decisions

- cache-memory, not repo-memory — coverage deltas are short-lived quality gates; a cache miss just means "no comparison this run" and the baseline is silently refreshed — no false-positive flood
- First-run handling — treat a missing baseline as "no data yet": write it and skip the comparison; the second run is the first real gate
- Threshold guard — ignore sub-1 pp fluctuations to reduce noise; tune the threshold to your team's standards
- Filename safety — use YYYY-MM-DD-HH-MM-SS (no colons) in any timestamped filenames written to cache-memory; see Filename safety below

### Tradeoffs

| ✅ Pros | ❌ Cons |
| --- | --- |
| Zero repository noise — no commits, no PRs | Evicted when cache expires (default 7 days; use retention-days to extend up to 90) |
| Fast: no Git operations required | Not human-readable in GitHub UI |
| Works with Copilot, Claude, and custom engines | Data loss if cache is invalidated or expires |
| Supports multiple isolated caches per workflow | Files are uploaded as GitHub Actions artifacts — no colons in filenames |
| Scoped to workflow by default | |

### Filename safety

Cache-memory files are uploaded as GitHub Actions artifacts. Artifact filenames must not contain colons (NTFS limitation on Windows-hosted runners).

```text
# ✅ GOOD — filesystem-safe timestamp
/tmp/gh-aw/cache-memory/state-2026-02-12-11-20-45.json

# ❌ BAD — colon in timestamp breaks artifact upload
/tmp/gh-aw/cache-memory/state-2026-02-12T11:20:45Z.json
```

When instructing the agent to write timestamped files, say explicitly:

"Use filesystem-safe timestamp format `YYYY-MM-DD-HH-MM-SS` (no colons, no `T`, no `Z`)."


## repo-memory — Long-lived Repository Knowledge

Uses a dedicated Git branch (default: memory/agent-notes) to store files that persist indefinitely until explicitly deleted. The directory lives at /tmp/gh-aw/repo-memory/.

### When to use

- The knowledge needs to survive cache expiration
- You want the memory to be visible in the repository (auditable via Git history)
- The workflow accumulates a knowledge base that grows over time (e.g., architecture notes, known issues)
- You need changes to appear in diffs and be reviewable

### Configuration

```yaml
tools:
  repo-memory:
    branch-name: memory/agent-notes   # Optional: custom branch name
    target-repo: owner/other-repo     # Optional: store in another repo
    allowed-extensions: [".json", ".md"]
    max-file-size: 10240              # bytes
    max-file-count: 100
```

The compiler automatically creates a separate push_repo_memory job with contents: write permission. The main agent job retains read-only permissions.

### Tradeoffs

| ✅ Pros | ❌ Cons |
| --- | --- |
| Persists indefinitely (no expiry) | Produces Git commits — repository noise |
| Auditable: Git history shows every change | Slower: requires Git clone + push |
| Survives cache invalidation | Not available for Copilot engine (requires GitHub tools) |
| Human-readable via GitHub branch UI | More complex setup |
| Can target a different repository | |

## repo-memory with wiki: true — GitHub Wiki Backend

A variant of repo-memory that stores files in the GitHub Wiki (a separate Git repository at <repo>.wiki.git) instead of a branch.

### When to use

- You want structured, human-readable documentation pages
- The knowledge is intended for human consumption (wikis are browsable)
- You're building a living knowledge base or FAQ

### Configuration

```yaml
tools:
  repo-memory:
    wiki: true
    allowed-extensions: [".md"]
```

The compiler automatically creates a separate push_repo_memory job with contents: write permission. The main agent job retains read-only permissions.

Files follow GitHub Wiki Markdown conventions: use `[[Page Name]]` syntax for internal links, and name files with hyphens instead of spaces.
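For example, a hypothetical wiki page file might look like this (the page names are illustrative):

```markdown
<!-- File: Known-Issues.md (hyphens, not spaces, in the filename) -->
# Known Issues

- Flaky integration test on Windows runners (see [[Architecture-Notes]])
- Back to [[Home]]
```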

### Tradeoffs

| ✅ Pros | ❌ Cons |
| --- | --- |
| Browsable in the GitHub Wiki UI | Produces Git commits to wiki repo |
| Great for human-readable knowledge bases | Restricted to .md files in practice |
| Standard Markdown with wiki link syntax | Less suitable for structured JSON state |
| Separate from main repo history | |

## comment-memory — Managed Comment Persistence

Uses a dedicated <gh-aw-comment-memory> XML block in an issue or PR comment as persistent memory. The agent edits plain markdown files under /tmp/gh-aw/comment-memory/; the safe-output processor syncs the changes back to the managed comment.

### When to use

- Persist workflow notes or statuses visible inline on the triggering issue or PR
- State tied to the lifecycle of a specific issue or PR
- Running records (status tables, checklists, summaries) the team can read without leaving the issue

Do NOT use comment-memory for high-volume ephemeral state (use cache-memory), long-lived knowledge bases (use repo-memory), or data that must survive across issues/PRs.

### Configuration

```yaml
tools:
  comment-memory: true   # enable with defaults
```

Advanced:

```yaml
tools:
  comment-memory:
    memory-id: status          # Optional: identifier in XML marker (default: "default")
    target: triggering         # Optional: "triggering" (default), "*", or explicit number
    target-repo: owner/other   # Optional: cross-repository
    max: 1                     # Optional: max updates per run (default: 1)
    footer: false              # Optional: omit AI-generated footer (default: true)
```

### How it works

1. Pre-agent setup: Reads `<gh-aw-comment-memory id="<memory-id>">` from the target comment and writes content to `/tmp/gh-aw/comment-memory/<memory_id>.md`.
2. Agent: Edits the markdown file directly — no explicit safe-output tool call needed.
3. Post-agent: The safe-output processor reads the edited file and upserts the managed comment, replacing only the XML-fenced block.

Multiple memory IDs are supported in a single comment; each maps to a separate *.md file.
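Inside the comment, the managed block looks roughly like this (the `id` attribute matches `memory-id`; the table content is purely illustrative):

```html
<gh-aw-comment-memory id="status">

| Run | Result |
| --- | --- |
| 2026-05-01 | ✅ passed |

</gh-aw-comment-memory>
```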

### Tradeoffs

| ✅ Pros | ❌ Cons |
| --- | --- |
| Visible in GitHub UI inline on the issue/PR | Requires issues:write or pull-requests:write |
| No separate branch or cache | One comment block per memory-id per target |
| Agent edits plain markdown — no tool call needed | Not suited for large structured data |
| Tied to issue/PR lifecycle | Not available without a triggering issue or PR |

## Stateful Scanning Pattern (repo-memory)

Use repo-memory to persist a baseline JSON file between scheduled runs so that the workflow only alerts on new findings — vulnerability scans, dependency audits, licence checks, or any "track changes over time" scenario.

### Example Workflow

```markdown
---
description: Nightly npm vulnerability scan — alerts only on new advisories
on:
  schedule:
    - cron: "0 2 * * *"
permissions:
  issues: write
  contents: read
engine: claude
tools:
  repo-memory:
    allowed-extensions: [".json"]
network:
  allowed:
    - registry.npmjs.org
safe-outputs:
  create-issue:
    title-prefix: "[vuln] "
    labels: [security, automated]
    max: 5
timeout-minutes: 20
---

Load `/tmp/gh-aw/repo-memory/default/vuln-baseline.json`.
If missing, treat the baseline as `[]` (first run).

Run `npm audit --json`. Collect each advisory's id, severity, title, and URL.

Diff against the baseline:
- **New** (in current, not in baseline) → open a `create-issue` per finding (max 5).
- **Resolved** (in baseline, not in current) → log only.
- If no new findings, use the `noop` safe output.

Write the current advisory IDs to `/tmp/gh-aw/repo-memory/default/vuln-baseline.json` as a JSON array.
```
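The diff step reduces to set operations on advisory IDs; a sketch with illustrative helper names:

```python
def diff_findings(current_ids: set[str], baseline_ids: set[str]) -> tuple[set[str], set[str]]:
    """Split the scan into (new, resolved) relative to the baseline.

    New findings trigger create-issue outputs; resolved ones are only logged.
    """
    new = current_ids - baseline_ids
    resolved = baseline_ids - current_ids
    return new, resolved

def cap_for_issues(new: set[str], limit: int = 5) -> list[str]:
    """Respect the safe-output max by opening at most `limit` issues, in stable order."""
    return sorted(new)[:limit]
```

Diffing on stable ID strings, rather than full advisory objects, is what keeps metadata churn (e.g. a severity bump) from re-triggering alerts.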

### Key Design Decisions

- repo-memory for baselines, not cache-memory — caches expire after 7 days; a lost baseline makes every known finding appear "new" on the next run, flooding the repo with duplicate issues
- First-run handling — treat a missing baseline file as `[]` and write it at the end of the first run, giving subsequent runs a clean starting point
- max: flood guard — caps issues opened per run; use `max: 5` for nightly scans, `max: 1` for secret alerts, `max: 10` for weekly audits
- Engine restriction — repo-memory requires Claude or a custom engine; it is not available for the Copilot engine
- Baseline schema — store only stable identifiers (advisory ID strings), not mutable fields like severity, to avoid false "new" alerts when metadata changes

## Summary Comparison

| Feature | cache-memory | repo-memory | repo-memory + wiki | comment-memory |
| --- | --- | --- | --- | --- |
| First choice | ✅ Yes | No | No | No |
| Storage backend | GitHub Actions cache | Git branch | GitHub Wiki | Issue/PR comment |
| Persistence | Up to 90 days | Indefinite | Indefinite | Issue/PR lifetime |
| Compiler adds contents: write | No | Yes (push job) | Yes (push job) | No |
| Repository noise | None | Git commits | Wiki commits | Comment updates |
| Human-readable in GitHub | No | Via branch UI | Via Wiki UI | ✅ Inline on issue/PR |
| Structured data (JSON) | ✅ Ideal | Possible | Not recommended | Not recommended |
| Filename restrictions | No colons in names | None | Hyphens for spaces | None |
| Engine compatibility | Copilot, Claude, custom | Claude, custom | Claude, custom | Claude, custom |

## Anti-patterns

- Do not treat repo-memory as a synonym for cache-memory — they are different tools
- Do not use repo-memory for ephemeral per-run state — use cache-memory
- Do not use cache-memory when you need indefinite persistence — use repo-memory
- Do not include colons in cache-memory filenames — artifact upload will fail