---
description: Guide for choosing the right persistent memory strategy in agentic workflows — cache-memory, repo-memory, and repo-memory with wiki. Covers deduplication, stateful baseline comparison (metrics/coverage), and stateful scanning ("alert on new X").
---

# Persistent Memory in Agentic Workflows

Consult this file when designing a workflow that needs to persist state across runs — deduplication, incremental processing, cross-run context, or knowledge accumulation.

⚠️ repo-memory does NOT mean "cache-memory". They are two distinct tools with different backends, tradeoffs, and use cases. cache-memory is almost always the right first choice.


## Quick Decision Guide

| Need | Use |
| --- | --- |
| Skip already-processed items (deduplication) | cache-memory ✅ first choice |
| Round-robin processing across runs | cache-memory ✅ first choice |
| Store ephemeral run state, analysis notes, or intermediate results | cache-memory ✅ first choice |
| Track a numeric metric and compare current vs. baseline (runs at least every 7 days) | cache-memory ✅ first choice |
| Long-lived knowledge base visible in PRs and code reviews | repo-memory |
| Baselines that must survive cache expiry (e.g. security findings, dedup lists) | repo-memory |
| Human-readable wiki pages for knowledge accumulation | repo-memory with wiki: true |
| Persist notes/state inline on the triggering issue or PR | comment-memory |

Default to cache-memory unless you have a specific reason to use repo-memory.


## Built-in Memory via GitHub Graph and Git History

Before writing new persistent files, check whether GitHub and Git already expose the state you need.

### Practical strategies

| Goal | Built-in source | Caching strategy |
| --- | --- | --- |
| Skip stale files in docs/code scans | Git history (git log / last modified commit per file) | Cache either a single repo watermark SHA or per-file SHAs, then compare changed paths in newer commits |
| Avoid reopening known incidents | Issue/PR history (recent open + closed items by label/title prefix) | Cache only canonical identifiers (issue numbers, advisory IDs), not full issue payloads |
| Process incrementally across repo activity | PR merge history (merged_at, base branch) | Cache the last merged PR number or merge timestamp and fetch only newer merges |
| Keep nightly triage focused | Issue timeline (updated_at, comments) | Cache the last scan cursor (updated_at watermark) and only inspect newer updates |
| Reuse expensive relationship lookups | GitHub graph links (issue ↔ PR ↔ commit references) | Cache normalized link maps keyed by stable node IDs and refresh selectively |

### Design guidance

- Prefer stable identifiers from GitHub graph data (node_id, issue/PR number, commit SHA) over mutable text fields.
- Persist watermarks (last seen timestamp, commit SHA, PR number) instead of full snapshots when possible.
- Use built-in history as the source of truth; use memory tools to store only incremental state needed to resume efficiently.
- If history queries are cheap and deterministic (for example, bounded to recent activity like the latest 20-100 items), recompute from GitHub/git instead of storing large derived datasets.
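As a concrete sketch of the watermark idea (the `watermark.json` file name and helper names here are illustrative, not part of gh-aw), a "last merged PR" watermark can be loaded, compared, and saved like this:

```python
import json
from pathlib import Path

# Hypothetical watermark file inside the cache-memory directory.
STATE = Path("/tmp/gh-aw/cache-memory/watermark.json")

def load_watermark(path: Path = STATE) -> int:
    """Return the last processed PR number, or 0 on a cold cache."""
    try:
        return json.loads(path.read_text())["last_pr"]
    except (FileNotFoundError, KeyError, json.JSONDecodeError):
        return 0

def select_new(merged_pr_numbers: list[int], watermark: int) -> list[int]:
    """Keep only PRs newer than the watermark, oldest first."""
    return sorted(n for n in merged_pr_numbers if n > watermark)

def save_watermark(last_pr: int, path: Path = STATE) -> None:
    """Persist the new high-water mark for the next run."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"last_pr": last_pr}))
```

A cold cache simply behaves like a first run: the watermark defaults to 0 and every recent PR counts as new.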

## cache-memory — First Choice

Uses GitHub Actions cache (actions/cache) to persist a local filesystem directory populated by the @modelcontextprotocol/server-memory MCP server. The directory lives at /tmp/gh-aw/cache-memory/.

### When to use

- Deduplication: Track which items (issues, PRs, URLs, IDs) have already been processed
- Round-robin / incremental processing: Remember where you left off across scheduled runs
- Ephemeral structured state: JSON blobs, processing queues, intermediate analysis results
- Metric baseline comparison: Store a coverage %, score, or count and compare on the next run (see Stateful Analysis / Baseline Comparison below)
- Visual regression baselines: Store screenshots between PR runs (see visual-regression.md)
- Tool call caching: Avoid redundant expensive API calls across runs

### Configuration

```yaml
tools:
  cache-memory: true
```

Advanced — custom key:

```yaml
tools:
  cache-memory:
    key: dedup-${{ github.event.schedule }}-${{ github.run_id }}
    retention-days: 30
    allowed-extensions: [".json"]
```

Multiple named caches:

```yaml
tools:
  cache-memory:
    - id: processed
      key: processed-items-${{ github.run_id }}
    - id: results
      key: results-${{ github.run_id }}
      retention-days: 14
```

### Storage path

- Single cache: `/tmp/gh-aw/cache-memory/`
- Multiple caches: `/tmp/gh-aw/cache-memory/{id}/`

### Deduplication example (scheduled workflow)

The following pattern lets a scheduled workflow skip items it has already processed:

```markdown
---
on:
  schedule:
    - cron: "0 9 * * *"
permissions:
  issues: read
engine: copilot
tools:
  github:
    toolsets: [issues]
  cache-memory: true
safe-outputs:
  create-issue:
    title-prefix: "[daily-digest] "
    close-older-issues: true
    labels: [automation]
timeout-minutes: 15
---

Fetch the 20 most recently updated open issues.

Load `/tmp/gh-aw/cache-memory/processed.json` if it exists; it contains an array of
issue numbers already included in past digests.

Skip any issue whose number already appears in that array.

Summarize the remaining (new) issues. If there are none, use the `noop` safe output.

Before finishing, write the updated full list of processed issue numbers back to
`/tmp/gh-aw/cache-memory/processed.json`. If you record when the digest ran, use a
filesystem-safe timestamp: `YYYY-MM-DD-HH-MM-SS` (no colons, no `T`, no `Z`).
```
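The skip-and-update logic that prompt describes can be sketched in Python (the helper names are illustrative; the `processed.json` path matches the workflow above):

```python
import json
from pathlib import Path

PROCESSED = Path("/tmp/gh-aw/cache-memory/processed.json")

def load_processed(path: Path = PROCESSED) -> set[int]:
    """Issue numbers already included in past digests (empty on a cold cache)."""
    try:
        return set(json.loads(path.read_text()))
    except (FileNotFoundError, json.JSONDecodeError):
        return set()

def split_new(issue_numbers: list[int], seen: set[int]) -> list[int]:
    """Issues not yet covered by a digest, in the order they were fetched."""
    return [n for n in issue_numbers if n not in seen]

def save_processed(seen: set[int], new: list[int], path: Path = PROCESSED) -> None:
    """Write back the union of old and newly digested issue numbers."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(sorted(seen | set(new))))
```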

### Stateful Analysis / Baseline Comparison

Use cache-memory to persist a baseline metric between runs and detect regressions. This pattern is well-suited for any "compare current vs. previous" scenario — test coverage, build duration, benchmark scores, audit counts — where runs happen at least once every 7 days (the default cache retention).

#### When to use this pattern

- Tracking a numeric metric (coverage %, build time, test count, score) across scheduled or PR runs
- Alerting when a metric regresses by more than an acceptable threshold
- Any "tell me when X drops by more than Y" workflow where losing the baseline for a cycle is tolerable (the next run simply re-establishes it)

#### When to use repo-memory instead

If a lost baseline would cause serious side-effects — e.g. a security-finding baseline where "cache miss" floods the repo with duplicate issues — use repo-memory. See Stateful Scanning Pattern (repo-memory) below.

#### Worked example: coverage delta on every PR

```markdown
---
description: Post a PR comment when test coverage drops by more than 1 percentage point
on:
  pull_request:
    types: [opened, synchronize]
permissions:
  pull-requests: read
  contents: read
engine: copilot
tools:
  github:
    toolsets: [pull_requests]
  cache-memory: true
safe-outputs:
  add-pr-comment:
    max: 1
timeout-minutes: 15
---

Run the test suite and collect the overall line-coverage percentage as a
single float (e.g. `82.5`).

Load `/tmp/gh-aw/cache-memory/coverage-baseline.json` if it exists.
The file stores: `{ "coverage": 82.5, "updated": "2026-05-01-09-00-00" }`.

**First run** (file missing): write the current coverage to the file and use
the `noop` safe output — no comment is needed yet.

**Subsequent runs** (baseline found): compute `delta = current − baseline`.

- If `delta >= −1.0` (coverage held or improved), use the `noop` safe output.
- If `delta < −1.0` (coverage fell by more than 1 pp), post an `add-pr-comment`
  that includes:
  - Baseline coverage, current coverage, and delta (e.g. "82.5% → 79.3% (−3.2 pp)")
  - Which files lost the most coverage

Regardless of the outcome, overwrite `/tmp/gh-aw/cache-memory/coverage-baseline.json`
with the current coverage and a filesystem-safe timestamp `YYYY-MM-DD-HH-MM-SS`
(no colons, no `T`, no `Z`).
```

#### Key design decisions

- cache-memory, not repo-memory — coverage deltas are short-lived quality gates; a cache miss just means "no comparison this run" and the baseline is silently refreshed — no false-positive flood
- First-run handling — treat a missing baseline as "no data yet": write it and skip the comparison; the second run is the first real gate
- Threshold guard — ignore sub-1 pp fluctuations to reduce noise; tune the threshold to your team's standards
- Filename safety — use YYYY-MM-DD-HH-MM-SS (no colons) in any timestamped filenames written to cache-memory; see Filename safety below

### Tradeoffs

| ✅ Pros | ❌ Cons |
| --- | --- |
| Zero repository noise — no commits, no PRs | Evicted when cache expires (default 7 days; use retention-days to extend up to 90) |
| Fast: no Git operations required | Not human-readable in GitHub UI |
| Works with Copilot, Claude, and custom engines | Data loss if cache is invalidated or expires |
| Supports multiple isolated caches per workflow | Files are uploaded as GitHub Actions artifacts — no colons in filenames |
| Scoped to workflow by default | |

### Filename safety

Cache-memory files are uploaded as GitHub Actions artifacts. Artifact filenames must not contain colons (NTFS limitation on Windows-hosted runners).

```text
# ✅ GOOD — filesystem-safe timestamp
/tmp/gh-aw/cache-memory/state-2026-02-12-11-20-45.json

# ❌ BAD — colon in timestamp breaks artifact upload
/tmp/gh-aw/cache-memory/state-2026-02-12T11:20:45Z.json
```

When instructing the agent to write timestamped files, say explicitly:

"Use filesystem-safe timestamp format `YYYY-MM-DD-HH-MM-SS` (no colons, no `T`, no `Z`)."


## repo-memory — Long-lived Repository Knowledge

Uses a dedicated Git branch (default: memory/agent-notes) to store files that persist indefinitely until explicitly deleted. The directory lives at /tmp/gh-aw/repo-memory/.

### When to use

- The knowledge needs to survive cache expiration
- You want the memory to be visible in the repository (auditable via Git history)
- The workflow accumulates a knowledge base that grows over time (e.g., architecture notes, known issues)
- You need changes to appear in diffs and be reviewable

### Configuration

```yaml
tools:
  repo-memory:
    branch-name: memory/agent-notes   # Optional: custom branch name
    target-repo: owner/other-repo     # Optional: store in another repo
    allowed-extensions: [".json", ".md"]
    max-file-size: 10240              # bytes
    max-file-count: 100
```

The compiler automatically creates a separate push_repo_memory job with contents: write permission. The main agent job retains read-only permissions.

### Tradeoffs

| ✅ Pros | ❌ Cons |
| --- | --- |
| Persists indefinitely (no expiry) | Produces Git commits — repository noise |
| Auditable: Git history shows every change | Slower: requires Git clone + push |
| Survives cache invalidation | Not available for Copilot engine (requires GitHub tools) |
| Human-readable via GitHub branch UI | More complex setup |
| Can target a different repository | |

## repo-memory with wiki: true — GitHub Wiki Backend

A variant of repo-memory that stores files in the GitHub Wiki (a separate Git repository at <repo>.wiki.git) instead of a branch.

### When to use

- You want structured, human-readable documentation pages
- The knowledge is intended for human consumption (wikis are browsable)
- You're building a living knowledge base or FAQ

### Configuration

```yaml
tools:
  repo-memory:
    wiki: true
    allowed-extensions: [".md"]
```

The compiler automatically creates a separate push_repo_memory job with contents: write permission. The main agent job retains read-only permissions.

Files follow GitHub Wiki Markdown conventions: use `[[Page Name]]` syntax for internal links, and name files with hyphens instead of spaces.
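For example, a hypothetical wiki page file might look like this (the page names are illustrative):

```markdown
<!-- File: Known-Issues.md (hyphens, not spaces, in the filename) -->
# Known Issues

- Flaky integration test on Windows runners (see [[Architecture-Notes]])
- Back to [[Home]]
```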

### Tradeoffs

| ✅ Pros | ❌ Cons |
| --- | --- |
| Browsable in the GitHub Wiki UI | Produces Git commits to wiki repo |
| Great for human-readable knowledge bases | Restricted to .md files in practice |
| Standard Markdown with wiki link syntax | Less suitable for structured JSON state |
| Separate from main repo history | |

## comment-memory — Managed Comment Persistence

Uses a dedicated <gh-aw-comment-memory> XML block in an issue or PR comment as persistent memory. The agent edits plain markdown files under /tmp/gh-aw/comment-memory/; the safe-output processor syncs the changes back to the managed comment.

### When to use

- Persist workflow notes or statuses visible inline on the triggering issue or PR
- State tied to the lifecycle of a specific issue or PR
- Running records (status tables, checklists, summaries) the team can read without leaving the issue

Do NOT use comment-memory for high-volume ephemeral state (use cache-memory), long-lived knowledge bases (use repo-memory), or data that must survive across issues/PRs.

### Configuration

```yaml
tools:
  comment-memory: true   # enable with defaults
```

Advanced:

```yaml
tools:
  comment-memory:
    memory-id: status          # Optional: identifier in XML marker (default: "default")
    target: triggering         # Optional: "triggering" (default), "*", or explicit number
    target-repo: owner/other   # Optional: cross-repository
    max: 1                     # Optional: max updates per run (default: 1)
    footer: false              # Optional: omit AI-generated footer (default: true)
```

### How it works

1. Pre-agent setup: Reads `<gh-aw-comment-memory id="<memory-id>">` from the target comment and writes content to `/tmp/gh-aw/comment-memory/<memory_id>.md`.
2. Agent: Edits the markdown file directly — no explicit safe-output tool call needed.
3. Post-agent: The safe-output processor reads the edited file and upserts the managed comment, replacing only the XML-fenced block.

Multiple memory IDs are supported in a single comment; each maps to a separate *.md file.
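Inside the comment, the managed block looks roughly like this (the `id` attribute matches `memory-id`; the table content is purely illustrative):

```html
<gh-aw-comment-memory id="status">

| Run | Result |
| --- | --- |
| 2026-05-01 | ✅ passed |

</gh-aw-comment-memory>
```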

### Tradeoffs

| ✅ Pros | ❌ Cons |
| --- | --- |
| Visible in GitHub UI inline on the issue/PR | Requires issues:write or pull-requests:write |
| No separate branch or cache | One comment block per memory-id per target |
| Agent edits plain markdown — no tool call needed | Not suited for large structured data |
| Tied to issue/PR lifecycle | Not available without a triggering issue or PR |

## Stateful Scanning Pattern (repo-memory)

Use repo-memory to persist a baseline JSON file between scheduled runs so that the workflow only alerts on new findings — vulnerability scans, dependency audits, licence checks, or any "track changes over time" scenario.

### Example Workflow

```markdown
---
description: Nightly npm vulnerability scan — alerts only on new advisories
on:
  schedule:
    - cron: "0 2 * * *"
permissions:
  issues: write
  contents: read
engine: claude
tools:
  repo-memory:
    allowed-extensions: [".json"]
network:
  allowed:
    - registry.npmjs.org
safe-outputs:
  create-issue:
    title-prefix: "[vuln] "
    labels: [security, automated]
    max: 5
timeout-minutes: 20
---

Load `/tmp/gh-aw/repo-memory/default/vuln-baseline.json`.
If missing, treat the baseline as `[]` (first run).

Run `npm audit --json`. Collect each advisory's id, severity, title, and URL.

Diff against the baseline:
- **New** (in current, not in baseline) → open a `create-issue` per finding (max 5).
- **Resolved** (in baseline, not in current) → log only.
- If no new findings, use the `noop` safe output.

Write the current advisory IDs to `/tmp/gh-aw/repo-memory/default/vuln-baseline.json` as a JSON array.
```
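The diff step reduces to set operations on advisory IDs; a sketch with illustrative helper names:

```python
def diff_findings(current_ids: set[str], baseline_ids: set[str]) -> tuple[set[str], set[str]]:
    """Split the scan into (new, resolved) relative to the baseline.

    New findings trigger create-issue outputs; resolved ones are only logged.
    """
    new = current_ids - baseline_ids
    resolved = baseline_ids - current_ids
    return new, resolved

def cap_for_issues(new: set[str], limit: int = 5) -> list[str]:
    """Respect the safe-output max by opening at most `limit` issues, in stable order."""
    return sorted(new)[:limit]
```

Diffing on stable ID strings, rather than full advisory objects, is what keeps metadata churn (e.g. a severity bump) from re-triggering alerts.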

### Key Design Decisions

- repo-memory for baselines, not cache-memory — caches expire after 7 days; a lost baseline makes every known finding appear "new" on the next run, flooding the repo with duplicate issues
- First-run handling — treat a missing baseline file as `[]` and write it at the end of the first run, giving subsequent runs a clean starting point
- max: flood guard — caps issues opened per run; use `max: 5` for nightly scans, `max: 1` for secret alerts, `max: 10` for weekly audits
- Engine restriction — repo-memory requires Claude or a custom engine; it is not available for the Copilot engine
- Baseline schema — store only stable identifiers (advisory ID strings), not mutable fields like severity, to avoid false "new" alerts when metadata changes

## Summary Comparison

| Feature | cache-memory | repo-memory | repo-memory + wiki | comment-memory |
| --- | --- | --- | --- | --- |
| First choice | ✅ Yes | No | No | No |
| Storage backend | GitHub Actions cache | Git branch | GitHub Wiki | Issue/PR comment |
| Persistence | Up to 90 days | Indefinite | Indefinite | Issue/PR lifetime |
| Compiler adds contents: write | No | Yes (push job) | Yes (push job) | No |
| Repository noise | None | Git commits | Wiki commits | Comment updates |
| Human-readable in GitHub | No | Via branch UI | Via Wiki UI | ✅ Inline on issue/PR |
| Structured data (JSON) | ✅ Ideal | Possible | Not recommended | Not recommended |
| Filename restrictions | No colons in names | None | Hyphens for spaces | None |
| Engine compatibility | Copilot, Claude, custom | Claude, custom | Claude, custom | Claude, custom |

## Anti-patterns

- Do not treat repo-memory as a synonym for cache-memory — they are different tools
- Do not use repo-memory for ephemeral per-run state — use cache-memory
- Do not use cache-memory when you need indefinite persistence — use repo-memory
- Do not include colons in cache-memory filenames — artifact upload will fail