
Add a 'foraging' skill to seed agent-driven research with prior work #5117

@claude

Description


When agents kick off work under .agents/skills/agent-research/SKILL.md or domain skills like .agents/skills/change-grug/SKILL.md, the "what has already been tried / what's in the literature" phase is ad-hoc. agent-research Section 1 Kickoff currently mentions "key references (papers/blog posts) when applicable" as a single bullet; there is no dedicated workflow for foraging prior work before the first hypothesis lands in the logbook.

Split from #4282 (see the 2026-04-14 and 2026-04-23 sub-thread). @dlwh agreed that we should codify this phase:

"I don't think they really need a lot of handholding about "how" to do lit search ... but it is probably wise to codify that phase."

Modern agents are already competent at lit search, so the skill's value lies in the Marin-specific layer, not in generic search heuristics:

  • long-lived research branches as a first-class search surface (research/<topic>, and reference branches like the array-stacked grug variant pointer called out in change-grug)
  • GitHub issues plus the experiment issue template as the durable public artifact
  • .agents/logbooks/<topic>.md as the append-only scratchpad
  • snapshot tags for sealed reference points
  • prior Marin experiments in experiments/, docs/reports/, and W&B as primary corpus before the open web
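As a rough illustration of treating the repository as the primary corpus, the sketch below scans the in-repo surfaces named above for files that mention a topic. The function name `find_prior_work`, the case-insensitive substring match, and the set of file suffixes are assumptions made for illustration, not part of any existing skill. Branches, tags, issues, and W&B are not covered and would need separate queries.

```python
from pathlib import Path

# Directories named in this issue as in-repo search surfaces.
SEARCH_SURFACES = ("experiments", "docs/reports", ".agents/logbooks")
# Hypothetical choice of file types worth scanning.
TEXT_SUFFIXES = {".py", ".md", ".yaml", ".yml"}


def find_prior_work(repo_root, topic):
    """Return sorted repo-relative paths of files that mention `topic`."""
    root = Path(repo_root)
    needle = topic.lower()
    hits = []
    for surface in SEARCH_SURFACES:
        base = root / surface
        if not base.is_dir():
            continue  # surface missing in this checkout
        for path in base.rglob("*"):
            if not path.is_file() or path.suffix not in TEXT_SUFFIXES:
                continue
            text = path.read_text(encoding="utf-8", errors="ignore")
            if needle in text.lower():
                hits.append(path.relative_to(root).as_posix())
    return sorted(hits)
```

Calling `find_prior_work(".", "grug")` from the repository root would list candidate files for the agent to read before it turns to the open web.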

Scope

Produce .agents/skills/<name>/SKILL.md (working title forage) that an agent runs either:

  1. At agent-research kickoff before the first hypothesis is written, or
  2. As a standalone sub-pass on an existing research thread (e.g. at the start of a new change-grug variant).

Concrete outputs the skill should demand:

  • a Prior work section in the experiment issue body
  • a linked logbook file (e.g. .agents/logbooks/<topic>-forage.md) covering:
    • relevant papers, blog posts, external codebases, each with a 1-sentence TL;DR
    • prior Marin experiments (issues, PRs, tags, long-lived branches, W&B reports)
    • a ranked shortlist of ideas or hyperparameters worth trying next
    • known negative results and dead ends from prior threads
  • explicit stop criteria so the forage phase does not overrun the research phase
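One possible shape for the forage artifact, sketched as Python dataclasses that render to the Prior work block. The class names, field names, and heading text are hypothetical; the actual schema is for the skill to define.

```python
from dataclasses import dataclass, field


@dataclass
class ExternalSource:
    title: str
    url: str
    tldr: str  # one-sentence summary


@dataclass
class ForageReport:
    topic: str
    external: list = field(default_factory=list)          # ExternalSource items
    marin_prior: list = field(default_factory=list)       # issues, PRs, tags, branches, W&B reports
    shortlist: list = field(default_factory=list)         # ranked, best first
    negative_results: list = field(default_factory=list)  # dead ends from prior threads

    def to_markdown(self):
        """Render the report as a markdown block for the issue or logbook."""
        lines = [f"## Prior work: {self.topic}", "", "### External sources"]
        lines += [f"- [{s.title}]({s.url}): {s.tldr}" for s in self.external]
        lines += ["", "### Prior Marin experiments"]
        lines += [f"- {ref}" for ref in self.marin_prior]
        lines += ["", "### Ranked shortlist"]
        lines += [f"{i}. {idea}" for i, idea in enumerate(self.shortlist, start=1)]
        lines += ["", "### Negative results and dead ends"]
        lines += [f"- {item}" for item in self.negative_results]
        return "\n".join(lines) + "\n"


# Placeholder values only; none of these refer to real sources or results.
report = ForageReport(
    topic="example-topic",
    external=[ExternalSource("Example paper", "https://example.com/paper", "Placeholder summary.")],
    marin_prior=["#4282"],
    shortlist=["first idea", "second idea"],
    negative_results=["placeholder dead end"],
)
print(report.to_markdown())
```

A fixed structure like this lets a downstream agent read the shortlist and the dead ends directly instead of re-deriving them.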

Meta-foraging step

Before writing the skill, do a short pass on what is already out there and summarize what to borrow vs. skip. Starting points (from the #4282 thread):

The goal is not to adopt all of these. Most of the core lit-search heuristics are already baked into modern agents. The skill should just encode:

  • Marin-specific search surfaces (our issue/PR/branch/tag corpus, W&B projects, docs/reports/, experiments/)
  • a concrete output schema so downstream agents running agent-research or change-grug can consume the forage artifact without re-reading the web
  • stop criteria and time-box guidance
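Stop criteria could be made explicit with a small budget object, as in the sketch below. The three conditions (wall-clock limit, source cap, a run of queries that surface nothing new) and their default values are assumptions chosen for illustration; the skill would set its own.

```python
import time


class ForageBudget:
    """Stop when time runs out, the source cap is hit, or searches go dry."""

    def __init__(self, max_minutes=30.0, max_sources=20, max_dry_queries=3,
                 clock=time.monotonic):
        self.max_seconds = max_minutes * 60
        self.max_sources = max_sources
        self.max_dry_queries = max_dry_queries
        self._clock = clock  # injectable so the budget can be tested
        self._start = clock()
        self.sources = 0
        self.dry_streak = 0

    def record_query(self, new_sources):
        """Record how many previously unseen sources a query produced."""
        self.sources += new_sources
        self.dry_streak = 0 if new_sources > 0 else self.dry_streak + 1

    def stop_reason(self):
        """Return a short reason string, or None if foraging may continue."""
        if self._clock() - self._start >= self.max_seconds:
            return "time budget exhausted"
        if self.sources >= self.max_sources:
            return "source cap reached"
        if self.dry_streak >= self.max_dry_queries:
            return "no new sources in recent queries"
        return None
```

An agent would call `record_query` after each search and end the forage pass once `stop_reason()` returns a string, noting that reason in the logbook.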

Definition of Done

  • .agents/skills/<name>/SKILL.md lands on main with YAML frontmatter matching the other skills
  • agent-research kickoff checklist links to it (edit Section 1 Kickoff, or add Section 0 Forage)
  • change-grug adds a foraging pointer so prior Grug variants plus reference branches are surfaced consistently
  • Skill output schema (logbook section plus issue Prior work block) is demonstrated on one real research thread
  • Passes writing-style skill norms (tight, non-editorializing, no filler) and references templates in .github/ISSUE_TEMPLATE/
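The frontmatter item in the checklist above could be verified mechanically. The sketch below is a minimal check, assuming the other skills use `name` and `description` as top-level keys; that assumption should be confirmed against the existing SKILL.md files. It uses a small hand-rolled parser rather than a YAML library and does not handle nested values.

```python
REQUIRED_KEYS = ("name", "description")  # assumed; confirm against existing skills


def parse_frontmatter(text):
    """Parse top-level `key: value` pairs from a leading `---` block.

    Returns None if the text has no closed frontmatter block.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing delimiter found
        # Skip indented (nested) lines and comments.
        if ":" in line and not line.startswith((" ", "\t", "#")):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return None  # closing delimiter never found


def missing_frontmatter_keys(text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    fields = parse_frontmatter(text)
    if fields is None:
        return list(required)
    return [key for key in required if not fields.get(key)]
```

Running `missing_frontmatter_keys` over the new SKILL.md and expecting an empty list would give a quick pre-merge check.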

Explicit non-goals

  • New tools or MCP servers for paper ingestion. Use what the agent already has.
  • A curated Marin paper library. Revisit only if we outgrow agent-native web search.
  • Replacing human lit review for headline decisions.

Links

  • Parent epic: Agentify experimentation #4282
  • Related skills: .agents/skills/agent-research/SKILL.md, .agents/skills/change-grug/SKILL.md, .agents/skills/fix-issue/SKILL.md, .agents/skills/file-issue/SKILL.md

Metadata

Assignees

No one assigned

    Labels

    agent-generated (Created by automation/agent), automation (Tags issues related to agent automation)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
