feat: Agent Marketplace with Fork/Remix + Self-Improving Memory Loop

## Metadata

> **IMPORTANT**: The very first step should _ALWAYS_ be validating this metadata section to maintain a **CLEAN** development workflow.

```yml
pull_request_title: "FROM feat/[issue#]-agent-marketplace TO development"
branch: "feat/[issue#]-agent-marketplace"
worktree_path: "$WORKSPACE/.worktrees/feat-[issue#]"
```

---

## User Stories

- As a **visitor**, I want to **fork/remix a public agent** so that **I can customize it for my own use and become an Orchestra user**.
- As an **agent creator**, I want **fork counts and discovery features** so that **my agents gain visibility and I can track adoption**.
- As a **website owner**, I want to **embed a public agent as a chat widget** so that **visitors can interact with AI on my site without leaving**.
- As an **agent owner**, I want **my agent's prompts to automatically improve over time** so that **conversations get better without manual tuning**.

---

## Summary

Every public agent is currently a dead end: users can chat with it but can't own it, customize it, or embed it on their site. There's no path from "I found a cool agent" to "I'm an Orchestra user." Meanwhile, nothing learned in a conversation survives the session, because langmem is installed but not connected to the conversation lifecycle.

This feature creates a self-reinforcing flywheel where published agents drive user acquisition (via fork/embed), forked agents accumulate usage data, and that data automatically improves agent prompts via background distillation.

The public agent infrastructure is roughly 80% built but has no viral loop. Fork/remix supplies one: a GitHub-for-agents growth flywheel. The memory distillation loop adds little extra cost because langmem is already installed and currently unused. Workflows are deferred until marketplace demand validates them.

### Visual Reference



---

## Phased Implementation

### Phase 1: Agent Fork/Remix (Highest immediate ROI)

**Backend:**
- `backend/src/services/assistant.py` — Add `fork()` method: deep-copy public assistant, strip owner/published metadata, generate new UUID, set `forked_from` in metadata, increment `fork_count` on source
- `backend/src/schemas/entities/llm.py` — Add `fork_count: int = 0` to `Assistant` model, `forked_from: Optional[str]` to metadata, expose in `PublicAssistant` projection
- `backend/src/routes/v0/assistant.py` — Add `POST /assistants/public/{assistant_id}/fork` (requires auth)
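
The `fork()` steps listed above could look roughly like the sketch below. It is a minimal in-memory illustration, not the service implementation: the `Assistant` dataclass, the `owner_id` field, and the `STRIPPED_METADATA_KEYS` names are assumptions made for the example, and the real method would read from and write to the store rather than mutate objects.

```python
import copy
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Assistant:
    """Hypothetical stand-in for the real Assistant model."""
    id: str
    name: str
    prompt: str
    owner_id: Optional[str] = None
    fork_count: int = 0
    metadata: dict = field(default_factory=dict)


# Assumed names for owner/published metadata keys; the real keys may differ.
STRIPPED_METADATA_KEYS = ("owner", "published", "published_at")


def fork(source: Assistant, user_id: str) -> Assistant:
    """Return a user-owned deep copy of `source` and bump the source's fork_count."""
    forked = copy.deepcopy(source)      # deep copy so nested config is not shared
    forked.id = str(uuid.uuid4())       # new identity for the fork
    forked.owner_id = user_id
    forked.fork_count = 0               # the fork starts with no forks of its own
    for key in STRIPPED_METADATA_KEYS:  # drop owner/published metadata
        forked.metadata.pop(key, None)
    forked.metadata["forked_from"] = source.id
    source.fork_count += 1              # in the real store this should be an atomic update
    return forked
```

Incrementing `fork_count` in memory is only safe in this sketch; against a shared store, concurrent forks would need an atomic or retried update.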

**Frontend:**
- `frontend/src/lib/services/agentService.ts` — Add `fork(assistantId)` API call
- `frontend/src/pages/agents/public.tsx` — Add "Remix this Agent" button; auth redirect flow
- `frontend/src/pages/Login.tsx` / `Register.tsx` — Read `remix` query param, auto-fork post-auth

### Phase 2: Agent Discovery Experience

**Backend:**
- Enhance `GET /assistants/public` with `sort_by` param: `fork_count`, `updated_at`, `published_at`
- Add `tags: list[str] = []` to `Assistant` model, expose in `PublicAssistant` projection
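
One way to validate and apply the `sort_by` param is sketched below. It assumes public assistants are available as plain dicts, that every option sorts in descending order, and that `fork_count` is the default; none of these choices is fixed by this issue.

```python
from enum import Enum
from operator import itemgetter


class SortBy(str, Enum):
    """Allowed values for the sort_by query param."""
    FORK_COUNT = "fork_count"
    UPDATED_AT = "updated_at"
    PUBLISHED_AT = "published_at"


def sort_public_assistants(assistants: list, sort_by: str = "fork_count") -> list:
    """Return assistants ordered by the requested field, highest/newest first."""
    key = SortBy(sort_by)  # raises ValueError for unsupported values
    return sorted(assistants, key=itemgetter(key.value), reverse=True)
```

Rejecting unknown values up front keeps arbitrary field names out of the sort key; the route can translate the `ValueError` into a 4xx response.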

**Frontend:**
- Add "Discover" tab in agents index showing public agents
- Agent cards: name, description, model, fork_count badge, "Remix" button
- Sort dropdown: "Most Remixed", "Newest", "Recently Updated"

### Phase 3: Embeddable Agent Widget

**Backend:**
- `GET /assistants/public/{assistant_id}/embed` — returns embed config JSON
- `POST /assistants/public/{assistant_id}/embed-token` — signed JWT with agent_id + rate_limit + expiry
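
To make the embed-token shape concrete, here is a standard-library sketch of an HS256-signed JWT carrying `agent_id`, `rate_limit`, and an expiry. The function names, claim names, and defaults are assumptions for illustration; a real implementation would more likely use whatever JWT library the backend already depends on.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode("ascii"), hashlib.sha256).digest()


def create_embed_token(agent_id: str, secret: str, rate_limit: int = 60,
                       ttl_seconds: int = 3600, now: Optional[float] = None) -> str:
    """Build a signed token; `rate_limit` and `ttl_seconds` defaults are illustrative."""
    issued_at = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"agent_id": agent_id, "rate_limit": rate_limit, "exp": issued_at + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode()) for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def verify_embed_token(token: str, secret: str, now: Optional[float] = None) -> dict:
    """Return the payload if the signature is valid and the token is unexpired."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("invalid signature")
    payload = json.loads(_b64url_decode(payload_b64))
    if payload["exp"] <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return payload
```

Because the claims travel inside the signed token, the server can enforce the agent scope, rate limit, and expiry without a token table, which matches the "no new DB table" goal in the Storage section.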

**Frontend:**
- New standalone Vite entry point: `frontend/src/embed/index.tsx`
- Minimal chat widget (<50KB bundle): floating button + expandable chat panel
- Loads via `<script src="https://chat.ruska.ai/embed.js" data-agent-id="...">`
- "Powered by Orchestra" footer linking to `/a/{agentId}`
- Add "Embed" section in agent edit page for published agents with copy-paste snippets

### Phase 4: Memory Distillation Loop

**Post-Conversation Trajectory Extraction:**
- Hook into `_persist_final_state()` in `backend/src/utils/stream.py` (line 278)
- Extract `AnnotatedTrajectory` from conversation messages with implicit quality signals
- Store in `(user_id, "trajectories", assistant_id)` namespace via `AsyncPostgresStore`
- Add `extract_trajectory` TaskIQ task (non-blocking, async after conversation)
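
The extraction step might be sketched as follows. The `Trajectory` tuple is a local stand-in that is only assumed to resemble langmem's `AnnotatedTrajectory` (messages plus feedback), messages are assumed to be `{"role", "content"}` dicts, and the three signals are hypothetical examples of "implicit quality signals", not a decided set.

```python
from typing import NamedTuple, Optional


class Trajectory(NamedTuple):
    """Local stand-in; assumed to mirror the messages + feedback shape."""
    messages: list
    feedback: Optional[dict] = None


def trajectory_namespace(user_id: str, assistant_id: str) -> tuple:
    """Store namespace for a user's trajectories with one assistant."""
    return (user_id, "trajectories", assistant_id)


def extract_trajectory(messages: list) -> Trajectory:
    """Attach simple implicit signals to a finished conversation."""
    user_texts = [m["content"].strip().lower() for m in messages if m["role"] == "user"]
    feedback = {
        # How many times the user spoke.
        "user_turns": len(user_texts),
        # Repeated user messages hint that an answer did not land.
        "repeated_user_messages": len(user_texts) - len(set(user_texts)),
        # A conversation ending on an assistant reply was at least answered.
        "ended_on_assistant": bool(messages) and messages[-1]["role"] == "assistant",
    }
    return Trajectory(messages=list(messages), feedback=feedback)
```

In the real flow this function would run inside the `extract_trajectory` background task, with the result written under `trajectory_namespace(user_id, assistant_id)`.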

**Periodic Prompt Distillation:**
- Extend `PromptOptimizer` in `backend/src/services/prompt/optimize.py` with `distill()` method
- Retrieve recent trajectories, invoke `create_prompt_optimizer`, store as new revision via `PromptService.revision()`
- Add `scheduled_prompt_distillation` job type in `backend/src/services/schedule.py` (daily, per-user, per-assistant)
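
The control flow of `distill()` could be outlined as below. The optimizer and the revision writer are injected as plain callables because their real signatures are not shown in this issue: `optimize` stands in for the optimizer built by `create_prompt_optimizer`, `save_revision` stands in for `PromptService.revision()`, and `min_trajectories` is a hypothetical threshold.

```python
from typing import Callable, Optional, Sequence


def distill(
    current_prompt: str,
    trajectories: Sequence,
    optimize: Callable[[Sequence, str], str],
    save_revision: Callable[[str], None],
    min_trajectories: int = 5,
) -> Optional[str]:
    """Run one distillation pass; return the new prompt, or None if nothing changed."""
    # Skip assistants with too little recent usage to learn from.
    if len(trajectories) < min_trajectories:
        return None
    improved = optimize(trajectories, current_prompt)
    # Avoid writing a revision when the optimizer made no real change.
    if improved.strip() == current_prompt.strip():
        return None
    save_revision(improved)
    return improved
```

Storing the result as a new revision rather than overwriting the prompt keeps every automated change reversible by the agent owner.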

---

## Key Integration Points

| File | Function(s) | Role |
|------|-------------|------|
| `backend/src/services/assistant.py` | `fork()` | Deep-copy public assistant to user namespace |
| `backend/src/schemas/entities/llm.py` | `Assistant`, `PublicAssistant` | Add fork_count, forked_from, tags fields |
| `backend/src/routes/v0/assistant.py` | Fork + embed endpoints | New REST API surface |
| `backend/src/services/prompt/optimize.py` | `PromptOptimizer.optimize()` | Wire to auto-generated trajectories |
| `backend/src/utils/stream.py` | `_persist_final_state()` (line 278) | Hook point for trajectory extraction |
| `backend/src/controllers/llm.py` | `_update_store()` (line 49) | Sync invoke hook point |
| `backend/src/services/schedule.py` | APScheduler infra | Schedule distillation jobs |
| `backend/src/workers/tasks.py` | TaskIQ broker | Run extraction as background task |
| `backend/src/services/db.py` | `AsyncPostgresStore` namespaces | Store trajectories without new DB tables |

---

## UI Integration Points

| Component / Route | Change Type | Description |
|-------------------|-------------|-------------|
| `frontend/src/pages/agents/public.tsx` | Modify | Add "Remix this Agent" button |
| `frontend/src/pages/Login.tsx` / `Register.tsx` | Modify | Handle `remix` query param for post-auth auto-fork |
| `frontend/src/pages/agents/index.tsx` | Modify | Add "Discover" tab with public agent cards |
| `frontend/src/pages/agents/edit.tsx` | Modify | Add "Embed" section for published agents |
| `frontend/src/lib/services/agentService.ts` | Modify | Add `fork()` API method |
| `frontend/src/embed/index.tsx` | New entry | Standalone embeddable chat widget (<50KB) |

---

## Storage

- **Persistence layer**: LangGraph `AsyncPostgresStore` (existing)
- **Namespaces**: `("public", "assistants")` for fork source; `(user_id, "trajectories", assistant_id)` for conversation trajectories
- **Model pattern**: Pydantic models following existing `Assistant` / `PublicAssistant` patterns
- **Embed tokens**: Signed JWTs (no new DB table needed)

---

## Architectural Decisions

- **Source of truth**: Backend — fork_count, tags, and trajectory data are server-persisted
- **State management**: React Query cache invalidation on fork/remix mutations
- **Auth / scoping**: Fork requires auth; embed uses signed JWT with rate limiting; trajectories are user-scoped
- **Embed widget**: Standalone Vite entry point, <50KB bundle, no user auth required (token-based)
- **Memory distillation**: Uses existing `langmem` + `PromptOptimizer` — no new dependencies
- **Phase 5 (Workflow Engine)**: Explicitly deferred until marketplace validates demand

---

## Existing Code to Reuse

| What exists | Where | How to use |
|---|---|---|
| `PromptOptimizer.optimize()` | `backend/src/services/prompt/optimize.py:72` | Wire to auto-generated trajectories |
| `AnnotatedTrajectory` format | `langmem.prompts.types` (already imported) | Structure extracted conversation data |
| `PromptService` with revisions | `backend/src/services/prompt/__init__.py` | Store optimized prompts as new revisions |
| `_persist_final_state()` | `backend/src/utils/stream.py:278` | Hook point for trajectory extraction |
| `_update_store()` | `backend/src/controllers/llm.py:49` | Sync invoke hook point |
| APScheduler infra | `backend/src/services/schedule.py` | Schedule distillation jobs |
| TaskIQ broker | `backend/src/workers/tasks.py` | Run extraction as background task |
| `AsyncPostgresStore` namespaces | `backend/src/services/db.py` | Store trajectories without new DB tables |

---

## Development Setup

### Dependencies

| Service | Address | Notes |
|---------|---------|-------|
| Redis | `localhost:6379` | Docker container |
| Postgres | `localhost:5432` | Docker container |

### Commands

```bash
# See CLAUDE.md for build, test, and dev commands
```

### Wiki

> **⚠️ IMPORTANT:** Wiki lives in a **separate repo**. Changes to `@wiki` must be committed directly to the wiki repo, **not** the main project repo.

---

## Design Principles

- Simplicity is beauty, complexity is pain.
- _ALWAYS_ look at the current codebase first — achieve the goal in the **least amount of changes**.
- TDD-first: write tests before implementation.
- Reuse existing infrastructure (langmem, AsyncPostgresStore, TaskIQ, APScheduler) — no new dependencies.
- Phase incrementally: each phase delivers standalone value.

---

## Validation Tools

- [ ] Load the `agent-browser` skill and capture screenshots to validate the E2E flows. This confirms the test assumptions behind the completion promise.

---

## Acceptance Criteria

### Phase 1
- [ ] `POST /assistants/public/{id}/fork` creates a user-owned copy with `forked_from` metadata
- [ ] `fork_count` increments on the source public assistant
- [ ] Frontend "Remix this Agent" button works for authenticated users
- [ ] Unauthenticated users are redirected to login, and the fork runs automatically when they return
- [ ] `make test` — all existing + new tests pass

### Phase 2
- [ ] `GET /assistants/public?sort_by=fork_count` returns correctly ordered results
- [ ] `tags` field added and exposed in public assistant projection
- [ ] Frontend "Discover" tab renders public agents with fork badges and sort controls

### Phase 3
- [ ] Embed token generation works (JWT with rate_limit + expiry)
- [ ] `embed.js` loads and renders chat widget on external HTML page
- [ ] Rate limiting enforced (429 after threshold)
- [ ] "Powered by Orchestra" footer links to public agent page

### Phase 4
- [ ] Conversation completion triggers trajectory extraction (non-blocking)
- [ ] Trajectories stored in `(user_id, "trajectories", assistant_id)` namespace
- [ ] Manual distillation trigger creates new prompt revision
- [ ] Scheduled distillation job runs daily

### General
- [ ] All previous & new tests pass, validated using `agent-browser` CLI
- [ ] New code follows existing repo/service/route patterns (BaseRepo, ServiceContext)
- [ ] No new dependencies added beyond what's already in the project (or justified in PR description)
- [ ] Related API and user documentation updated in the **wiki repo** (committed there directly, not to the main project repo)

feat: Agent Marketplace with Fork/Remix + Self-Improving Memory Loop #885