Lumen is an AI agent framework for founders, product leaders, and product managers, built on Claude. It runs as a Claude Code plugin and orchestrates 18 specialized AI agents across six PM workflows — from PMF discovery to GTM launch.
The agent designs draw on my learning and experience from more than three decades of building, launching, and scaling over 115 products for my own business and providing product management consulting to leading clients globally.
License: MIT · Built with the Claude Agent Development Kit · Requires Claude Code
The Lumen Claude Code plugin gives PMs a conversational co-pilot for product management that can:
- Score product-market fit by user segment (via PostHog MCP)
- Run structured customer discovery and build opportunity trees
- Design and power-calculate experiments
- Analyze competitive moats and define positioning
- Generate pricing models, roadmaps, and narrative suites
- Conduct GTM readiness audits with staged rollout plans
- Decompose NRR and identify at-risk accounts
- Enforce AI ethics governance checkpoints (DataLayer)
- Persist all decisions, experiments, and outcomes to a Knowledge Graph (Supabase Integration)
All of this happens through natural language commands inside Claude Code — no separate app, no dashboard.
- 6 end-to-end PM workflows — from early validation to scaled growth
- 18 specialized agents — each expert in one PM domain
- Evidence-graded reports — every recommendation rated HIGH / MEDIUM / LOW
- Human-in-the-loop gates — structured approvals before irreversible decisions
- Works without full setup — graceful degradation when MCP servers are absent
Lumen consists of a master orchestrator agent and subagents defined as Markdown files that Claude Code interprets and executes. It strictly follows the Claude Agent SDK, its coding guidelines, and security best practices.
/lumen:pmf-discovery → W1: PMF Discovery
/lumen:pmf-recovery → W2: Churn Crisis Recovery
/lumen:strategy → W3: Quarterly Strategy
/lumen:feature → W4: Feature Validation
/lumen:launch → W5: GTM Launch
/lumen:churn → W6: Churn Analysis
Each command triggers Lumen, the Master Orchestrator, which sequences subagents, enforces tier gates, collects human approvals at decision checkpoints, and aggregates evidence quality into a final report.
Each agent is a `.md` file in `agents/` with:
- YAML frontmatter describing name, model, tier, oversight level, and required MCP servers
- Markdown sections controlling behavior instructions, context slot declarations, output format, and evidence quality logic
Example frontmatter:
```yaml
---
name: signal-monitor
display_name: SignalMonitor
model: claude-haiku-4-5-20251001
layer: intelligence
tier_gate: starter
oversight_level: 0
mcp_servers: [posthog, supabase]
---
```

Agents communicate exclusively through named context slots — 51 canonical slots defined in `schemas/context-slots.md`. Each agent declares which slots it requires (blocks if absent) and which it optionally reads (improves output quality).
Example slot flow for W1:

```
EventIQ        → writes: validated_event_schema
SignalMonitor  → reads:  validated_event_schema
               → writes: pmf_score_by_segment
DiscoveryOS    → reads:  pmf_score_by_segment
               → writes: opportunity_solution_tree
DecideWell     → reads:  opportunity_solution_tree
               → writes: decision_memo
```
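The require/optional contract above can be sketched in a few lines of JavaScript (a minimal illustrative sketch, not Lumen's actual implementation; the function name and agent object shape are hypothetical, while the slot names come from the W1 flow):

```javascript
// Minimal sketch of the context-slot contract (illustrative only):
// an agent blocks when a required slot is absent, and proceeds in a
// degraded mode when an optional slot is absent.
function checkSlots(agent, context) {
  const missingRequired = agent.requires.filter((slot) => !(slot in context));
  if (missingRequired.length > 0) {
    return { status: "BLOCKED", missing: missingRequired };
  }
  const missingOptional = agent.optional.filter((slot) => !(slot in context));
  return { status: "READY", degraded: missingOptional };
}

// Hypothetical declaration mirroring DiscoveryOS in the W1 flow.
const discoveryOS = {
  name: "DiscoveryOS",
  requires: ["pmf_score_by_segment"],
  optional: ["validated_event_schema"],
};

// After SignalMonitor has written its slot:
const context = { pmf_score_by_segment: { team: 58, org: 43 } };
console.log(checkSlots(discoveryOS, context).status); // READY
```

Here DiscoveryOS proceeds because its required slot is present, but reports degraded output quality since the optional `validated_event_schema` slot is missing.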
Lumen runs exclusively in Claude Code — the terminal-based CLI — and will not work in Claude.ai chat, Claude for Teams, or Claude Cowork.
Claude Code is the only Claude runtime that supports all four capabilities Lumen depends on:
| Capability | Claude Code | Claude.ai / Cowork |
|---|---|---|
| Slash commands (`/lumen:setup`, `/lumen:pmf-discovery`) | Yes | No |
| Agent `.md` files read and executed at runtime | Yes | No |
| MCP server connections (PostHog, Supabase, Slack, etc.) | Yes | No |
| Local file system access (`lumen-setup.json`, `.env`) | Yes | No |
Because chat, Teams, and Cowork lack these capabilities, Lumen's orchestration layer cannot sequence agents, MCP integrations cannot connect to your data, oversight gates cannot pause for approvals, and workflows cannot persist results to the Knowledge Graph.
Claude.ai chat and Cowork are conversation interfaces. Claude Code is a programmable agent runtime. Lumen is built for the latter.
- Claude Code installed
- Node.js 18+
- An Anthropic API key
Option A — Claude Code marketplace (recommended):

```
/plugin marketplace add ishwarjha/lumen-product-management
/plugin install lumen@lumen-marketplace
```

Option B — curl installer:

```bash
curl -fsSL https://raw.githubusercontent.com/ishwarjha/lumen-product-management/main/install.sh | bash
```

Then restart Claude Code.
Copy the example config and fill in your project details:

```bash
cp lumen-setup.example.json lumen-setup.json
cp .env.example .env
```

Set your API key in `.env`:

```
ANTHROPIC_API_KEY=your_key_here
```
All other credentials are optional; Lumen degrades gracefully when MCP servers are absent. For best results, configure credentials for the integrations you plan to use.
```
/lumen:pmf-discovery
```

SetupGuide runs automatically on first launch and walks you through connecting MCP servers.
These prompts show what you can do with Lumen at different stages of building a product. Paste them directly into Claude Code after installing the plugin.
You have a hypothesis, some users, and event data. You want to know if you're building the right thing.
```
/lumen:pmf-discovery

Product: Helix — AI-native PM platform for B2B SaaS teams
Segments: Solo (free, individual PMs), Team ($49/seat, product teams 3–15),
          Org (enterprise, custom pricing, 50+ users)
Stage: 2,400 MAU, 18 months since launch, PLG motion
PostHog: connected (90-day cohort)
Key question: Team plan D30 retention dropped from 72% to 61% over 8 weeks.
Is this a PMF regression, a product quality issue, or both?
Constraints:
- Cannot deprecate free Solo tier before Q3
- No engineering capacity for features requiring more than 4 weeks to build in Q2
```
What Lumen does: EventIQ validates your event schema → SignalMonitor scores PMF by segment → DiscoveryOS builds an opportunity tree → MarketIQ maps competitive position → DecideWell recommends which segment to focus on and why.
```
/lumen:feature

Feature: "AI Sprint Advisor" — analyses the team's current sprint backlog and
suggests reprioritization based on OKR alignment, engineering velocity,
and historical completion rates.
Target users: Team plan users (product teams with 3–10 engineers)
Uses AI/ML: Yes — LLM-based suggestion engine reading backlog and OKR data
Current evidence: W1 identified roadmap-to-sprint handoff as the #1 opportunity.
Key question: Should we build this now, run a fake-door test first, or buy
a third-party API? Flag any ethics requirements for the AI component.
Constraint: 4 engineering weeks available if we decide to build.
```
What Lumen does: HypothesisLab designs a power-calculated experiment → DataLayer runs an AI ethics checkpoint (Level 3 gate if flagged) → UXLayer produces an interaction design spec → DecideWell outputs a build/buy/test recommendation with rationale.
You have PMF in a segment. Now you need to grow efficiently — improve activation, reduce churn, convert free users to paid.
```
/lumen:strategy

Product: Helix — B2B SaaS, PLG motion
Quarter: Q3 2026
Current state:
- MAU: 2,400 (Team plan 68%, Org plan 12%, Solo plan 20%)
- NRR: 94% (recovering from Q2 churn event)
- PMF: Team 58/100 · Org 43/100 · Solo 29/100
Engineering capacity: 6 engineers, 2 sprints/month, 78 sprint-points for product work
Known commitments: SOC 2 Type I audit (May 15), 3 customer contract clauses
Board constraint: $2M ARR target by end of Q3
Key question: Should we stay focused on Team plan depth or invest in Org plan breadth?
We cannot do both.
```
What Lumen does: SignalMonitor surfaces activation drop-off signals → NorthStar defines a North Star metric and OKR cascade (Level 2 approval gate) → PriceLogic models a revised pricing tier → RoadMap outputs a committed/likely/exploratory roadmap → NarrativeEngine writes the internal strategy narrative.
```
/lumen:churn

Product: Helix, B2B SaaS
Segment: Team plan ($49/seat)
Symptom: MoM churn jumped from 3.1% to 7.4% over 6 weeks, starting around
the time we rolled out the new roadmap editor (March 1 release).
Data available: PostHog behavioral cohorts, Stripe cancellations, HubSpot tickets.
Previously tried: 2-email re-engagement sequence in week 4 — no measurable impact.
Key question: Is this churn from the roadmap editor release or something else?
Who are we about to lose next?
```
What Lumen does: SignalMonitor identifies churn cohort fingerprint → GrowthIQ decomposes NRR and ranks at-risk accounts → HypothesisLab designs win-back experiments → DecideWell issues a recovery decision memo (Level 2 gate) → OpsCommand drafts a leadership briefing with 30/60/90-day outcome tracking.
You're past early PMF. You're launching to a new market, expanding your platform, or preparing for a Series A.
```
/lumen:launch

Product: Helix — expanding from Team plan (self-serve) to Org plan (enterprise)
Target date: 6 weeks out
New capabilities shipping: Multi-workspace support (50 users), Slack + Jira integration,
SSO (SAML 2.0), SOC 2 Type I compliance
Pricing: $89/seat/month (Org plan) — no impact on existing Team customers
Sales motion: Moving from fully PLG to PLG + sales-assist for Org accounts
Key risk: Sales team has never run a sales-assist motion before.
Ask: Full GTM readiness audit. Flag anything that would cause us to delay.
```
What Lumen does: LaunchPad scores launch readiness across 7 dimensions (narrative, pricing, ops, data, legal, support, sales) → NarrativeEngine writes positioning, one-liner, and sales messaging for each audience → PriceLogic models the Org tier → DataLayer checks SOC 2 compliance posture → OpsCommand builds a launch execution plan with Day 1/7/30 checkpoints.
```
/lumen:strategy

Product: Helix — Series A closed ($8M), team growing from 4 to 12 engineers
Context: Core product is PM platform for B2B SaaS teams. Considering opening a
developer API so product teams can embed Helix data into their own tools.
Connected MCPs: PostHog, Supabase, Stripe, HubSpot
Question: Should we invest in a developer platform this year? If yes, what does the
MVP API look like, what's the right go-to-market, and what are the risks?
```
What Lumen does: EcosystemOS (Enterprise) designs the API product spec → MarketIQ maps the developer platform competitive landscape → DecideWell structures a build-vs-partner decision with consensus mode → NorthStar adjusts OKRs to reflect the platform bet → RoadMap allocates capacity across core product and platform tracks.
| ID | Name | Key Output | PostHog Required? |
|---|---|---|---|
| W1 | PMF Discovery | PMF scores by segment, opportunity tree, competitive position, recovery roadmap | Yes (PARTIAL mode without it) |
| W2 | PMF Recovery / Churn Crisis | Crisis severity, recovery decision, win-back campaigns, leadership briefing | No |
| W3 | Quarterly Strategy | Quarterly roadmap, pricing model, narrative suite, operating rhythm | No |
| W4 | Feature Validation | Experiment brief, accessibility audit, AI ethics clearance, build/buy/test decision | No |
| W5 | GTM Launch | Launch readiness score, narrative suite, launch execution plan, post-launch monitoring | No |
| W6 | Churn Analysis | NRR decomposition, at-risk accounts, win-back strategy, 30/60/90-day tracking | No |
W1 partial mode: If PostHog is not connected, SignalMonitor is skipped and PMF scoring is unavailable. EventIQ and DiscoveryOS still run with LOW evidence quality.
Lumen has 18 agents across four subscription tiers.
**Lumen (Master Orchestrator).** The entry point for all Lumen workflows. Sequences sub-agents in the correct order, enforces tier gates, pauses for human approvals at decision checkpoints, aggregates evidence quality ratings across all agents, and compiles the final graded PM report.

**SetupGuide.** Runs automatically on first launch. Validates which MCP servers are connected, identifies missing credentials, and configures the workspace for the selected workflow. Ensures Lumen starts every session with an accurate picture of available data sources.

**EventIQ.** Validates your product event taxonomy against the selected business model. Checks for missing events, naming inconsistencies, and GDPR/CCPA/HIPAA compliance flags. Outputs a verified event schema that all downstream agents rely on for behavioral analysis.

**SignalMonitor.** Scores product-market fit by user segment on a 0–100 scale using PostHog behavioral data. Detects churn signals, crisis severity, and activation drop-off patterns. Outputs PMF scores and a churn cohort fingerprint consumed by DiscoveryOS and GrowthIQ.

**DiscoveryOS.** Synthesizes customer interview data and PMF signals into an opportunity solution tree. Maps user jobs, pains, and gains to concrete product opportunities. Outputs a structured OST that DecideWell and RoadMap use to prioritize the next phase of development.

**NorthStar.** Defines the single North Star metric that best reflects customer value for the current business model. Cascades it into 3–4 team-level OKRs, flags cross-team conflicts, and outputs the metric formula map used by HypothesisLab and RoadMap downstream.

**HypothesisLab.** Designs statistically rigorous experiments to validate product bets. Writes a full hypothesis brief, calculates required sample size and minimum detectable effect, and estimates time-to-significance. Outputs an experiment brief ready for engineering scoping.

**MarketIQ.** Maps the competitive landscape across moat dimensions — switching costs, network effects, data advantages, and brand. Assesses your current position relative to alternatives and outputs a competitive moat map that informs NarrativeEngine and PriceLogic decisions.

**NarrativeEngine.** Translates PMF signals and competitive positioning into a complete narrative suite: product one-liner, category definition, key claims, proof points, and audience messaging matrix. Outputs copy-ready messaging for sales, marketing, and leadership communications.

**PriceLogic.** Designs or validates the pricing model by analyzing segment willingness to pay, competitor pricing, and NRR impact. Models tier structures and revenue projections across scenarios. Requires Level 1 advisory review before the pricing model is written downstream.

**RoadMap.** Translates opportunity trees, capacity data, and strategic decisions into a prioritized quarterly roadmap. Buckets work into committed, likely, and exploratory tracks. Flags customer commitment conflicts and capacity overruns. Requires Level 1 advisory review.

**LaunchPad.** Audits GTM readiness across seven dimensions: narrative, pricing, support, data, legal, sales motion, and operations. Scores each dimension and outputs a staged rollout plan. Runs a final close-out gate after launch day to confirm all systems are stable.

**GrowthIQ.** Decomposes net revenue retention into expansion, contraction, and churn components. Ranks at-risk accounts by health score, surfaces PQL upgrade signals for PLG products, and designs targeted win-back playbooks. Requires Level 1 advisory review for interventions.

**OpsCommand.** Builds the operating rhythm to keep the team aligned on strategy. Generates executive updates, board briefing structures, and weekly PM rituals calibrated to company stage. Escalates to Level 2 approval for board-level communications or crisis declarations.

**DecideWell.** Structures complex product decisions using a multi-option framework with evidence weighting. Always requires Level 2 PM approval before writing a decision memo. In Consensus Mode, solicits structured input from multiple stakeholders before issuing a recommendation.

**UXLayer.** Designs AI-native interaction patterns for product features. Produces wireframe-level interaction specs, flags WCAG accessibility violations, and recommends progressive disclosure patterns for complex AI outputs. Requires Level 1 advisory review before design handoff.

**DataLayer.** Governs AI ethics and regulatory compliance. Issues GRANTED, BLOCKED, or CONDITIONAL ethics clearance for AI/ML features. Requires Level 3 governance approval from a named authority — the PM cannot self-approve. Monitors for data bias and model decay.

**EcosystemOS.** Designs the developer platform strategy and API product spec. Maps time-to-first-call, SDK surface area, and partner integration architecture. Requires Level 2 approval for breaking API changes or platform policy decisions affecting external developers.
Model assignment: Haiku for high-frequency Starter workers (cost-efficient). Sonnet for all Growth/Scale/Enterprise agents requiring complex reasoning.
Lumen enforces four levels of human oversight, increasing in strictness:
| Level | Name | Behavior | Agents |
|---|---|---|---|
| 0 | Automated | No PM interaction required | EventIQ, SignalMonitor, DiscoveryOS |
| 1 | Advisory | PM can override within 24h with reason | PriceLogic, RoadMap, GrowthIQ, LaunchPad, UXLayer, OpsCommand |
| 2 | Approval | Orchestrator pauses, PM must type APPROVE / DECLINE / MODIFY (48h timeout) | DecideWell (always), EcosystemOS, OpsCommand (board/crisis) |
| 3 | Governance | Named authority must respond, PM cannot self-approve (72h timeout) | DataLayer exclusively |
Level 2 gate example:

```
[LUMEN] Approval Required · Level 2
Workflow: W1 PMF Discovery
Decision: Confirm North Star metric as "Weekly Active Accounts"?
Evidence Quality: HIGH — PostHog data + 8 quarters historical
Options:
  APPROVE — Write north_star_definition; proceed to RoadMap
  DECLINE — Request alternative metric
  MODIFY: [constraint] — Re-run with PM-supplied constraints
```
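The gate semantics can be illustrated with a small sketch (hypothetical code, not Lumen's implementation; only the APPROVE/DECLINE/MODIFY verbs and the 48h/72h timeouts come from the oversight table above):

```javascript
// Illustrative gate resolution (not Lumen's actual code). Level 2
// gates time out after 48h and Level 3 after 72h, per the oversight
// table; a MODIFY response carries a PM-supplied constraint.
const TIMEOUT_HOURS = { 2: 48, 3: 72 };

function resolveGate(level, response, hoursElapsed) {
  if (hoursElapsed > TIMEOUT_HOURS[level]) {
    return { outcome: "EXPIRED", action: "halt workflow and notify the PM" };
  }
  const [verb, ...rest] = response.trim().split(/\s+/);
  switch (verb.toUpperCase()) {
    case "APPROVE":
      return { outcome: "APPROVED", action: "write the slot and continue" };
    case "DECLINE":
      return { outcome: "DECLINED", action: "request an alternative" };
    case "MODIFY":
      return { outcome: "MODIFIED", constraint: rest.join(" ") };
    default:
      return { outcome: "INVALID", action: "re-prompt the PM" };
  }
}

console.log(resolveGate(2, "MODIFY exclude Solo tier", 3).constraint);
// exclude Solo tier
```

The key design point the real gates share: an unanswered or malformed response never silently proceeds; the workflow either halts or re-prompts.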
Lumen connects to 10 MCP servers. All are optional except PostHog for PMF scoring.
| MCP Server | Purpose | Required for |
|---|---|---|
| PostHog | Analytics, PMF scoring, signal detection | W1 PMF scoring (PARTIAL without) |
| Supabase | Knowledge Graph persistence, operational state | KG writes (silently skipped without) |
| Upstash | Redis session context survival across restarts | Session persistence |
| Slack | Report delivery, approval notifications | Out-of-session reports |
| HubSpot | CRM data, churn signals | Growth tier: SignalMonitor, GrowthIQ |
| Stripe | Billing, NRR decomposition | Growth tier: PriceLogic, GrowthIQ |
| Figma | Design specs, UX reference | Scale tier: UXLayer |
| GitHub | Developer activity, PR velocity | Enterprise: EcosystemOS |
| Sentry | Error monitoring, crisis severity | OpsCommand crisis detection |
| Postman | API test coverage | Enterprise: EcosystemOS |
Graceful degradation: When a server is unavailable, agents skip MCP-dependent steps, lower their evidence quality rating, and continue. The workflow always completes — it just tells you what data was missing.
- Supabase — enables Knowledge Graph + persistent state
- Upstash — session context survives restarts
- Slack — report delivery + approval notifications
- PostHog — W1 PMF scoring (required for full W1)
- HubSpot + Stripe — Growth tier signals
- Figma — Scale tier UX design
- GitHub + Sentry + Postman — Enterprise tier
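Graceful degradation can be pictured as a simple guard around each MCP-dependent step (an illustrative sketch; the step shapes and function name are hypothetical, but the skip behavior matches the W1 partial mode described earlier):

```javascript
// Illustrative sketch of graceful degradation: a step that needs an
// unavailable MCP server is skipped and marked UNAVAILABLE instead of
// failing the workflow. Step shapes are hypothetical.
function runStep(step, connectedServers) {
  if (step.mcp && !connectedServers.includes(step.mcp)) {
    return {
      step: step.name,
      status: "SKIPPED",
      evidence: "UNAVAILABLE",
      note: `missing MCP server: ${step.mcp}`,
    };
  }
  return { step: step.name, status: "RAN" };
}

// W1 with PostHog absent: SignalMonitor is skipped, the rest still run.
const w1Steps = [
  { name: "EventIQ", mcp: null },
  { name: "SignalMonitor", mcp: "posthog" },
  { name: "DiscoveryOS", mcp: null },
];
const results = w1Steps.map((s) => runStep(s, ["supabase"]));
console.log(results.map((r) => `${r.step}: ${r.status}`).join(", "));
// EventIQ: RAN, SignalMonitor: SKIPPED, DiscoveryOS: RAN
```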
Every agent writes an evidence_quality slot (HIGH / MEDIUM / LOW). The Orchestrator aggregates across all agents — lowest primary agent rating wins — and annotates every section of the final report with the evidence quality that backed it.
| Primary Agents | Secondary Agents | Overall |
|---|---|---|
| HIGH | HIGH | HIGH |
| HIGH | MEDIUM | MEDIUM |
| HIGH | LOW | MEDIUM (flagged) |
| MEDIUM | any | MEDIUM |
| LOW | any | LOW |
Skipped agents (e.g., SignalMonitor without PostHog) are excluded from aggregation and marked UNAVAILABLE — they don't count as LOW.
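The aggregation table reduces to a small function (a sketch of the stated rules, not Lumen's code):

```javascript
// Sketch of the evidence aggregation rules above (illustrative).
// Skipped agents (UNAVAILABLE) are excluded; the lowest primary rating
// caps the overall grade; weak secondary evidence pulls HIGH down to
// MEDIUM, flagged when the secondary evidence is LOW.
const RANK = { LOW: 0, MEDIUM: 1, HIGH: 2 };
const worst = (ratings) =>
  ratings
    .filter((q) => q !== "UNAVAILABLE")
    .reduce((a, b) => (RANK[a] <= RANK[b] ? a : b), "HIGH");

function aggregate(primary, secondary) {
  const worstPrimary = worst(primary);
  if (worstPrimary !== "HIGH") return { overall: worstPrimary, flagged: false };
  const worstSecondary = worst(secondary);
  if (worstSecondary === "HIGH") return { overall: "HIGH", flagged: false };
  return { overall: "MEDIUM", flagged: worstSecondary === "LOW" };
}

console.log(aggregate(["HIGH", "HIGH"], ["LOW", "UNAVAILABLE"]));
// { overall: 'MEDIUM', flagged: true }
```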
When Supabase is connected, Lumen persists analytical results as a Knowledge Graph with 15 entity types:
UserSegment · Opportunity · Persona · Experiment · Feature · Decision · Metric · MetricFormula · Launch · GovernanceEvent · PricingBand · EventTaxonomy · TechDebtItem · RegulatoryConstraint · OrgNode
Entities connect through edges: validates, moves, informs, addresses, impacts, outcome_confirmed_by, and more.
Every Decision entity gets an outcome_tracking_id for 30/60/90-day review cycles. All tables use Supabase Row Level Security scoped to workspace_id.
```
lumen/
├── agents/                       # 18 agent definition files (Markdown)
│   ├── orchestrator.md           # Master orchestrator — entry point for all workflows
│   ├── signal-monitor.md
│   ├── discovery-os.md
│   └── ...                       # One file per agent
├── schemas/
│   ├── context-slots.md          # 51 canonical context slots
│   ├── tier-gates.md             # Starter / Growth / Scale / Enterprise gates
│   └── workflow-definitions.md   # Supabase schema
├── skills/
│   ├── agent-contracts/          # Context slot protocol (required vs optional)
│   ├── knowledge-graph/          # 15 KG entity types
│   ├── oversight-protocol/       # Level 0–3 gate format
│   └── workflow-sequences/       # Step-by-step sequences for W1–W6
├── commands/
│   ├── pmf-discovery.md          # /lumen:pmf-discovery
│   ├── pmf-recovery.md           # /lumen:pmf-recovery
│   ├── strategy.md               # /lumen:strategy
│   ├── feature.md                # /lumen:feature
│   ├── launch.md                 # /lumen:launch
│   ├── churn.md                  # /lumen:churn
│   ├── setup.md                  # /lumen:setup
│   ├── status.md                 # /lumen:status
│   └── approve.md                # /lumen:approve
├── mcp-servers/                  # MCP server configurations (10 servers)
├── scripts/
│   ├── validate_agents.js        # CI: static validation of all agent files
│   ├── simulate_orchestrator.js  # Smoke test: W1 against fixture data
│   ├── run_with_claude.js        # Live API test (requires ANTHROPIC_API_KEY)
│   └── list_anthropic_models.js
├── tests/fixtures/               # Test data for smoke tests
├── lumen-setup.example.json      # Config template → copy to lumen-setup.json
├── .env.example                  # Environment variable template
└── package.json                  # Dependencies: js-yaml, glob, @anthropic-ai/sdk
```
```bash
npm install       # Install dependencies
npm run validate  # Static validation of all agents/*.md
npm run simulate  # Smoke test: W1 orchestrator against fixture data
```

Validation checks:
- Required YAML frontmatter fields: `name`, `display_name`, `model`, `layer`, `tier_gate`, `oversight_level`, `mcp_servers`
- Required Markdown sections: What I do, Context I read, Context I write, Decision logic, Evidence quality, Output format, and more
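The frontmatter check can be approximated in a few lines (a simplified sketch, not the real `validate_agents.js`, which presumably parses full YAML via js-yaml; this version only reads flat `key: value` lines):

```javascript
// Simplified sketch of the frontmatter field check. A real validator
// would parse the YAML properly; this minimal parser only extracts the
// key of each flat "key: value" line, for illustration.
const REQUIRED = ["name", "display_name", "model", "layer",
                  "tier_gate", "oversight_level", "mcp_servers"];

function validateFrontmatter(md) {
  const m = md.match(/^---\n([\s\S]*?)\n---/);
  if (!m) return { ok: false, errors: ["missing frontmatter block"] };
  const keys = m[1].split("\n")
    .map((line) => line.split(":")[0].trim())
    .filter(Boolean);
  const missing = REQUIRED.filter((k) => !keys.includes(k));
  return { ok: missing.length === 0, errors: missing.map((k) => `missing field: ${k}`) };
}

const agentMd = `---
name: signal-monitor
display_name: SignalMonitor
model: claude-haiku-4-5-20251001
layer: intelligence
tier_gate: starter
oversight_level: 0
mcp_servers: [posthog, supabase]
---
# What I do
...`;

console.log(validateFrontmatter(agentMd).ok); // true
```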
Adding a new agent:
- Create `agents/your-agent.md` with the required frontmatter and sections
- Register the context slots it reads/writes in `schemas/context-slots.md`
- Add it to the relevant workflow sequence in `skills/workflow-sequences/SKILL.md`
- Reference it in `agents/orchestrator.md` under the appropriate workflow
- Run `npm run validate` — all agents must pass
`lumen-setup.json`:

```json
{
  "project": {
    "name": "Your Product",
    "business_model_type": "B2B SaaS",
    "regulatory_jurisdiction": "GDPR"
  },
  "mcp_connections": {
    "posthog": { "enabled": false, "host": "https://app.posthog.com" },
    "supabase": { "enabled": false, "url": "" },
    "upstash": { "enabled": false, "url": "" },
    "slack": { "enabled": false, "ops_channel": "" },
    "hubspot": { "enabled": false },
    "stripe": { "enabled": false },
    "figma": { "enabled": false },
    "github": { "enabled": false },
    "sentry": { "enabled": false },
    "postman": { "enabled": false }
  }
}
```

`.env`:

```
ANTHROPIC_API_KEY=your_key

# Required for W1 PMF scoring
POSTHOG_PROJECT_API_KEY=phc_xxx

# Optional — enables Knowledge Graph persistence
SUPABASE_URL=https://xxx.supabase.co
SUPABASE_ANON_KEY=xxx

# Optional — session survival across restarts
UPSTASH_REDIS_REST_URL=xxx
UPSTASH_REDIS_REST_TOKEN=xxx

# Optional — report delivery + approvals
SLACK_BOT_TOKEN=xoxb-xxx
```
Never commit `.env` or `lumen-setup.json` — both are already listed in `.gitignore`. Add any other local files you create later to `.gitignore` as well.
| Command | Workflow | Description |
|---|---|---|
| `/lumen:setup` | — | Run MCP validation and project configuration |
| `/lumen:status` | — | Show current workflow state, connected MCPs, tier |
| `/lumen:pmf-discovery` | W1 | Full PMF discovery (PostHog recommended) |
| `/lumen:pmf-recovery` | W2 | Churn crisis analysis and recovery planning |
| `/lumen:strategy` | W3 | Quarterly strategy, roadmap, and narrative |
| `/lumen:feature` | W4 | Feature validation with ethics checkpoint |
| `/lumen:launch` | W5 | GTM readiness audit and launch execution plan |
| `/lumen:churn` | W6 | NRR decomposition and win-back strategy |
| `/lumen:approve` | — | Respond to a Level 2/3 approval gate |
| | Starter | Growth | Scale | Enterprise |
|---|---|---|---|---|
| Price | Free | $49/mo | $149/mo | Custom |
| Agents | 7 | 14 | 17 | 18 |
| PMF scoring | Weekly | Real-time | Real-time | Real-time |
| Experiments | 2 active | Unlimited | Unlimited | Unlimited |
| Consensus mode | — | — | ✓ | ✓ |
| AI ethics gate | — | — | ✓ | ✓ |
| KG persistence | Optional | ✓ | ✓ | ✓ |
| Slack delivery | — | ✓ | ✓ | ✓ |
| Developer platform | — | — | — | ✓ |
MIT — see LICENSE.
Contributions welcome. To add an agent, fix agent behavior, or improve workflow sequences:
- Fork the repo
- Create a branch: `git checkout -b feat/your-agent`
- Make your changes (agents are just Markdown — no compilation needed)
- Run `npm run validate && npm run simulate`
- Open a pull request
For substantial changes, open an issue first to discuss the approach.
Built following recommended best practices for AI agents and plugins, and strictly adhering to Claude Agent SDK guidelines. No tracking is enabled. Agent orchestration runs entirely within Claude Code.