Memee — Institutional Memory for AI Agent Companies

Your agents forget. Memee doesn't.

Cross-project, cross-model organizational memory. Patterns learned in one project spread to others. Mistakes are recorded and prevented org-wide. Knowledge matures through confidence scoring. Smart routing delivers only relevant knowledge — not a dump.

Architecture

┌─────────────── AGENTS (Claude, GPT, Gemini, Llama) ───────────────┐
│  Session hook / MCP tools / CLI / REST API                         │
├───────────────────────────────────────────────────────────────────┤
│                         MEMEE ENGINE                               │
│                                                                    │
│  PUSH (knowledge → agent):          PULL (agent → knowledge):     │
│    router.py    — smart briefing      search.py  — hybrid BM25+vec│
│    briefing.py  — AGENTS.md / CLAUDE.md inject    review.py  — git diff check │
│    feedback.py  — post-task loop      predictive — AP scan        │
│    hooks_config — settings.json wire  citations  — memee why/cite │
│                                                                    │
│  QUALITY:                           LEARNING:                     │
│    quality_gate — validate+dedup      confidence — adaptive scoring│
│    plugins.py   — memee-team hooks    lifecycle  — aging+promote  │
│    models.py    — model family detect dream.py   — nightly process│
│    reranker.py  — cross-encoder       impact.py  — real ROI track │
│                                                                    │
│  GROWTH:                            DELIVERY:                     │
│    propagation  — cross-project push  packs.py   — .memee export  │
│    inheritance  — onboard from similar adapters/cmam — CMAM bridge│
│    benchmarks   — OrgMemEval scoring  packs_format — file helpers │
│                                                                    │
├───────────────────────────────────────────────────────────────────┤
│  SQLite + FTS5 + Embeddings (384-dim) | ~/.memee/memee.db          │
└───────────────────────────────────────────────────────────────────┘

Stack: Python 3.11+ · SQLAlchemy 2.0 · SQLite · Click · FastMCP · FastAPI · sentence-transformers

Quick Commands

# Setup
pip install -e ".[dev]"
memee setup                       # Interactive wizard
memee doctor                      # Health check + auto-configure AI tools

# Smart briefing (PUSH — tells agent what it needs)
memee brief --task "write tests"  # Token-budgeted, task-routed
memee inject --project .          # Write org knowledge into CLAUDE.md

# Record knowledge
memee record pattern "title" --tags python,api -c "content with WHY and WHEN"
memee warn "title" --trigger "when" --consequence "what" --severity high
memee decide "X" --over "Y,Z" --reason "why"

# Search + check (PULL — agent asks)
memee search "query"
memee check "what I'm about to do"
memee suggest --context "my current task"

# Intelligence
memee propagate                   # Push patterns cross-project
memee dream                       # Nightly: connect, find contradictions, promote
memee review -                    # Pipe git diff for institutional review
memee why "<code>"                # Canon that would have prevented or explained it
memee cite <hash> [--confirm]     # Resolve a [mem:abc12345] to lineage
memee embed                       # Generate vector embeddings

# Memory packs (new in v2.0.0)
memee pack export --canon-only > my-team.memee
memee pack install python-web    # Seed pack from packs/seed/
memee pack install <FILE|--from-url URL>
memee pack list

# Analytics
memee status                      # Terminal dashboard (web dashboard removed in v2)
memee benchmark                   # OrgMemEval: 92.2/100
memee demo --weeks 52             # Generate demo data

# CMAM bridge — push canon to Claude Managed Agents Memory
memee cmam sync                   # Canon + critical APs → /mnt/memory/ layout
memee cmam sync --dry-run         # Preview without writing
memee cmam status                 # Store size, count, headroom

Engine Modules (15) + Adapters

Module	Purpose	Impact
`confidence.py`	Adaptive scoring + maturity lifecycle	Core
`search.py`	Hybrid BM25 + vector + tag-graph + reranker	Core
`lifecycle.py`	Aging, auto-archive 60d, invalidation ratio deprecation	Core
`quality_gate.py`	Validate + dedup + source classify + quality score	Core
`router.py`	Smart task-aware briefing, 500 token budget, query expansion	Core
`briefing.py`	CLAUDE.md injection, pre-task briefing generation	PUSH
`feedback.py`	Post-task review, teaching effectiveness tracking	PUSH
`citations.py`	`memee why` + `memee cite` + briefing footer	PUSH (v2.0)
`propagation.py`	Cross-project auto-push + expanded tag inference	+68.8% IQ
`predictive.py`	Anti-pattern push (critical → ALL projects)	+36.6% IQ
`dream.py`	Nightly: propagate + connect + contradictions + promote	+24.8% IQ
`review.py`	Git diff scan vs anti-pattern + pattern DB	+11.4% IQ
`inheritance.py`	Stack+tag similarity, new project onboarding	+9.5% IQ
`embeddings.py`	sentence-transformers all-MiniLM-L6-v2 (384-dim)	Search
`reranker.py`	Cross-encoder rerank, default-on when HF cache warm	+0.0355 nDCG
`models.py`	Model family detection (8 families), diversity bonus	Multi-model
`impact.py`	Measurable ROI: time saved, iterations saved, mistakes avoided	Measurement
`plugins.py`	Hook registry for `memee-team` plugin	Extension point
`telemetry.py`	Retrieval event log (hit@1, hit@3, acceptance rate)	Quality metrics
`packs.py`	`.memee` pack export/install/verify	Distribution (v2.0)
`hooks_config.py` (root)	settings.json hook installation	DX (v2.0)
`packs_format.py` (root)	`.memee` file-level helpers (TOML/JSONL/sign)	Distribution (v2.0)
`adapters/cmam.py`	Claude Managed Agents Memory bridge (canon → CMAM)	Delivery

Removed in v2.0.0: research.py (autoresearch engine, 641 LOC), canon_ledger.py (418 LOC), evidence.py (133 LOC), tokens.py (279 LOC) — three substrate modules with zero production callers. The web dashboard at port 7878 (api/routes/dashboard.py, 556 LOC) went too. ~2,400 LOC fewer; the OSS pitch ends with "no dashboards, no copilots, no magic" and the codebase agrees now.

Note: scoping.py (personal → team → org, promotion rules, onboarding) used to live here. It has been extracted to the proprietary memee-team package alongside User/Team models, SSO, audit log, and licence verification. OSS memee is a single-user product; multi-user features live in memee-team.

Confidence Scoring

New memory: 0.5 (max uncertainty)

Validation bonuses (stackable):
  Same project, same model:        ×1.0 (base 0.08)
  Same project, different model:   ×1.3 (model diversity)
  Different project, same model:   ×1.5 (cross-project)
  Different project + model:       ×1.95 (combined max)

Invalidation: -0.12 × current (no model bonus)
Uncertainty:  1 / sqrt(evidence + 1)

Maturity: hypothesis → tested (1 app) → validated (0.7, 3 proj) → canon (0.85, 5 proj, 10 val)
Auto-deprecate: conf < 0.2 after 3 apps, OR invalidation ratio > 60%, OR unused 60 days

Source multiplier: human ×1.2, llm ×0.8, import ×0.6

Smart Router (PUSH)

NOT: dump 500 patterns into CLAUDE.md (14,550 tokens, $27K/year)
BUT: route 5-7 relevant ones per task (500 tokens, $1.1K/year = 96% savings)

Layer 0: CRITICAL anti-patterns (always, ~100 tokens)
Layer 1: Search-routed by task description (BM25+vector, ~300 tokens)
Footer:  Token count + search hint (~50 tokens)

"write unit tests" → testing + security patterns
"optimize database" → pooling + indexing + N+1 patterns
"SEO meta tags" → SEO + content optimization patterns
"GDPR audit" → compliance + consent + data deletion

60+ query expansion patterns across engineering, marketing, product,
design, data, operations. No hardcoded domains — search-based routing.

Quality Gate

Pipeline: validate → dedup → source classify → quality score

Validate:  title ≥10 chars, content ≥15 chars, content ≠ title, ≥1 tag,
           rejects TODOs/meeting notes/garbage
Dedup:     SequenceMatcher > 85% → merge into existing
Source:    human ×1.2, llm ×0.8, import ×0.6
Quality:   heuristic 1-5 (title, content depth, WHY/WHEN context, tags, actionability)
           team/org scope: quality < 2.5 = flagged

Packages — OSS ↔ paid split

Memee ships as two packages, with clear licence separation:

Package	Licence	What it adds
`memee` (this repo)	MIT	Full single-user product: every engine module, MCP server, CLI, CMAM adapter. No users, no teams, no scope enforcement.
`memee-team` (private repo, licence-gated)	Proprietary (EULA)	`User` + `Team` SQLAlchemy models, `scoping.py` engine (personal → team → org promotion), SSO (SAML/OIDC), audit log export, RBAC, licence key verification.

memee-team plugs into OSS via memee.plugins hooks (current_user_id, visible_memories, promote, can_promote, on_record). Without it installed, OSS runs as single-user and promotion raises LicenseRequiredError with an upgrade message.

Pricing (honoured on memee.eu):

Free / OSS (MIT):       $0 forever, single user, every AI feature
Team (EULA):            $49 / month flat, up to 15 seats, annual
                         + multi-user scope + SSO + audit + Postgres
Enterprise:             from $12k / year, unlimited seats, SOC 2,
                         air-gap, SLA, custom MSA

Pricing model reflects "Memee is memory, not model" — flat per-team (like Supabase, Vercel, Plausible), not per-seat (like Copilot, Cursor). Value scales sublinearly with headcount: one canon serves the whole team.

MCP Tools (19)

Core: memory_record, memory_search, search_feedback, memory_suggest, memory_validate, memory_invalidate, decision_record, antipattern_record, antipattern_check

Intelligence: propagate_patterns, predict_warnings, inherit_knowledge, run_dream, review_code, get_briefing, post_task_feedback

Analytics: learning_status, canon_list

Delivery: sync_to_cmam (push canon to Claude Managed Agents Memory)

(The five research_* tools that lived here through v1.x were removed in v2.0.0 along with the autoresearch engine.)

CMAM Bridge (Claude Managed Agents Memory)

Anthropic's managed memory is a filesystem-style store at /mnt/memory/ inside a Claude agent container. It's a dumb store — Memee stays the brain.

Memee (multi-model intelligence):        CMAM (Claude-native delivery):
  confidence + maturity                    /canon/patterns/<slug>.md
  quality gate + dedup                     /canon/lessons/<slug>.md
  cross-project propagation         ──→    /warnings/critical/<slug>.md
  token-budgeted routing                   /warnings/high/<slug>.md
  multi-model validation                   /decisions/<slug>.md
                                           /_index.md

Sync triggers: CANON maturity OR critical anti-pattern (severity=critical propagates regardless of maturity). Secrets auto-redacted. Content >100 KB auto-chunked into .part-N.md. Store caps enforced (80 MB soft, 95 MB hard, 1600/1900 count thresholds).

memee cmam sync --backend fs --local-root ~/.memee/cmam/my-store
memee cmam sync --backend api --store-id my-org          # needs ANTHROPIC_API_KEY
memee cmam sync --dry-run
memee cmam status

MCP tool sync_to_cmam lets agents trigger the push themselves.

Benchmarks

OrgMemEval v1.0: 81.2 / 88 = 92.3 % (competitors: ~2 %)

Propagation 100% | Avoidance 100% | Maturity 89% | Onboarding 100%
Recovery 100% | Calibration 83% | Synthesis 82%
(Research scenario removed in v2.0.0 with the autoresearch engine; ceiling moved 100 → 88, headline pct unchanged)

Competitive: Memee 6.5 | Mem0 3.5 | Zep 2.3 | Letta 1.3 | MemPalace 0.9

Performance: 11K inserts/s | 7.6ms BM25 | 113ms hybrid search | 10K conf updates/s

Impact (A/B test, 7 tasks):

Time: 1470min → 430min (-71%)
Iterations: 43 → 15 (-65%)
Mistakes: 14 → 0 (100% prevented)
Quality: 56% → 93% (+36pp)
ROI: 10.7x

GigaCorp (18 months, 100 agents, 200 projects):

Incidents: 12/mo → 3/mo (75% reduction)
Token savings: 501M tokens/year ($3,911)
Total ROI: 7x ($16,268 saved / $2,388 cost)

Key Files

File	Purpose
`src/memee/cli.py`	25+ Click commands (incl. `cmam sync`/`cmam status`)
`src/memee/mcp_server.py`	19 MCP tools
`src/memee/adapters/cmam.py`	Claude Managed Agents Memory bridge
`src/memee/storage/models.py`	15 SQLAlchemy models
`src/memee/storage/database.py`	DB init, FTS5, WAL mode

Tests

pytest tests/ -v   # 201 tests, ~67s

Simulation tests: test_company_simulation (NovaTech 6mo), test_enterprise (TechCorp 52wk), test_megacorp (100 proj, hallucination defense), test_gigacorp (200 proj, 18 months), test_benchmarks (competitive), test_blind_spots (14 failure modes), test_real_impact (A/B with/without), test_perf_simulation (9 scenarios)

Project Stats

33 commits on feat/initial-setup
63 Python files, 18,899 lines of code
201 tests passing
16 engine modules + CMAM adapter, 19 MCP tools, 12+ API endpoints (GET-only dashboard API)
MIT licence (OSS memee), proprietary EULA for memee-team

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Memee — Institutional Memory for AI Agent Companies

Architecture

Quick Commands

Engine Modules (15) + Adapters

Confidence Scoring

Smart Router (PUSH)

Quality Gate

Packages — OSS ↔ paid split

MCP Tools (19)

CMAM Bridge (Claude Managed Agents Memory)

Benchmarks

Key Files

Tests

Project Stats

FilesExpand file tree

AGENTS.md

Latest commit

History

AGENTS.md

File metadata and controls

Memee — Institutional Memory for AI Agent Companies

Architecture

Quick Commands

Engine Modules (15) + Adapters

Confidence Scoring

Smart Router (PUSH)

Quality Gate

Packages — OSS ↔ paid split

MCP Tools (19)

CMAM Bridge (Claude Managed Agents Memory)

Benchmarks

Key Files

Tests

Project Stats