Plast Mem is an experimental LLM memory layer for cyber waifu. The project is not yet stable, and documentation is limited.
When working on Plast Mem, follow this decision tree to navigate the codebase and make changes efficiently:
First, understand what type of change you're making:
- Is it a new feature? → Check docs/CHANGE_GUIDE.md for similar patterns
- Is it a refactor? → Check docs/ARCHITECTURE.md for design principles
- Is it a bug fix? → Read relevant crate README.md files
Before making changes, trace the impact:
Dependency flow pattern:

    API endpoint → Server handler → Core service → Entity/DB
         ↑               ↑               ↑
        HTTP            DTOs       Business Logic

Steps:
- Read the crate's README.md to understand responsibilities
- Check docs/ARCHITECTURE.md for layer dependencies
- Find all callers with `grep -r "fn_name" crates/`
- Check trait implementations in `plastmem_core/src/`
- Verify DB schema in `plastmem_entities/src/`
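
The dependency flow above can be illustrated with a minimal, std-only sketch. Every name here (`AddMessageRequest`, `MessageStore`, `InMemoryStore`, `push_message`, `handle_add_message`) is hypothetical and does not mirror Plast Mem's actual types; a plain trait and a `Vec` stand in for the HTTP framework and the Sea-ORM entity layer.

```rust
// Hypothetical layering sketch: HTTP DTO -> handler -> core service -> storage.
// None of these names come from the Plast Mem codebase.

/// DTO as it might arrive at the API boundary.
pub struct AddMessageRequest {
    pub conversation_id: u64,
    pub content: String,
}

/// Storage boundary (stands in for the entity/DB layer).
pub trait MessageStore {
    /// Persists a message and returns how many are stored for the conversation.
    fn insert(&mut self, conversation_id: u64, content: String) -> usize;
}

/// In-memory store, useful for tests.
#[derive(Default)]
pub struct InMemoryStore {
    rows: Vec<(u64, String)>,
}

impl MessageStore for InMemoryStore {
    fn insert(&mut self, conversation_id: u64, content: String) -> usize {
        self.rows.push((conversation_id, content));
        self.rows
            .iter()
            .filter(|(id, _)| *id == conversation_id)
            .count()
    }
}

/// Core service: business rules live here, independent of HTTP.
pub fn push_message<S: MessageStore>(
    store: &mut S,
    conversation_id: u64,
    content: &str,
) -> Result<usize, String> {
    let trimmed = content.trim();
    if trimmed.is_empty() {
        return Err("message content must not be empty".to_string());
    }
    Ok(store.insert(conversation_id, trimmed.to_string()))
}

/// Handler: translates the DTO into a core call and maps the result to a status code.
pub fn handle_add_message<S: MessageStore>(store: &mut S, req: AddMessageRequest) -> u16 {
    match push_message(store, req.conversation_id, &req.content) {
        Ok(_) => 200,
        Err(_) => 400,
    }
}
```

Because the core function depends only on the `MessageStore` trait, a caller search for `push_message` would surface both the handler and any worker code that reuses it, which is the kind of impact the steps above ask you to trace.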
- plastmem: Entry program - initializes tracing, DB, migrations, job storage, spawns worker and server
- plastmem_core: Core domain logic
  - `memory/episodic.rs` - `EpisodicMemory` struct, hybrid retrieval with FSRS re-ranking
  - `memory/semantic.rs` - `SemanticMemory` struct, semantic fact retrieval (BM25 + vector, no FSRS)
  - `memory/retrieval.rs` - shared markdown formatting (`format_tool_result`, `DetailLevel`)
  - `message_queue.rs` - `MessageQueue` struct, push/drain/get, `PendingReview`, `SegmentationCheck`
- plastmem_migration: Database table migrations
- plastmem_entities: Database table entities (Sea-ORM)
  - `episodic_memory.rs` - episodic memory entity
  - `semantic_memory.rs` - semantic memory entity
  - `message_queue.rs` - message queue entity
- plastmem_ai: AI SDK wrapper - embeddings, cosine similarity, text generation, structured output
- plastmem_shared: Reusable utilities (env, error)
- plastmem_worker: Background tasks worker
  - `event_segmentation.rs` - job dispatch, episode creation, consolidation trigger
  - `memory_review.rs` - LLM-based review and FSRS update
  - `predict_calibrate.rs` - Predict-Calibrate Learning pipeline (episodes → semantic facts)
- plastmem_server: HTTP server and API handlers
  - `api/add_message.rs` - message ingestion
  - `api/recent_memory.rs` - recent memories (raw JSON and markdown)
  - `api/retrieve_memory.rs` - semantic + episodic retrieval (raw JSON and markdown); `context_pre_retrieve` for semantic-only pre-LLM injection
- Memory creation: `crates/server/src/api/add_message.rs` → `MessageQueue::push` (RETURNING trigger_count) → `check()` (count/time trigger + CAS fence) → `EventSegmentationJob` → `batch_segment()` (single LLM call: title + summary + surprise_level per segment) → drain + finalize → `create_episode_from_segment` (parallel, embed + FSRS init) → `EpisodicMemory` with surprise-based FSRS stability boost
- Predict-Calibrate Learning: after each episode creation → `enqueue_predict_calibrate_jobs` → `PredictCalibrateJob` per episode → load related facts → PREDICT (generate prediction) → CALIBRATE (compare with actual, extract knowledge) → consolidate facts → mark episode consolidated
- Memory retrieval: `crates/server/src/api/retrieve_memory.rs` → parallel: `SemanticMemory::retrieve` (BM25 + vector RRF) + `EpisodicMemory::retrieve` (BM25 + vector RRF × FSRS retrievability) → records pending review in `MessageQueue`
- Pre-retrieval context: `POST /api/v0/context_pre_retrieve` → `SemanticMemory::retrieve` only → returns markdown for system prompt injection; no pending review recorded
- FSRS review update: segmentation triggers `MemoryReviewJob` when pending reviews exist → LLM evaluates relevance (Again/Hard/Good/Easy) → FSRS parameter update in `crates/worker/src/jobs/memory_review.rs`
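
To make the "BM25 + vector RRF × FSRS retrievability" step in the retrieval flow more concrete, here is a rough std-only sketch of Reciprocal Rank Fusion followed by a retrievability weighting. It is not the code in `episodic.rs`: the function names, the `k = 60` constant, and the FSRS-4.5-style forgetting curve are all assumptions made for illustration, and the real implementation may differ in each of them.

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion: each ranked list contributes 1 / (k + rank),
/// with rank starting at 1. `k` is a smoothing constant (60 is a common default).
pub fn rrf_scores(rankings: &[Vec<u64>], k: f64) -> HashMap<u64, f64> {
    let mut scores = HashMap::new();
    for ranking in rankings {
        for (idx, id) in ranking.iter().enumerate() {
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k + (idx as f64 + 1.0));
        }
    }
    scores
}

/// Assumed FSRS-4.5-style power forgetting curve:
/// R = (1 + 19/81 * t / S)^(-0.5), which gives R = 0.9 when t == S.
pub fn retrievability(elapsed_days: f64, stability: f64) -> f64 {
    (1.0 + (19.0 / 81.0) * elapsed_days / stability).powf(-0.5)
}

/// Fuse both rankings, multiply each fused score by retrievability,
/// and sort by descending score (ties broken by id for determinism).
/// `memory_state` maps id -> (elapsed_days, stability); ids without
/// state are left unweighted. This shape is hypothetical.
pub fn rerank(
    bm25: Vec<u64>,
    vector: Vec<u64>,
    memory_state: &HashMap<u64, (f64, f64)>,
) -> Vec<(u64, f64)> {
    let mut ranked: Vec<(u64, f64)> = rrf_scores(&[bm25, vector], 60.0)
        .into_iter()
        .map(|(id, score)| {
            let r = memory_state
                .get(&id)
                .map(|&(t, s)| retrievability(t, s))
                .unwrap_or(1.0);
            (id, score * r)
        })
        .collect();
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    ranked
}
```

Under these assumptions, an episode that ranks first in both channels can still fall below a fresher one if its retrievability has decayed far enough, which is the intended effect of the multiplication. Semantic retrieval would use the fused score alone.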
Load these additional context files when working on specific areas:
- `docs/ARCHITECTURE.md` - System-wide architecture and design principles
- `docs/ENVIRONMENT.md` - Environment variables and configuration
- `docs/CHANGE_GUIDE.md` - Step-by-step guides for common changes
- `docs/TYPESCRIPT.md` - TypeScript/ESLint conventions for examples/ and benchmarks/
- `docs/architecture/fsrs.md` - FSRS algorithm, parameters, and memory scheduling
- `docs/architecture/semantic_memory.md` - Semantic memory schema, consolidation pipeline, retrieval
- `crates/core/README.md` - Core domain logic and memory operations
- `crates/ai/README.md` - AI/LLM integration, embeddings, and structured output
- `crates/server/README.md` - HTTP API and handlers
- `crates/worker/README.md` - Background job processing
When implementing new features:
- Start with types - Define structs/enums in `plastmem_entities` or `plastmem_core`
- Add core logic - Implement business logic in `plastmem_core`
- Wire up API - Add HTTP handlers in `plastmem_server`
- Add background jobs - If needed, create job handlers in `plastmem_worker`
Incremental Development: Make small, testable changes. The codebase relies heavily on compile-time checks, so run `cargo check` frequently.
- Unit tests: Add to `crates/<name>/src/` with `#[cfg(test)]` modules
- Integration tests: Add to `crates/<name>/tests/` or workspace `tests/`
- Database tests: Use `#[tokio::test]` with test database setup
- AI mocking: Tests should mock LLM calls; use fixtures for embedding vectors
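
One way to follow the AI mocking guideline is to put the embedding call behind a trait and substitute fixed vectors in tests. The sketch below is std-only and entirely hypothetical: `Embedder`, `FixtureEmbedder`, and `most_similar` are illustrative names, not part of `plastmem_ai`, and the fixture vectors are made up. It uses a plain `#[test]` because nothing in it is async.

```rust
/// Hypothetical abstraction over an embedding backend.
pub trait Embedder {
    fn embed(&self, text: &str) -> Vec<f32>;
}

/// Test double that returns fixed vectors instead of calling a model.
pub struct FixtureEmbedder {
    fixtures: Vec<(&'static str, Vec<f32>)>,
}

impl FixtureEmbedder {
    pub fn new() -> Self {
        Self {
            fixtures: vec![
                ("cats", vec![1.0, 0.0, 0.0]),
                ("kittens", vec![0.9, 0.1, 0.0]),
                ("taxes", vec![0.0, 0.0, 1.0]),
            ],
        }
    }
}

impl Embedder for FixtureEmbedder {
    fn embed(&self, text: &str) -> Vec<f32> {
        self.fixtures
            .iter()
            .find(|(key, _)| *key == text)
            .map(|(_, v)| v.clone())
            // Unknown inputs map to a zero vector so tests stay deterministic.
            .unwrap_or_else(|| vec![0.0; 3])
    }
}

/// Cosine similarity; returns 0.0 when either vector has zero norm.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Code under test depends on the trait, not on a concrete AI client.
pub fn most_similar<'a, E: Embedder>(
    embedder: &E,
    query: &str,
    candidates: &[&'a str],
) -> Option<&'a str> {
    let q = embedder.embed(query);
    candidates.iter().copied().max_by(|a, b| {
        let sa = cosine_similarity(&q, &embedder.embed(a));
        let sb = cosine_similarity(&q, &embedder.embed(b));
        sa.total_cmp(&sb)
    })
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn picks_nearest_fixture() {
        let embedder = FixtureEmbedder::new();
        let best = most_similar(&embedder, "cats", &["taxes", "kittens"]);
        assert_eq!(best, Some("kittens"));
    }
}
```

The same pattern extends to text generation: a trait with a canned-response implementation keeps tests fast and avoids paid LLM calls.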
- Two memory layers: Episodic (events, FSRS-decayed) and Semantic (facts, no decay). Most features touch both.
- FSRS applies to episodic only: Semantic facts use temporal validity (`valid_at`/`invalid_at`) instead of decay.
- Dual-channel detection: Event segmentation uses a single batch LLM call with dual-channel criteria (topic shift + surprise)
- Queue-based architecture: Messages flow through queues; operations are often async
- LLM costs matter: AI calls are expensive; the system uses embeddings for first-stage retrieval
- Consolidation is offline: Semantic facts are extracted in background jobs, not during the hot add_message path
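
The difference between decay and temporal validity can be shown with a small sketch. The `valid_at` and `invalid_at` field names come from the list above; everything else (the `Fact` struct, Unix-second timestamps, the half-open validity interval, and the helper names) is an assumption for illustration and may not match the actual schema described in docs/architecture/semantic_memory.md.

```rust
/// Hypothetical semantic fact; timestamps are Unix seconds for simplicity.
pub struct Fact {
    pub text: String,
    pub valid_at: i64,
    /// `None` means the fact has not been invalidated.
    pub invalid_at: Option<i64>,
}

impl Fact {
    /// A fact holds at `now` if it has become valid and has not yet been
    /// invalidated. The interval is treated as [valid_at, invalid_at),
    /// which is an assumption of this sketch.
    pub fn is_valid_at(&self, now: i64) -> bool {
        self.valid_at <= now && self.invalid_at.map_or(true, |end| now < end)
    }
}

/// Keep only facts that hold at `now`. There is no decay weighting:
/// a fact is either in force or it is not.
pub fn currently_valid(facts: &[Fact], now: i64) -> Vec<&Fact> {
    facts.iter().filter(|f| f.is_valid_at(now)).collect()
}
```

In this model, superseding a fact means setting `invalid_at` on the old row and inserting a new one, rather than lowering a score over time as FSRS does for episodes.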
| File | Purpose |
|---|---|
| `docs/ARCHITECTURE.md` | System-wide architecture and design principles |
| `docs/architecture/fsrs.md` | FSRS algorithm and memory scheduling |
| `docs/architecture/semantic_memory.md` | Semantic memory schema, consolidation pipeline, retrieval |
| `crates/core/src/memory/episodic.rs` | Episodic memory struct and hybrid retrieval |
| `crates/core/src/memory/semantic.rs` | Semantic memory struct and retrieval |
| `crates/core/src/memory/retrieval.rs` | Shared markdown formatting |
| `crates/core/src/message_queue.rs` | Queue push/drain/get, PendingReview, SegmentationCheck |
| `crates/worker/src/jobs/memory_review.rs` | LLM review and FSRS updates |
| `crates/worker/src/jobs/event_segmentation.rs` | Event segmentation, episode creation, consolidation trigger |
| `crates/worker/src/jobs/predict_calibrate.rs` | Predict-Calibrate Learning pipeline |
| `crates/server/src/api/add_message.rs` | Message ingestion API |
| `crates/server/src/api/retrieve_memory.rs` | Memory retrieval API (semantic + episodic); `context_pre_retrieve` for semantic-only pre-LLM injection |
| `crates/server/src/api/recent_memory.rs` | Recent episodic memories API |

    # Basic commands
    cargo build
    cargo test
    cargo check

    # Check specific crate
    cargo check -p plastmem_core
    cargo test -p plastmem_core

    # Run with logging
    RUST_LOG=debug cargo run

See docs/TYPESCRIPT.md for ESLint rules, tsconfig setup, AI/LLM patterns, and common code patterns.
- The codebase follows predictable patterns. Most changes follow the same flow: API → Handler → Core → DB
- When in doubt about FSRS, check `docs/architecture/fsrs.md` and `crates/core/src/memory/episodic.rs`
- When in doubt about semantic memory, check `docs/architecture/semantic_memory.md` and `crates/core/src/memory/semantic.rs`
- Memory operations are: creation (segmentation → episode), consolidation (episodes → semantic facts), retrieval (semantic + episodic), or review (FSRS update)
- Prefer reading existing implementations over guessing patterns