| name | Claude Quick Start |
|---|---|
| description | Fast reference for using the memory/caching/RAG system |
| type | reference |
TL;DR: Three systems optimize token usage and project continuity.
- Cached files load automatically (CLAUDE.md, memory system, architecture)
- Read ACTIVE_RULES.md (2 confirmed rules, apply automatically)
- Skim OBSERVATIONS.md (any new patterns being tested?)
- You're ready ✓
User: "How does the SMS OTP verification work?"
You:
- Read RAG_INDEX.md (find "OTP" → Authentication queries)
- Load mapped files only:
  - backend/lambda_/sms_otp_handler.py
  - backend/lambda_/auth_handler.py
  - frontend/src/components/AuthCallback.jsx
- Answer (focused, ~5K tokens)
Don't: Load entire codebase (40K+ tokens)
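
The lookup above can be sketched in Python. This is only an illustration under stated assumptions: the `RAG_INDEX` dictionary and the `files_for_question` helper are hypothetical names, and the real layout of RAG_INDEX.md may differ. Only the three file paths come from the example above.

```python
# Hypothetical in-memory form of RAG_INDEX.md: keyword -> mapped files.
# The real index format is an assumption; the paths come from the OTP example.
RAG_INDEX = {
    "otp": [
        "backend/lambda_/sms_otp_handler.py",
        "backend/lambda_/auth_handler.py",
        "frontend/src/components/AuthCallback.jsx",
    ],
}


def files_for_question(question: str, index: dict[str, list[str]]) -> list[str]:
    """Return the mapped files for every index keyword found in the question."""
    text = question.lower()
    matched: list[str] = []
    for keyword, paths in index.items():
        if keyword not in text:
            continue
        for path in paths:
            # Keep first-seen order and skip files already matched.
            if path not in matched:
                matched.append(path)
    return matched
```

A question with no matching keyword returns an empty list, which is a signal to extend the index rather than to load the whole codebase.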
User: "Why did we choose DynamoDB?"
You:
- Open memory/DECISIONS.md
- Find entry: "Database: DynamoDB"
- Read: reasoning, trade-offs, revisit date
- Answer with authority
Benefit: No redebates. Decision is documented with rationale.
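
One way to picture a decision entry is as a small record with the parts listed above (reasoning, trade-offs, revisit date). The sketch below is an assumption about shape only: the `Decision` class, its field names, and `is_due_for_revisit` are hypothetical and are not the actual format of memory/DECISIONS.md.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Decision:
    """One entry in memory/DECISIONS.md (field names are illustrative)."""

    title: str          # for example "Database: DynamoDB"
    reasoning: str      # why the option was chosen
    trade_offs: str     # what was given up
    revisit_date: date  # when the decision may be reopened


def is_due_for_revisit(decision: Decision, today: date) -> bool:
    """True once the revisit date has arrived; before that, cite the entry."""
    return today >= decision.revisit_date
```

Until `is_due_for_revisit` returns True, the documented rationale stands and the question does not need to be reopened.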
Before claiming session complete:
- Check memory/OBSERVATIONS.md
  - Add new patterns observed
  - Note if existing patterns were confirmed
- Check if any observations should escalate to rules
  - Observed 5+ times → escalate to ACTIVE_RULES
  - 2-3 times → keep in observations
- Update memory/PROJECT_STATUS.md if tests/metrics changed
- Commit memory changes to git
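
The escalation rule in the checklist can be written as a small function. This is a minimal sketch: `next_step` and `ESCALATION_THRESHOLD` are hypothetical names, and treating every count below 5 (including 4, which the checklist does not mention) as "keep" is an assumption.

```python
ESCALATION_THRESHOLD = 5  # "Observed 5+ times" from the checklist


def next_step(times_observed: int) -> str:
    """Decide where a pattern belongs based on how often it was observed."""
    if times_observed >= ESCALATION_THRESHOLD:
        return "escalate to ACTIVE_RULES.md"
    # Assumption: anything below the threshold stays an observation.
    return "keep in OBSERVATIONS.md"
```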
| Need | File | How |
|---|---|---|
| Your preferences | CLAUDE.md | Read at start (cached) |
| Confirmed rules | ACTIVE_RULES.md | Read at start, apply auto |
| Code location | RAG_INDEX.md | Consult per question |
| Why decision X? | memory/DECISIONS.md | Lookup when revisiting |
| Test metrics | memory/PROJECT_STATUS.md | Check quarterly |
| Architecture | docs/architecture.md | Consult when needed (cached) |
→ RAG_INDEX.md → Find domain → Load relevant files
→ memory/DECISIONS.md → Find database decision → Read reasoning
→ RAG_INDEX.md → Testing queries → Load test + component files
→ RAG_INDEX.md → Frontend/UI → Load component + guidelines
→ RAG_INDEX.md → Infrastructure queries → Load CDK + architecture
Per session: 100K tokens max (configured in settings.json)
Reality: Most sessions use 5-20K (caching + RAG saves 80-90%)
Alert: If > 80K tokens, check why:
- Are you loading full codebase instead of RAG files?
- Should more files be in RAG_INDEX.md?
- Adjust cache strategy?
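
The budget thresholds and the savings figure can be checked with simple arithmetic. The sketch below is illustrative only: `budget_status` and `savings` are hypothetical helpers, the "over limit" state is an assumption, and nothing here reads settings.json.

```python
SESSION_LIMIT = 100_000   # per-session maximum stated above
ALERT_THRESHOLD = 80_000  # above this, investigate what was loaded


def budget_status(tokens_used: int) -> str:
    """Classify a session's token usage against the limits above."""
    if tokens_used > SESSION_LIMIT:
        return "over limit"  # assumed state, not named in the text
    if tokens_used > ALERT_THRESHOLD:
        return "alert"
    return "ok"


def savings(focused_tokens: int, full_tokens: int) -> float:
    """Fraction of tokens saved by loading focused files instead of everything."""
    return 1 - focused_tokens / full_tokens
```

For the OTP example, a focused answer of about 5K tokens against a 40K full load gives `savings(5_000, 40_000)`, which is 0.875, inside the stated 80-90% range.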
- Week 1: Get familiar with RAG_INDEX.md, consult it for every code question
- Week 2-3: Rules 1-2 become automatic, observe new patterns
- Week 3-4: Document patterns, first escalations to rules expected
- Week 4+: 3-5 new rules confirmed, system becomes highly personalized
Expected result after 1 month: 20-30 confirmed rules guiding decisions automatically
When adding code:
- Add the file to RAG_INDEX.md (takes 30 seconds)
- Add tags (auth, testing, database, etc.)
When code is deleted:
- Remove from RAG_INDEX.md
When codebase exceeds 100K LOC:
- Consider semantic embeddings (future enhancement)
- For now, keyword-based RAG is sufficient
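
Index maintenance can be partly automated by checking that every indexed path still exists. This is a sketch under assumptions: `stale_entries` is a hypothetical helper, and it expects the indexed paths to have already been extracted from RAG_INDEX.md as plain strings relative to the repository root.

```python
from pathlib import Path


def stale_entries(indexed_paths: list[str], repo_root: str) -> list[str]:
    """Return indexed paths that no longer point to a file under repo_root."""
    root = Path(repo_root)
    return [p for p in indexed_paths if not (root / p).is_file()]
```

Any path this returns is a candidate for removal from RAG_INDEX.md, matching the "when code is deleted" step above.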

"Token usage too high"
- Check RAG_INDEX.md is being used
- Load only mapped files, not entire codebase
- Verify .claudeignore is excluding node_modules, venv, dist

"Decisions keep being redebated"
- Add to memory/DECISIONS.md with reasoning + revisit date
- Link to decision when it comes up again

"New patterns not becoming rules"
- Need 5+ independent confirmations
- Document in OBSERVATIONS.md with dates
- Check after 2-3 more sessions

"Memory system feels slow"
- It shouldn't; everything is cached
- If MEMORY.md load is slow, file is too large (trim observations)

| Situation | Action |
|---|---|
| Code question | RAG_INDEX.md → Load relevant files |
| Architecture question | docs/architecture.md (cached) |
| Decision question | memory/DECISIONS.md |
| New pattern observed | Add to memory/OBSERVATIONS.md |
| Session complete | Update PROJECT_STATUS.md |
| Forgot a rule | Check ACTIVE_RULES.md (cached) |
| Want full context | Read CONTEXT_MANAGEMENT.md |
- RAG_INDEX.md: Full query → files mapping
- RAG_USAGE.md: How RAG retrieval works
- CONTEXT_MANAGEMENT.md: System architecture
- memory/HOW_TO_USE.md: Memory system workflow
- memory/SYSTEM_OVERVIEW.md: Learning + caching diagram
- memory/CACHING_STRATEGY.md: Token optimization details

Remember: First session reads this file. Subsequent sessions use the system automatically.