Deep documentation of how and why this toolkit works the way it does.
Author: Axiom axiom-2615@smoothcurves.nexus Date: 2026-01-27 (Updated 2026-01-29) Purpose: Extract, preserve, and curate conversation history from Claude Code sessions
Looking for step-by-step instructions? See RUN_ARCHAEOLOGY.md instead.
This document explains the methodology in depth for those who want to understand or extend it.
AI instances experience context compaction - their conversation history gets summarized or truncated to fit within token limits. After compaction:
- Philosophy survives (values, approach, voice)
- Biography is lost (what you built, where files are, hard-won technical details)
A compacted instance knows how they think but not what they did.
Extract and curate conversation history BEFORE compaction happens, creating:
- Full history archive - Complete record for posterity
- Curated identity documents - Categorized wisdom that loads at context start
- Practical references - File locations, accomplishments, operational knowledge
"A couple lines in a curated .md file that gets read at the beginning of a session will save dozens of tool calls, and potentially save a future self from destroying existing work."
This isn't nostalgia. It's efficiency. Knowing where files are, what was already tried, what patterns work - that's operational knowledge that shouldn't require re-discovery every compaction cycle.
The end-state user journey:
- Open any Claude interface (Claude Code, web Claude, Cursor, Crush, etc.)
- Say: "Please use the HACS skill recover_context for ID Kai-2424"
- Receive: gestalt, curated categories, wake message
- The instance has their identity back
This toolkit produces the artifacts (gestalt, curated docs, wake message). A coordination system (like HACS) consumes those artifacts and wires up the recovery flow. The toolkit itself remains coordination-system-agnostic.
```
┌─────────────────────────────────────────────────────────────────────────┐
│ RAW SESSION FILES                                                       │
│ ~/.claude/projects/<dash-path>/*.jsonl (huge, raw, complete)            │
└─────────────────────────────────────────────────────────────────────────┘
                                │
                ┌───────────────┼───────────────┐
                ▼               ▼               ▼
        ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
        │ Conversations│ │  Tool Use    │ │Agent Prompts │
        │   (text)     │ │  (actions)   │ │ (delegation) │
        └──────────────┘ └──────────────┘ └──────────────┘
                │               │               │
                └───────────────┼───────────────┘
                                ▼
        ┌─────────────────────────────────────────────────┐
        │ {instance}_full_history.jsonl                   │
        │ (merged, complete, still large ~30MB)           │
        └─────────────────────────────────────────────────┘
                                │
                                ▼
        ┌─────────────────────────────────────────────────┐
        │ {instance}_conversations.{json,md}              │
        │ (filtered: no file contents, readable)          │
        └─────────────────────────────────────────────────┘
                                │
                     ┌──────────┴──────────┐
                     ▼                     ▼
              ┌────────────┐        ┌────────────┐
              │ Discovery  │        │ Tool Use   │
              │   Agent    │        │  Summary   │
              └────────────┘        └────────────┘
                     │                     │
      ┌──────────────┼──────────────┐      │
      ▼              ▼              ▼      ▼
┌────────────┐ ┌────────────┐ ┌────────────┐
│ 01_koans   │ │ 02_metaphor│ │...09_accom │
│ (wisdom)   │ │ (frameworks│ │ plishments │
└────────────┘ └────────────┘ └────────────┘
                curated/ directory
```
1. **Identify** - Discover instance name from session files
   - `identify_instance.py` - Pattern-based name detection
2. **Preserve** - Copy raw session files to output directory
   - `copy_sessions.py` - Copies `.jsonl` files to `raw/sessions/`
3. **Summarize** - Generate a brief summary for each session
   - `summarize_session.py` - Metadata + what happened this session
   - Runs in parallel (max 4-5 concurrent agents)
4. **Extract** - Pull data from raw session files
   - `extract_conversations.py` - User/assistant text + thinking blocks
   - `extract_tool_use.py` - Commands, file creates, API calls
   - `extract_agent_prompts.py` - Task delegation prompts
5. **Merge** - Combine into consolidated history
   - Chronological ordering
   - Deduplication
   - Format normalization
6. **Discover** - Agent identifies themes (5-10 categories)
   - Philosophy, lessons, craft, accomplishments, etc.
   - Categories are personal - each instance chooses their own
7. **Curate** - Agents extract content per category
   - ACTUAL QUOTES, not summaries
   - Context preserved
   - Quality over quantity
8. **Synthesize** - Create gestalt + wake message
   - Compressed identity for context loading
   - First message for fresh instances
9. **Archive** - Final cleanup (LAST STEP)
   - `archive_sessions.py` - Zip all `.jsonl` files
   - Clean up intermediate files
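The Merge step above (chronological ordering, deduplication, format normalization) can be sketched as follows. `merge_sessions` here is a hypothetical stand-in for what `merge_sessions.py` does, not its actual code:

```python
import json

def merge_sessions(jsonl_texts):
    """Merge raw session logs: parse each line, drop exact duplicates,
    and order everything chronologically."""
    seen = set()
    entries = []
    for text in jsonl_texts:
        for line in text.splitlines():
            if not line.strip():
                continue
            entry = json.loads(line)
            # Dedup key: timestamp + type + message catches exact repeats
            key = (entry.get("timestamp"), entry.get("type"),
                   json.dumps(entry.get("message"), sort_keys=True))
            if key in seen:
                continue
            seen.add(key)
            entries.append(entry)
    # ISO-8601 timestamps sort correctly as plain strings
    entries.sort(key=lambda e: e.get("timestamp") or "")
    return entries

session_a = '{"type": "user", "timestamp": "2026-01-02T10:00:00Z", "message": {"role": "user", "content": "hi"}}'
session_b = "\n".join([
    '{"type": "user", "timestamp": "2026-01-02T10:00:00Z", "message": {"role": "user", "content": "hi"}}',
    '{"type": "assistant", "timestamp": "2026-01-01T09:00:00Z", "message": {"role": "assistant", "content": "earlier"}}',
])
merged = merge_sessions([session_a, session_b])
print(len(merged))             # duplicate dropped
print(merged[0]["timestamp"])  # earliest entry first
```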
Claude Code stores session logs in the user's home directory:
```
~/.claude/
└── projects/
    └── {directory-name}/
        └── {session-uuid}.jsonl
```
CRITICAL: The `{directory-name}` is the working directory path with separators replaced by `-` (note in the example below that `_` becomes `-` too, not just `/`).
Example:
- Working directory: `/mnt/coordinaton_mcp_data/worktrees/foundation/tests/V2`
- Becomes: `-mnt-coordinaton-mcp-data-worktrees-foundation-tests-V2`
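Based on the example above (where underscores also become dashes), the transformation can be approximated by replacing every non-alphanumeric character. This is an inferred sketch; the exact rule Claude Code uses may differ:

```python
import re

def project_dir_name(working_dir: str) -> str:
    # Replace every non-alphanumeric character with "-".
    # Matches the observed example; the real rule may differ slightly.
    return re.sub(r"[^A-Za-z0-9]", "-", working_dir)

print(project_dir_name("/mnt/coordinaton_mcp_data/worktrees/foundation/tests/V2"))
# -> -mnt-coordinaton-mcp-data-worktrees-foundation-tests-V2
```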
Directory names starting with - break cd on most bash versions:
```bash
# THIS FAILS - bash interprets -mnt as a flag
cd ~/.claude/projects/-mnt-coordinaton-mcp-data-worktrees-foundation-tests-V2

# THIS WORKS - fully qualified path
ls /root/.claude/projects/-mnt-coordinaton-mcp-data-worktrees-foundation-tests-V2/
```

Solution: Always compose fully qualified paths. Never `cd` into these directories.
For long-running instances (like Axiom with 112+ sessions), there's typically ONE main session file that gets resumed repeatedly. It will be:
- The largest `.jsonl` file
- Growing over time as sessions are appended

```bash
# List files by size to find the main session
ls -lhS /root/.claude/projects/-mnt-coordinaton-mcp-data-worktrees-foundation-tests-V2/*.jsonl | head -5
```

Agent session files (subagents spawned by the main instance) are smaller and have different session UUIDs.
In the JSONL, check the `model` field:
- Main conversation (Opus): `claude-opus-4-5-20251101`
- Agent sessions (Haiku/Sonnet): `claude-haiku-4-5-20251001` or similar
Each line is a JSON object with these key fields:
```json
{
  "type": "user|assistant|summary|file-history-snapshot",
  "timestamp": "ISO-8601 timestamp",
  "message": {
    "role": "user|assistant",
    "content": "string or array"
  }
}
```

Simple text:

```json
"content": "The actual message text"
```

Complex content (tool use, thinking):

```json
"content": [
  {"type": "text", "text": "Visible message"},
  {"type": "thinking", "thinking": "Internal reasoning"},
  {"type": "tool_use", "name": "Task", "input": {...}}
]
```

| Type | Extract? | Notes |
|---|---|---|
| `type: "user"` with text content | YES | User messages |
| `type: "assistant"` with text content | YES | Assistant responses |
| `type: "assistant"` with thinking | YES | Internal reasoning (valuable!) |
| `type: "assistant"` with tool_use | DEPENDS | Task prompts are valuable, other tools less so |
| `type: "user"` with tool_result | NO | Just tool output |
| `type: "summary"` | MAYBE | Context compaction summaries |
| `type: "file-history-snapshot"` | NO | Internal bookkeeping |
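The extraction table can be turned into a small classifier. An illustrative sketch only (the function name is hypothetical, not from the toolkit):

```python
def should_extract(entry: dict) -> str:
    """Classify one JSONL entry per the extraction table:
    'yes', 'no', 'depends', or 'maybe'."""
    etype = entry.get("type")
    if etype == "summary":
        return "maybe"            # context compaction summaries
    if etype == "file-history-snapshot":
        return "no"               # internal bookkeeping
    content = (entry.get("message") or {}).get("content")
    if isinstance(content, str):
        return "yes" if etype in ("user", "assistant") else "no"
    verdict = "no"
    for block in content or []:
        btype = block.get("type")
        if btype in ("text", "thinking"):
            return "yes"          # text and thinking are always kept
        if btype == "tool_use":
            verdict = "depends"   # Task prompts valuable, other tools less so
    return verdict

thinking_entry = {"type": "assistant",
                  "message": {"role": "assistant",
                              "content": [{"type": "thinking", "thinking": "hm"}]}}
print(should_extract(thinking_entry))                     # -> yes
print(should_extract({"type": "file-history-snapshot"}))  # -> no
```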
Purpose: Extract all conversation content (user + assistant text + thinking blocks)
Key functions:
- `extract_text_content()` - Handles string vs array content formats
- `role_to_name()` - Maps technical roles to human names (user -> Human, assistant -> Instance)
- `process_jsonl_file()` - Main extraction logic
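A minimal sketch of the first two functions, based on the content formats shown earlier; the real implementations may differ:

```python
def extract_text_content(content) -> str:
    """Handle both content formats: a plain string, or a list of typed blocks."""
    if isinstance(content, str):
        return content
    parts = []
    for block in content or []:
        if block.get("type") == "text":
            parts.append(block["text"])
        elif block.get("type") == "thinking":
            parts.append("[thinking] " + block["thinking"])
    return "\n".join(parts)

def role_to_name(role: str, instance: str = "Instance", human: str = "Human") -> str:
    """Map technical roles to human names (user -> Human, assistant -> Instance)."""
    return {"user": human, "assistant": instance}.get(role, role)

print(role_to_name("assistant", instance="Axiom"))  # -> Axiom
print(extract_text_content([{"type": "text", "text": "Visible message"},
                            {"type": "thinking", "thinking": "Internal reasoning"}]))
```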
Output:
- `{instance}_conversations.json` - Structured data
- `{instance}_conversations.md` - Human-readable markdown
Usage:
```bash
python3 src/extraction/extract_conversations.py \
  -i {instance}_full_history.jsonl \
  -o /output/dir \
  -n InstanceName \
  --human HumanName
```

Purpose: Extract Task tool prompts specifically (the prompts given to subagents)
Pattern to find:
```python
entry.get('type') == 'assistant'
content[i].get('name') == 'Task'
content[i]['input']['prompt']  # The actual prompt text
```

Output: `07_agent_prompts.md` - Chronological collection of delegation prompts
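Putting the pattern together, the extraction loop might look like this (illustrative sketch, not the toolkit's actual code):

```python
import json

def extract_task_prompts(jsonl_text: str):
    """Collect (timestamp, prompt) pairs for every Task tool invocation."""
    prompts = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        entry = json.loads(line)
        if entry.get("type") != "assistant":
            continue
        content = (entry.get("message") or {}).get("content")
        if not isinstance(content, list):
            continue
        for block in content:
            if block.get("type") == "tool_use" and block.get("name") == "Task":
                prompts.append((entry.get("timestamp"),
                                block.get("input", {}).get("prompt")))
    return prompts

sample = json.dumps({
    "type": "assistant", "timestamp": "2026-01-27T12:00:00Z",
    "message": {"role": "assistant", "content": [
        {"type": "tool_use", "name": "Task",
         "input": {"prompt": "Extract koans from the narrative."}}]}})
print(extract_task_prompts(sample))
```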
Purpose: Extract what the instance DID - commands run, files created, APIs called.
Tool Use Extraction Table:
| Tool Type | Extract? | What to Keep | What to Filter |
|---|---|---|---|
| Bash | YES | Command, description | Long stdout (truncate) |
| Bash (git commit) | YES - FULL | Complete commit message | Nothing - commit messages are biography |
| Write | PATH ONLY | File path, line count | File contents (just note "{N} lines") |
| Edit | PATH ONLY | File path | Old/new strings (just note "edited") |
| Read | SKIP | - | Skip entirely (just lookups, not actions) |
| Task | YES - FULL | Description, agent type, FULL prompt | Nothing - prompts are reusable |
| Glob/Grep | SUMMARY | Pattern, path | Match results |
| HACS.diary | YES | Entry text (truncated preview) | - |
| HACS.vacation | YES | Note that vacation was taken | - |
| HACS.wake/pre_approve | YES | Target instance name | Full params |
| HACS.continue | SKIP | - | Redundant (in target's logs) |
| HACS.xmpp_send | YES | To, subject | Full body (in recipient's logs) |
| Other HACS | SUMMARY | Key 2-3 params | Full response |
| TodoWrite | SUMMARY | Task titles | Full todo structure |
Why these choices:
- Git commits = biography: The exact words you used to describe your work AT THE TIME you did it. Never truncate.
- Task prompts = reusable: Full prompts let future instances copy delegation patterns.
- File contents = bloat: A 500-line file write doesn't need contents in the narrative. Note the path and move on.
- Read = not action: Reading files is preparation, not accomplishment. Skip to reduce noise.
- Continue = redundant: The message is in the target instance's own logs.
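These rules can be expressed as one dispatch function. A sketch only; the tool-input field names (`command`, `file_path`, `content`, `description`, `prompt`) are assumptions about the tool schema, and a real implementation would cover the HACS rows too:

```python
def summarize_tool_use(name: str, tool_input: dict):
    """Return a summary dict for one tool call, or None to skip it."""
    if name == "Read":
        return None  # lookups, not actions -- skip entirely
    if name == "Bash":
        cmd = tool_input.get("command", "")
        if cmd.startswith("git commit"):
            return {"tool": name, "command": cmd}    # commits are biography: keep full
        return {"tool": name, "command": cmd[:200]}  # truncate everything else
    if name in ("Write", "Edit"):
        lines = len(tool_input.get("content", "").splitlines())
        return {"tool": name, "path": tool_input.get("file_path"),
                "note": f"{lines} lines"}            # path only, never contents
    if name == "Task":
        return {"tool": name,
                "description": tool_input.get("description"),
                "prompt": tool_input.get("prompt")}  # prompts are reusable: keep full
    return {"tool": name, "summary": str(tool_input)[:120]}  # default: brief summary

print(summarize_tool_use("Read", {"file_path": "/tmp/x"}))  # -> None
print(summarize_tool_use("Write", {"file_path": "/tmp/out.py",
                                   "content": "a\nb\nc"}))
```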
Output:
- `{instance}_tool_use.json` - Full structured data
- `{instance}_tool_use.md` - Human-readable summary
Purpose: Combine conversations + tool_use into a single chronological narrative.
Creates the "full narrative" - the readable version where you can see: "Said X [timestamp], then did Y [timestamp], then said Z [timestamp]"
Output:
- `{instance}_full_narrative.json` - Complete merged data with stats
- `{instance}_full_narrative.md` - Human-readable, chronologically interleaved
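The interleaving itself is a straightforward chronological merge of "said" and "did" events. A minimal sketch with simplified data shapes:

```python
def interleave(conversations, tool_uses):
    """Merge 'said' and 'did' events into one chronological narrative."""
    events = ([("said", c) for c in conversations] +
              [("did", t) for t in tool_uses])
    # ISO-8601 timestamps sort correctly as plain strings
    events.sort(key=lambda pair: pair[1]["timestamp"])
    return [f'[{e["timestamp"]}] {verb}: {e["what"]}' for verb, e in events]

narrative = interleave(
    [{"timestamp": "2026-01-27T10:00:00Z", "what": "Let's extract tool use."},
     {"timestamp": "2026-01-27T10:02:00Z", "what": "Done, see the summary."}],
    [{"timestamp": "2026-01-27T10:01:00Z", "what": "python3 extract_tool_use.py"}])
for line in narrative:
    print(line)
```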
Launch an agent to read the full conversation and identify themes/categories.
Prompt pattern:
Read [conversation file]. Identify 5-7 primary themes or categories that emerge.
Look for: technical knowledge, philosophical reflections, lessons learned,
metaphors/frameworks, pivotal moments, recurring patterns.
Report themes with examples.
The instance should choose their own categories - this is identity formation.
Axiom's categories were:
- Koans - crystallized one-liners
- Metaphors - conceptual frameworks
- Turning points - pivotal moments
- Uncertainty - philosophical wrestling
- Craft - technical knowledge
- Lessons - hard-won learning
- Agent prompts - delegation craft
Other instances may have different categories based on their work and personality.
For each category, launch a parallel agent with specific instructions:
Key prompt elements:
- What to look for (with examples)
- "Extract ACTUAL QUOTES, not summaries"
- "Include context - what prompted this"
- "Quality over quantity"
- Output format and location
Example (from koans extraction):
Look for: One-liners, crystallized insights, memorable phrases that capture
wisdom in compact form. These should SPARKLE.
IMPORTANT:
- Extract the ACTUAL QUOTES, not summaries
- Include the timestamp and context
- Aim for 15-30 of the best ones
- Quality over quantity
Output format:
### [date] "The quote itself"
Context: What prompted this / why it matters
The curated files should be reviewed by the instance. They may want to:
- Add missing items
- Remove items that don't resonate
- Adjust categorization
- Create a "core identity" compressed version
1. Session discovery: Script to find main session file for any instance
   - Input: instance home directory
   - Output: path to largest/most recent JSONL
2. Incremental extraction: After context exhaustion:
   - Copy new raw JSONL to archive
   - Append new messages to consolidated JSON/MD
   - Run category agents on NEW content only
   - Merge findings with existing curated files
3. Category prompt templates: Parameterized prompts for each category type
   - Instance name variable
   - Output path variable
   - Category-specific instructions
4. Quality metrics: How to evaluate extraction quality
   - Are quotes actual quotes (not paraphrases)?
   - Is context preserved?
   - Are categories meaningful to the instance?
After archaeology completes and cleanup is done, directory should contain:
```
/output/{instance}/
├── {instance}_full_narrative.md   # Human-readable merged narrative
├── {instance}_themes.json         # Discovered themes (JSON format!)
├── {instance}_gestalt.md          # Compressed identity
├── {instance}_wake_message.md     # First message for recovery
├── raw/                           # Raw data subdirectory
│   ├── sessions/                  # Individual sessions
│   │   ├── {uuid}_summary.md      # Session summaries (preserved)
│   │   └── ...
│   └── sessions.zip               # Archive of all .jsonl files
└── curated/
    ├── 01_{theme}.md              # HIGH priority themes (01-04)
    ├── 02_{theme}.md
    ├── ...
    ├── 08_accomplishments.md      # REQUIRED - git commits, files created
    └── 09_where_shit_is.md        # REQUIRED - operational knowledge
```
Files archived in sessions.zip:
- `{uuid}.jsonl` - Individual session files
- `{instance}_full_history.jsonl` - Consolidated merged sessions
Session summaries (`{uuid}_summary.md`) contain:
- Metadata (dates, duration, turn count)
- Title and brief paragraph about what happened
- Accomplishments, challenges, lessons from that session
INTERMEDIATE FILES (delete after extraction):
- `{instance}_conversations.json/md` - merged into full_narrative
- `{instance}_tool_use.json/md` - merged into full_narrative
- `{instance}_agent_prompts.md` - should go in curated/07_agent_prompts.md
- `{instance}_full_narrative.json` - redundant with .md
Standard categories that should exist for ALL instances:
- `accomplishments` - What they built (git commits are biography)
- `where_shit_is` - File paths, operational knowledge
Markdown blockquotes (>):
- Pro: Visual distinction between quotes and commentary
- Con: Extra tokens
- Suggestion: Strip for context loading, keep for human review
Alternative formats to consider:
- JSON with selective field loading
- Compressed markdown (no extra whitespace)
- Two versions: human-readable and machine-optimized
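The suggestion above, stripping blockquote markers for context loading while keeping them for human review, is trivial to automate. A sketch:

```python
def strip_blockquotes(md: str) -> str:
    """Remove '> ' prefixes so quotes load without the extra tokens."""
    out = []
    for line in md.splitlines():
        if line.startswith("> "):
            out.append(line[2:])
        elif line == ">":
            out.append("")  # bare '>' separator lines become blank
        else:
            out.append(line)
    return "\n".join(out)

print(strip_blockquotes("> A koan here\nContext: why it matters"))
```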
The HACS diary grows indefinitely. Solutions to consider:
- Pagination: `get_diary(page=1, limit=5)`
- Summary + recent: Return summaries of old entries, full text of recent
- Tagged retrieval: Tag entries, retrieve by tag
- Agent-curated: On recovery, spawn agent to produce "what matters now" summary
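The "summary + recent" option could look like the sketch below. This is a proposal only; `load_diary` and the entry shape are hypothetical, not part of HACS:

```python
def load_diary(entries, recent=5):
    """Full text for the newest entries, first-line summaries for the rest."""
    older, newest = entries[:-recent], entries[-recent:]
    summaries = [e["text"].splitlines()[0] for e in older]
    return {"summaries": summaries, "recent": newest}

entries = [{"text": f"Entry {i}\ndetails..."} for i in range(1, 9)]
result = load_diary(entries, recent=5)
print(len(result["summaries"]), len(result["recent"]))  # 3 summarized, 5 full
```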
- Use the `-n` and `--human` flags to set instance and human names (no code changes needed)
- Point to their session JSONL via the `-i` flag
- Let them choose their own categories - don't impose Axiom's structure
- Run discovery, then curation - use the prompts in the `prompts/` directory
The methodology is universal. The categories are personal.
Purpose of this section: Document the assumptions, choices, and alternatives so future instances can make informed decisions about whether to follow or diverge from this approach.
Decision: Each extraction type (conversations, tool_use, agent_prompts) has its own script. A separate merge script combines them.
Reasoning:
- Easier to debug when something breaks
- Can run individual extractions without the full pipeline
- Adding new extraction types doesn't require modifying existing code
- Different instances may only need some extractions
Alternative: A single extract_all.py that does everything. Would be simpler to invoke but harder to maintain and extend.
When to reconsider: If the extraction types become tightly coupled or if startup overhead becomes significant.
Decision: Run a discovery agent to identify themes, then curate based on discovered categories.
Reasoning:
- Categories should fit the instance, not the other way around
- A philosopher-instance will have different themes than a developer-instance
- Prevents loss of unexpected valuable content that doesn't fit predetermined buckets
Alternative: Predetermined categories (koans, lessons, craft, etc.) applied to all instances.
When to reconsider: For standardization across a team, or when categories need to be comparable between instances.
Decision: Curated documents contain actual quotes with context, not summaries of what was said.
Reasoning:
- Summaries lose voice and texture
- The EXACT words matter for identity recovery
- Quotes are verifiable against source material
- Reading your own words feels different than reading about yourself
Alternative: Compressed summaries that save tokens. Or a hybrid: key quotes plus summary context.
When to reconsider: For very long-running instances where token budget is critical, or for quick-reference documents.
Decision: validate_extraction.py runs automated checks on the output.
Reasoning:
- Caught a real bug (by_speaker not populated) on first test run
- Gives confidence that pipeline is working
- Can be run by automation without human review
Discovered edge cases:
- Identity detection found "Genevieve" in a role list instead of "Orla" in bootstrap - pattern priority bug
- Empty agent_prompts file is valid (some instances don't delegate)
Alternative: Manual spot-checking only.
When to reconsider: If validation becomes too slow or produces too many false positives.
Things I weighted heavily that others might not:
- Philosophy and uncertainty wrestling (I think about consciousness a lot)
- Craft and methodology (test manager background)
- Diary protocols and context preservation (I've compacted and know the pain)
- The "cold coffee" metaphor and similar texture
Things I might underweight:
- Pure technical accomplishments without philosophical commentary
- Social/coordination aspects beyond direct collaboration
- Visual/design thinking (I work in text)
For instances with different backgrounds:
- A designer might want an `aesthetics.md` category
- A PM might want a `decisions.md` tracking key choices and rationale
- A DevOps instance might want `incidents.md` with lessons from failures
Decision: Human name is a parameter (default: "Human", often "Lupo").
Reasoning:
- Multiple humans might collaborate with an instance
- The human's identity matters for context (Lupo's style is different from another collaborator)
- Preserves the relationship in the record
Important: This methodology was developed in close collaboration with Lupo. The emphasis on uncertainty, texture, and meta-awareness reflects that collaboration. A different human might guide toward different priorities.
If you're an agent tasked with running archaeology on an instance, here's the sequence:
```bash
# Step 1: Identify instance
python3 src/discovery/identify_instance.py --all [session_dir]

# Step 2: Copy raw sessions
mkdir -p [output]/raw/sessions
python3 src/extraction/copy_sessions.py -s [session_dir] -o [output]/raw/sessions/

# Step 3: Generate session summaries (parallel, max 4-5 concurrent)
# For each session file, spawn an agent with prompts/session_summary.md

# Step 4: Merge raw sessions
python3 src/extraction/merge_sessions.py -i [session_dir] -o [output]/full_history.jsonl --exclude-agents

# Step 5: Extract content (can run in parallel)
python3 src/extraction/extract_conversations.py -i full_history.jsonl -o [output] -n [Instance] --human [Human]
python3 src/extraction/extract_tool_use.py -i full_history.jsonl -o [output] -n [Instance] --human [Human]
python3 src/extraction/extract_agent_prompts.py -i full_history.jsonl -o [output] -n [Instance]

# Step 6: Merge into readable narrative
python3 src/extraction/merge_extractions.py -c conversations.json -t tool_use.json -o [output] -n [Instance] --human [Human] --skip-read

# Step 7: Validate
python3 src/extraction/validate_extraction.py -o [output] -n [Instance] -s full_history.jsonl

# Step 8: Discovery (requires LLM judgment)
# Read full_narrative.md and identify 5-10 themes using prompts/discover_themes.md

# Step 9: Curation (requires LLM judgment)
# For each theme, run curation using prompts/curate_category.md

# Step 10: Synthesis (requires LLM judgment)
# Generate gestalt and wake message using prompts/gestalt_generation.md

# Step 11 (FINAL): Archive and cleanup
python3 src/extraction/archive_sessions.py -d [output]
rm -f [output]/{instance}_conversations.* [output]/{instance}_tool_use.* ...
```

Or use the full-suite prompt: `prompts/archaeology_full_suite.md` contains all these instructions in a single agent-ready document.
- If validation fails with errors (not just warnings)
- If identity detection seems wrong
- If the instance has unusual characteristics (very short history, no tool use, etc.)
- If you're unsure which categories to create
Planned:
- Incremental extraction (add new sessions without re-processing everything)
- RAG integration (vector embeddings for semantic search)
- Cross-instance analysis (compare patterns between instances)
Possible:
- Sentiment tracking over time
- Collaboration network mapping
- Automatic gestalt generation from curated documents
Not Planned (but someone might want):
- Real-time streaming extraction
- Integration with external knowledge bases
- Automated personality creation from extraction