Skip to content

Latest commit

 

History

History
238 lines (182 loc) · 10 KB

File metadata and controls

238 lines (182 loc) · 10 KB

Context Management

Overview

Imagine the LLM's context window as a backpack with limited capacity 🎒. Every conversation turn, every tool call result adds something to the backpack. As the conversation goes on, the backpack gets fuller and fuller...

Context management is a set of mechanisms that help you "manage your backpack", ensuring the AI can work continuously and efficiently.

graph TB
    A[Context Management] --> B[Structure Division]
    A --> C[Token Monitoring]
    A --> D[Compaction Mechanism]

    B --> B1[System Prompt]
    B --> B2[Compactable Zone]
    B --> B3[Reserved Zone]

    D --> D1[Tool Result Compaction]
    D --> D2[Conversation Compaction]
Loading

The context management mechanism is inspired by OpenClaw and implemented by ReMe.

Context Structure

CoPaw divides the context into three zones:

graph LR
    A[System Prompt] -->|Always retained| B[Compactable Zone<br>Compactable Messages]
    B -->|Compress when exceeded| C[Reserved Zone<br>Recent Messages]
Loading
Zone Description Handling
System Prompt The AI's "role definition" and base instructions Always retained, never compacted
Compactable Zone Historical conversation messages Token counted; compacted into summary when threshold exceeded
Reserved Zone Most recent N messages Kept as-is, ensuring context continuity

Structure Example

┌─────────────────────────────────────────┐
│ System Prompt (Fixed)                    │  ← Always retained
│ "You are an AI assistant..."             │
├─────────────────────────────────────────┤
│ Compacted Summary (Optional)             │  ← Generated after compaction
│ "Previously helped user complete login..."│
├─────────────────────────────────────────┤
│ Compactable Zone                         │  ← Compacted when exceeded
│ [Message 1] User: Help me build login    │
│ [Message 2] Assistant: Sure, I'll...     │
│ [Message 3] Tool call result...          │
│ ...                                      │
├─────────────────────────────────────────┤
│ Reserved Zone                            │  ← Always retained
│ [Message N-2] User: Add registration     │
│ [Message N-1] Assistant: Sure...         │
│ [Message N] User: Done!                  │
└─────────────────────────────────────────┘

Management Mechanism

Architecture Overview

graph LR
    Agent[Agent] -->|Before each inference| Hook[MemoryCompactionHook]
    Hook --> TC[compact_tool_result<br>Compress tool output]
    TC --> CC[check_context<br>Token counting]
    CC -->|Exceeds limit| CM[compact_memory<br>Generate summary]
Loading

Related Code

Execution Flow

graph LR
    M[messages] --> TC[compact_tool_result<br>Compress long tool output]
    TC --> CC[check_context<br>Calculate remaining space]
    CC --> D{messages_to_compact<br>not empty?}
    D -->|No| K[Return original messages + original summary]
    D -->|Yes| V{is_valid?}
    V -->|No| K
    V -->|Yes| CM[compact_memory<br>Generate summary]
    CM --> R[Return messages_to_keep + new summary]
Loading

Execution Order:

  1. compact_tool_result — Compress long tool outputs (if enabled)
  2. check_context — Check if context exceeds limits
  3. compact_memory — Generate compaction summary

Compaction Mechanism

When the context approaches its limit, CoPaw automatically triggers compaction, condensing old conversations into a structured summary.

1. compact_tool_result — Tool Result Compaction

When enable_tool_result_compact is enabled, long tool outputs are automatically compressed:

graph LR
    M[messages] --> L{Iterate tool_result<br>len > threshold?}
    L -->|No| K[Keep as-is]
    L -->|Yes| T[Truncate to threshold]
    T --> S[Write full content to<br>tool_result/uuid.txt]
    S --> R[Append file path reference to message]
    R --> C[Clean up expired files]
Loading
  • Full content is saved to the tool_result/ directory
  • Truncated content + file path reference is kept in the message
  • Expired files are automatically cleaned up

2. check_context — Context Check

Determines if context exceeds limits based on token counting, automatically splitting messages into "to compact" and "to keep" groups.

graph LR
    M[messages] --> H[Token counting]
    H --> C{total > threshold?}
    C -->|No| K[Return all messages]
    C -->|Yes| S[Keep from tail backwards<br>reserve tokens]
    S --> CP[messages_to_compact<br>Early messages]
    S --> KP[messages_to_keep<br>Recent messages]
    S --> V{is_valid<br>Tool call alignment?}
Loading
  • Core Logic: Reserve memory_compact_reserve tokens from the tail backwards, marking excess as to-be-compacted
  • Integrity Guarantee: Does not split user-assistant conversation pairs or tool_use/tool_result pairs

3. compact_memory — Conversation Compaction

Uses ReActAgent to compress historical conversations into a structured context summary:

graph LR
    M[messages] --> H[format_msgs_to_str]
    H --> A[ReActAgent<br>reme_compactor]
    P[previous_summary] -->|Incremental update| A
    A --> S[Structured summary]
Loading

4. Manual Compaction (/compact Command)

Proactively trigger compaction:

/compact

After execution, you'll see:

**Compact Complete!**

- Messages compacted: 12
**Compressed Summary:**
<compacted summary content>

Response breakdown:

  • 📊 Messages compacted - How many messages were compacted
  • 📝 Compressed Summary - The generated summary content

Compaction Summary Structure

The compacted summary is a structured context summary, containing all the key information needed to continue working:

graph TB
    A[Compacted Summary] --> B[Goal]
    A --> C[Constraints]
    A --> D[Progress]
    A --> E[Key Decisions]
    A --> F[Next Steps]
    A --> G[Critical Context]
Loading
Field Content Example
Goal What the user wants to accomplish "Build a user login system"
Constraints Requirements and preferences "Use TypeScript, no frameworks"
Progress Completed / in-progress / blocked tasks "Login API done, registration API in progress"
Key Decisions Decisions made and their rationale "Chose JWT over Sessions for statelessness"
Next Steps What to do next "Implement password reset feature"
Critical Context Data needed to continue work "Main file is at src/auth.ts"
  • Incremental Update: When previous_summary is provided, new conversations are automatically merged with the old summary
  • Information Preservation: Compaction preserves exact file paths, function names, and error messages, ensuring seamless context transitions

Configuration

Configuration is located in ~/.copaw/config.json under agents.running:

Context Management Parameter Default Description
max_input_length 131072 Model context window size (tokens), i.e., "backpack capacity"
memory_compact_ratio 0.75 Threshold ratio for triggering compaction, triggers when max_input_length * ratio is reached
memory_reserve_ratio 0.1 Ratio of recent messages to keep during compaction, keeps max_input_length * ratio tokens
Tool Compaction Parameter Default Description
enable_tool_result_compact false Whether to compress long tool outputs
tool_result_compact_keep_n 5 Number of recent N tool results to keep when compressing

Calculation Relationships:

  • memory_compact_threshold = max_input_length * memory_compact_ratio (threshold for triggering compaction)
  • memory_compact_reserve = max_input_length * memory_reserve_ratio (tokens of recent messages to keep)

Example Configuration:

{
  "agents": {
    "running": {
      "max_input_length": 128000,
      "memory_compact_ratio": 0.7,
      "memory_reserve_ratio": 0.1,
      "enable_tool_result_compact": true,
      "tool_result_compact_keep_n": 3
    }
  }
}