
feat(security): add max_context_messages to prevent retry amplification #2074

Open
Waqar53 wants to merge 1 commit into 567-labs:main from Waqar53:feat/retry-context-limits

Conversation

Waqar53 (Contributor) commented Feb 8, 2026


This PR addresses the retry amplification security issue described in #2056.

## Problem

During retries, each failed attempt appends 2 messages (assistant response + tool error). With 10 retries, this can cause 506x context growth, leading to:

- Token budget exhaustion
- Cost amplification
- Potential denial of service
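The amplification is worse than linear: the message count grows by only 2 per failure, but every retry re-sends the whole accumulated context, so the cumulative number of messages (and tokens) billed grows quadratically in the retry count. A rough back-of-the-envelope sketch (a hypothetical helper for illustration, not part of this PR):

```python
def total_messages_sent(initial: int, retries: int, per_failure: int = 2) -> int:
    """Cumulative messages sent across all attempts, assuming each
    failed attempt appends `per_failure` messages (assistant response
    + tool error) and the next attempt re-sends the full context."""
    total = 0
    context = initial
    for _ in range(retries + 1):  # first attempt plus `retries` retries
        total += context          # this attempt sends everything so far
        context += per_failure    # a failure appends two more messages
    return total
```

With a single starting message and 10 retries this already sends 121 messages in total, which is how per-request cost balloons even though each individual context grows only linearly.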

## Solution

Added a `max_context_messages` parameter that limits the number of messages kept in context during retries. The system message is always preserved.

### Usage

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel

class MyModel(BaseModel):  # example response model
    name: str

client = instructor.from_openai(OpenAI())

response = client.chat.completions.create(
    model='gpt-4',
    response_model=MyModel,
    max_retries=10,
    max_context_messages=20,  # Limit context to 20 messages
    messages=[...]
)
```

### Changes

- Added `truncate_context_messages()` function in `instructor/core/retry.py`
- Added `max_context_messages` parameter to `retry_sync()` and `retry_async()`
- Exposed the parameter through `patch.py` for user access
- Added comprehensive unit tests (10 test cases)

### Testing

- All tests pass
- Backwards compatible (the default is `None`, meaning no truncation)

