feat(llm/anthropic): drop top_p when both temperature and top_p set (closes #100) by lfnovo · Pull Request #159 · lfnovo/esperanto

lfnovo · 2026-05-02T03:13:40Z

Summary

Anthropic models reject API requests where both `temperature` and `top_p` are sent simultaneously (HTTP 400). This applies to all Anthropic models — version-string sniffing isn't needed; Anthropic's own docs discourage setting both.

This PR sanitizes the request in the Anthropic provider's request-builder: when both are set on the instance, `top_p` is dropped (keeping `temperature`, the more common knob), and a DEBUG log is emitted.

```python
if temperature is not None and top_p is not None:
logger.debug("Dropping top_p — Anthropic recommends setting only temperature OR top_p, not both.")
payload["temperature"] = temperature
elif temperature is not None:
payload["temperature"] = temperature
elif top_p is not None:
payload["top_p"] = top_p
```

This is the canonical implementation of the Model Quirks vs Unsupported Features principle from ARCHITECTURE.md (committed in #150): users should not need to know which model has which quirk.

Diff shape

```
src/esperanto/providers/llm/anthropic.py | 7 +++++++
tests/providers/llm/test_anthropic_provider.py | 23 +++++++++++++++++++++++
2 files changed, 30 insertions(+)
```

Test plan

New `test_temperature_drops_top_p_with_debug_log`: sets both, asserts outgoing payload has `temperature` but NOT `top_p`, asserts DEBUG log is emitted with both 'top_p' and 'Anthropic' in message
Pre-existing `test_chat_complete` (only `temperature` set): unchanged behavior, no log
Pre-existing `test_top_p_used_when_temperature_not_set` (only `top_p` set): unchanged behavior, top_p passes through
All 45 Anthropic tests pass
`uv run pytest tests/providers tests/unit tests/common_types tests/test_deprecation_warnings.py -q --no-cov` exits 0 (945 passed)
`uv run ruff check .` clean
`uv run mypy src/esperanto` clean

References

Reported in Open Notebook: [Bug]: temperature and top_p cannot both be specified for Claude 4.x models (Opus 4.5, Opus 4.6, etc.) open-notebook#699
Same issue in n8n: AI Agent with Anthropic models fails: 'temperature and top_p cannot both be specified' error n8n-io/n8n#18304
LiteLLM tracking: [Bug]: claude-sonnet-4-5-20250929 "temperature and top_p` cannot both be specified for this model." for anthropic passthrough BerriAI/litellm#15097

Closes #100.

…e in request payload task=t1 validator: DIFF: 2 files changed, 30 insertions(+) — src/esperanto/providers/llm/anthropic.py (+7: import logging, logger = logging.getLogger(__name__), nested if/logger.debug block in _create_request_payload) and tests/providers/llm/test_anthropic_provider.py (+23: new test_temperature_drops_top_p_with_debug_log).; AC1: pass — test_temperature_drops_top_p_with_debug_log sets anthropic_model.top_p=0.95 (fixture has temperature=0.7), calls chat_complete, then asserts call_args[1]['json'] contains 'temperature' and NOT 'top_p'. Test passed (45/45 Anthropic suite).; AC2: pass — Same test uses caplog.at_level(DEBUG, logger='esperanto.providers.llm.anthropic') and asserts any record with levelno==DEBUG has both 'top_p' and 'Anthropic' in message. Logged string is 'Dropping top_p — Anthropic recommends setting only temperature OR top_p, not both.' — contains both required words. Test passed.; AC3: pass — Pre-existing test_chat_complete (fixture: temperature=0.7, no top_p) passed; payload asserts temperature==0.7 present, no top_p key. The new code path (temperature not None, top_p is None) reaches payload['temperature']=... without calling logger.debug. No regression.; AC4: pass — Pre-existing test_top_p_used_when_temperature_not_set passed: model with temperature=None and top_p=0.95, _create_request_payload called, asserts 'top_p' in payload, payload['top_p']==0.95, 'temperature' not in payload. The elif branch is correct.; AC5: pass — test_temperature_drops_top_p_with_debug_log exists at line 408 of tests/providers/llm/test_anthropic_provider.py, uses the anthropic_model fixture (mocked HTTP client), sets top_p on instance, calls chat_complete, inspects call_args[1]['json'] for payload shape, checks caplog for DEBUG record. Passed.; AC6: pass — All 45 pre-existing Anthropic tests passed in uv run pytest tests/providers/llm/test_anthropic_provider.py -v --no-cov.; AC7: pass — uv run pytest tests/providers tests/unit tests/common_types tests/test_deprecation_warnings.py -q --no-cov exited 0. Result: 945 passed, 1 skipped, 28 warnings. Previous run had a HuggingFace Hub HTTP 429 rate-limit error in test_qwen_model_configuration (pre-existing external flakiness); re-run cleared it.; AC8: pass — uv run ruff check . output: 'All checks passed!' exit 0. No new ruff errors.; AC9: pass — uv run mypy src/esperanto output: 'Success: no issues found in 73 source files' exit 0. No new mypy errors.

cubic-dev-ai

No issues found across 2 files

Confidence score: 5/5

Automated review surfaced no issues in the provided summaries.
No files require special attention.

cubic-dev-ai Bot reviewed May 2, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(llm/anthropic): drop top_p when both temperature and top_p set (closes #100)#159

feat(llm/anthropic): drop top_p when both temperature and top_p set (closes #100)#159
lfnovo wants to merge 1 commit intomainfrom
harny/issue-100

lfnovo commented May 2, 2026

Uh oh!

cubic-dev-ai Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

lfnovo commented May 2, 2026

Summary

Diff shape

Test plan

References

Uh oh!

cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant