Skip to content

feat: configure default max_tokens for anthropic translator#1933

Open
herewasmike wants to merge 10 commits intoenvoyproxy:mainfrom
herewasmike:openai_awsanthropic_translation
Open

feat: configure default max_tokens for anthropic translator#1933
herewasmike wants to merge 10 commits intoenvoyproxy:mainfrom
herewasmike:openai_awsanthropic_translation

Conversation

@herewasmike
Copy link
Copy Markdown
Contributor

Description

max_tokens and max_completion_tokens are optional in openAI spec
https://developers.openai.com/api/reference/resources/chat/subresources/completions/methods/create

However when I try to send the request from the source I get 422 if my AIServiceBackend has AWSAnthropic schema.
It's impossible to add this field with bodyMutation due to ordering (I've considered mutating body before the translation but it seems to be a whole redesign rather than small fix with providing default)

Special notes for reviewers (if applicable)

Per gen AI policy I disclose that claude did help me with setting up this PR

@herewasmike herewasmike requested a review from a team as a code owner March 9, 2026 22:56
@dosubot dosubot bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Mar 9, 2026
Signed-off-by: Mikhail Toldov <matoldov@gmail.com>
@herewasmike herewasmike force-pushed the openai_awsanthropic_translation branch from 0247bb7 to dbd525e Compare March 9, 2026 22:56
@dosubot
Copy link
Copy Markdown

dosubot bot commented Mar 9, 2026

Related Documentation

3 document(s) may need updating based on files changed in this PR:

Envoy's Space

gcp-vertexai /ai-gateway/blob/main/site/docs/getting-started/connect-providers/gcp-vertexai.md
View Suggested Changes
@@ -104,6 +104,10 @@
   $GATEWAY_URL/v1/chat/completions
 ```
 
+:::note
+The `max_completion_tokens` parameter (or `max_tokens`) is optional and defaults to 4096 if not specified. The example above includes it to demonstrate setting an explicit limit.
+:::
+
 Expected output:
 
 ```json

[Accept] [Decline]

gcp-vertexai /ai-gateway/blob/main/site/versioned_docs/version-0.4/getting-started/connect-providers/gcp-vertexai.md
View Suggested Changes
@@ -101,6 +101,10 @@
   $GATEWAY_URL/v1/chat/completions
 ```
 
+:::note
+The `max_completion_tokens` parameter is optional and defaults to 4096 if not specified. It's recommended to set it explicitly to control response length and costs.
+:::
+
 Expected output:
 
 ```json
@@ -136,6 +140,10 @@
   }' \
   $GATEWAY_URL/anthropic/v1/messages
 ```
+
+:::note
+The `max_tokens` parameter is optional and defaults to 4096 if not specified. It's recommended to set it explicitly to control response length and costs.
+:::
 
 ## Troubleshooting
 

[Accept] [Decline]

gcp-vertexai /ai-gateway/blob/main/site/versioned_docs/version-0.5/getting-started/connect-providers/gcp-vertexai.md
View Suggested Changes
@@ -104,6 +104,10 @@
   $GATEWAY_URL/v1/chat/completions
 ```
 
+:::note
+The `max_completion_tokens` parameter is optional. If not specified, it defaults to 4096 tokens.
+:::
+
 Expected output:
 
 ```json
@@ -139,6 +143,10 @@
   }' \
   $GATEWAY_URL/anthropic/v1/messages
 ```
+
+:::note
+The `max_tokens` parameter is optional. If not specified, it defaults to 4096 tokens.
+:::
 
 ## Troubleshooting
 

[Accept] [Decline]

Note: You must be authenticated to accept/decline updates.

How did I do? Any feedback?  Join Discord

Signed-off-by: Mikhail Toldov <matoldov@gmail.com>
@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:S This PR changes 10-29 lines, ignoring generated files. labels Mar 9, 2026
@herewasmike
Copy link
Copy Markdown
Contributor Author

I'm not sure if arbitrary default value + bodyMutator is the best approach.
Though, no idea how anthropic api reacts if supplied with max_tokens value greater than the model supports.
Happy to take suggestions here

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Mar 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.33%. Comparing base (2d35d43) to head (ef7a290).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1933      +/-   ##
==========================================
- Coverage   84.33%   84.33%   -0.01%     
==========================================
  Files         130      130              
  Lines       17987    17986       -1     
==========================================
- Hits        15170    15169       -1     
  Misses       1873     1873              
  Partials      944      944              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@nutanix-Hrushikesh
Copy link
Copy Markdown
Contributor

Though, no idea how anthropic api reacts if supplied with max_tokens value greater than the model supports.

it returns 400 Bad Request

@johnugeorge
Copy link
Copy Markdown
Contributor

This might be a bit confusing if low default is observed by default. Isn't it better for the client to be aware of and use the right values? Are you using a client where it is not configurable?

@herewasmike
Copy link
Copy Markdown
Contributor Author

Yes, I'm connecting 3rd party code in this particular case so it's not easy to change the client's behavior.
I'll rework this PR following up the conversation on slack, though, to avoid the confusion

Simply allow requests to pass through and fail on the provider side (one would be able to detect it and mutate request body)

@herewasmike herewasmike force-pushed the openai_awsanthropic_translation branch from 284e5be to b5efeb0 Compare March 18, 2026 23:02
@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Mar 18, 2026
@herewasmike herewasmike force-pushed the openai_awsanthropic_translation branch from b5efeb0 to 284e5be Compare March 18, 2026 23:04
@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:XXL This PR changes 1000+ lines, ignoring generated files. labels Mar 18, 2026
Signed-off-by: Mikhail Toldov <matoldov@gmail.com>
Signed-off-by: Mikhail Toldov <matoldov@gmail.com>
@herewasmike herewasmike force-pushed the openai_awsanthropic_translation branch from 284e5be to 2810ba3 Compare March 18, 2026 23:06
@herewasmike
Copy link
Copy Markdown
Contributor Author

/retest

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:M This PR changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants