The default max_tokens value (2048) is hardcoded in the schema. Different deployment scenarios need different defaults — for example, reasoning models work better with higher limits, while cost-sensitive deployments may want a lower default.
RooCode has a setting for Maximum Output Tokens. RooCodeInc/Roo-Code#4036 suggests this was fixed in June 2025, but the headers are not passed through.
Additionally, the OpenAI API now has both max_tokens and max_completion_tokens parameters, with max_completion_tokens being preferred. The current code has duplicate logic for handling these two fields.
Proposed solution
- Add a DEFAULT_MAX_TOKENS environment variable in setting.py (default: 2048, to preserve existing behaviour)
- Use it as the schema default for max_tokens
- Compute effective_max_tokens in _parse_request so that max_completion_tokens is preferred over max_tokens, eliminating the duplicate logic:
effective_max_tokens = (
    chat_request.max_completion_tokens
    if chat_request.max_completion_tokens is not None
    else chat_request.max_tokens
)
inference_config = {"maxTokens": effective_max_tokens}
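The precedence rule can be tried in isolation. The following is a minimal sketch, not the project's code: `ChatRequest` is a hypothetical dataclass standing in for the real request schema, and `resolve_max_tokens` is an illustrative helper name. It uses an explicit `is not None` check, as in the snippet above, so the fallback to max_tokens only happens when max_completion_tokens is genuinely unset.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChatRequest:
    # Hypothetical stand-in for the real request schema.
    max_tokens: Optional[int] = 2048
    max_completion_tokens: Optional[int] = None


def resolve_max_tokens(chat_request: ChatRequest) -> Optional[int]:
    """Prefer max_completion_tokens; fall back to max_tokens when it is unset."""
    if chat_request.max_completion_tokens is not None:
        return chat_request.max_completion_tokens
    return chat_request.max_tokens


if __name__ == "__main__":
    # Only max_tokens supplied: it is used as-is.
    print(resolve_max_tokens(ChatRequest(max_tokens=100)))
    # Both supplied: max_completion_tokens takes precedence.
    print(resolve_max_tokens(ChatRequest(max_tokens=100, max_completion_tokens=500)))
```

Keeping the resolution in one place means the inference config is built from a single value, whichever field the client sent.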
# Example: raise default for reasoning models
export DEFAULT_MAX_TOKENS=16384
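One way the exported variable might be read on the Python side is sketched below. This is an illustration under stated assumptions: `read_default_max_tokens` is a hypothetical helper, and falling back to 2048 for missing, non-numeric, or non-positive values is an added safeguard rather than part of the proposal.

```python
import os
from typing import Mapping, Optional


def read_default_max_tokens(
    env: Optional[Mapping[str, str]] = None, fallback: int = 2048
) -> int:
    """Return DEFAULT_MAX_TOKENS from the environment, or the fallback.

    Assumption: unset, empty, non-numeric, or non-positive values all
    fall back to the existing default instead of raising.
    """
    env = os.environ if env is None else env
    raw = env.get("DEFAULT_MAX_TOKENS")
    if raw is None or not raw.strip():
        return fallback
    try:
        value = int(raw)
    except ValueError:
        return fallback
    return value if value > 0 else fallback


# Evaluated once at import time, in the style of a settings module.
DEFAULT_MAX_TOKENS = read_default_max_tokens()
```

The schema could then use `DEFAULT_MAX_TOKENS` as the default for max_tokens, so deployments that do not set the variable see no change in behaviour.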
Files: src/api/setting.py, src/api/schema.py, src/api/models/bedrock.py