feat: allow customizing context window and max output tokens #1189
3kin0x wants to merge 9 commits into
Conversation
Blockers: None found.
Non-Blocking: Looks Good
Verdict: Approve — clean feature addition for provider customization.
Vasanthdev2004
left a comment
Clean feature addition for provider customization.
Thank you very much for the feedback.

Indeed, my first consideration: using devstral-small-2, once I reach 28% of used context I systematically get "compacting conversation", so I thought this change could help.
@Vasanthdev2004 thanks :) Regarding the scope of these parameters: applying them to all OpenAI-compatible routes provides immediate flexibility for the majority of custom and local use cases (like Ollama or specialized proxies). The current architecture is designed so that we can easily add provider-specific overrides (e.g., `ANTHROPIC_CONTEXT_WINDOW_SIZE`) in the future if needed, without breaking the existing profile structure or the persistence layer.
gnanam1990
left a comment
Thanks for this — per-profile context window / max output tokens is a genuinely useful capability, especially for local models, and the persistence test coverage is solid. A few things to address before it can merge:
- Unrelated files / scope creep. `src/services/api/utils.ts` (168 lines) and `src/utils/ripgrep/errors.ts` are brand-new files that don't exist on `main` and aren't imported anywhere in this PR — they look like they were pulled in from another branch. Could you drop them so the diff is just the provider-profile feature? Same goes for the `knowledgeGraph.stress.test.ts` CI-skip, which disables an unrelated regression test.
- CI needs to be green. You mentioned CI keeps failing — once the unrelated files/test changes are removed, the failures should narrow to something tractable. Happy to help dig in once it's scoped down.
- Env-key scoping. In `buildStartupProfileFromActiveProfile`, `OPENAI_CONTEXT_WINDOW_SIZE`/`OPENAI_MAX_OUTPUT_TOKENS` get written for every compatibility mode (anthropic/gemini/mistral/etc.), but `context.ts` only reads the `OPENAI_*` names. Could you either gate the write to OpenAI-compatible routes or use a provider-neutral key so the behavior matches the documented intent?
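The two options in the env-key-scoping point could look roughly like the sketch below. The mode names, profile shape, and function signature are illustrative assumptions, not the project's actual code; only the env-var names come from the review itself.

```typescript
// Hypothetical helper showing both remedies the reviewer suggests:
// (A) gate the OPENAI_* writes to the mode that reads them, or
// (B) emit provider-neutral keys for every other mode.
type CompatibilityMode = "openai" | "anthropic" | "gemini" | "mistral";

interface ProfileLimits {
  contextWindowSize?: number;
  maxOutputTokens?: number;
}

function buildLimitEnv(
  mode: CompatibilityMode,
  limits: ProfileLimits,
): Record<string, string> {
  const env: Record<string, string> = {};
  if (mode === "openai") {
    // Option A: OPENAI_* keys only where context.ts actually reads them.
    if (limits.contextWindowSize !== undefined) {
      env.OPENAI_CONTEXT_WINDOW_SIZE = String(limits.contextWindowSize);
    }
    if (limits.maxOutputTokens !== undefined) {
      env.OPENAI_MAX_OUTPUT_TOKENS = String(limits.maxOutputTokens);
    }
    return env;
  }
  // Option B: provider-neutral keys for all other compatibility modes.
  if (limits.contextWindowSize !== undefined) {
    env.CLAUDE_CODE_MAX_CONTEXT_TOKENS = String(limits.contextWindowSize);
  }
  if (limits.maxOutputTokens !== undefined) {
    env.CLAUDE_CODE_MAX_OUTPUT_TOKENS = String(limits.maxOutputTokens);
  }
  return env;
}
```

Either branch keeps the behavior consistent with what the profile UI documents; the neutral-key branch is what the PR ultimately adopted, per the later review round.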
Once it's narrowed to the feature with green CI, this should be a quick second pass. Thanks for the contribution — the core idea is good.
…or to neutral env keys
Force-pushed from 35ca373 to 4153036.
Thanks @gnanam1990 for the detailed feedback! I've updated the PR to address all points:

Ready for a second pass!
jatmn
left a comment
Findings

- [P2] Restore persisted token limits during legacy profile launch (`src/utils/providerProfile.ts:1474`)
  The active-profile path writes `CLAUDE_CODE_MAX_CONTEXT_TOKENS` and `CLAUDE_CODE_MAX_OUTPUT_TOKENS` into `.openclaude-profile.json`, but `buildStartupEnvFromProfile`/`buildLaunchEnv` rebuilds the launch env from the persisted file without carrying those two env values forward. I verified this with a persisted OpenAI profile containing both keys: the startup env kept `OPENAI_BASE_URL` and `OPENAI_MODEL`, but both token-limit env vars came back `undefined`. That means the documented startup fallback silently loses the new settings in the legacy profile-file path, and the new persistence test only checks that the file is written, not that it is read back into the launch env. Please copy the persisted token-limit keys into the rebuilt env for all applicable profile modes, and add a test that `buildStartupEnvFromProfile` preserves them.
- [P2] Remove the unrelated KnowledgeGraph CI skip (`src/utils/knowledgeGraph.stress.test.ts:111`)
  The PR still changes `knowledgeGraph.stress.test.ts` and skips the corrupted-Orama recovery test on CI. That was called out in the earlier review as unrelated scope, and the author follow-up says it was restored, but the diff still disables this regression coverage. Since this PR is about provider profile token limits, please drop the KnowledgeGraph test change or move it to a separate PR with its own justification.
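The carry-forward requested in the first finding could be sketched as follows. The persisted-profile shape and the `pickExisting` helper are assumptions for illustration; only the function name `buildStartupEnvFromProfile` and the env-var names come from the review.

```typescript
// Hypothetical sketch: copy the persisted token-limit keys into the
// rebuilt launch env so the legacy profile-file path preserves them.
const TOKEN_LIMIT_KEYS = [
  "CLAUDE_CODE_MAX_CONTEXT_TOKENS",
  "CLAUDE_CODE_MAX_OUTPUT_TOKENS",
] as const;

interface PersistedProfile {
  env: Record<string, string>;
}

function buildStartupEnvFromProfile(
  profile: PersistedProfile,
): Record<string, string> {
  return {
    // ...existing rebuild logic (base URL, model, etc.) stands in here...
    ...pickExisting(profile.env, ["OPENAI_BASE_URL", "OPENAI_MODEL"]),
    // Carry the token-limit keys forward for every applicable profile
    // mode, so the documented startup fallback keeps the new settings.
    ...pickExisting(profile.env, TOKEN_LIMIT_KEYS),
  };
}

function pickExisting(
  source: Record<string, string>,
  keys: readonly string[],
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const key of keys) {
    if (source[key] !== undefined) out[key] = source[key];
  }
  return out;
}
```

A read-back test of the kind the reviewer asks for would then build the env from a persisted profile containing both keys and assert they survive, rather than only asserting the file was written.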
gnanam1990
left a comment
Thanks for the quick turnaround — the env-key scoping is now provider-neutral and applied across all compatibility modes as intended, and dropping the unrelated `src/services/api/utils.ts` / `src/utils/ripgrep/errors.ts` changes tidied the diff up nicely. The core feature is in good shape; a few items before it can go green:
- TypeScript compile error (this is almost certainly the failing CI you were seeing). In `src/components/ProviderManager.tsx:1154-1166`, `startCreateFromPreset` builds `nextDraft` as an inline literal that's missing the new `contextWindowSize`/`maxOutputTokens` fields you added to `ProviderDraft`, so `setDraft(nextDraft)` (1166) and `canUseStreamlinedPresetFlow(nextDraft)` (1184) fail `tsc --noEmit` with TS2345. You correctly updated the `toDraft` and `presetToDraft` helpers — this preset path just needs the same two fields (`''`). `bun run build` won't catch it since the bundler doesn't type-check, but `bun x tsc --noEmit` and CI do.
- I agree with @jatmn's two points: the legacy `buildLaunchEnv` path doesn't carry the persisted token-limit keys forward (a carry-forward plus a read-back test would cover it), and the `knowledgeGraph.stress.test.ts` CI-skip is still in the diff despite being out of scope here — please drop it.
- One question, not a blocker: `CLAUDE_CODE_MAX_OUTPUT_TOKENS` flows through `validateBoundedIntEnvVar` and is capped at the model's real limit, but `CLAUDE_CODE_MAX_CONTEXT_TOKENS` in `getContextWindowForModel` (`context.ts:87`) has no upper bound and overrides all detection including the runtime caps. The wizard rejects zero/negative but not an unrealistically large value, which would suppress auto-compact and risk provider context-overflow errors. Would it make sense for the context-window value to follow the same bounded-validation pattern as max output tokens? A one-line help-text note that a shell-exported `CLAUDE_CODE_MAX_CONTEXT_TOKENS` now applies to all users (since the previous internal-only gate was removed) would also help.
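The bounded-validation pattern the last bullet asks about could look like this sketch. The helper's signature and the model limit are illustrative assumptions; only the name `validateBoundedIntEnvVar` and the env-var names come from the review.

```typescript
// Hypothetical version of the bounded-int validation the reviewer
// describes: reject zero/negative/garbage, clamp to the model's real cap.
function validateBoundedIntEnvVar(
  raw: string | undefined,
  min: number,
  max: number,
): number | undefined {
  if (raw === undefined) return undefined;
  const value = Number.parseInt(raw, 10);
  // Reject non-numeric, zero, and negative values outright.
  if (!Number.isFinite(value) || value < min) return undefined;
  // Clamp instead of trusting an unrealistically large override, so a
  // huge shell-exported value cannot silently suppress auto-compact.
  return Math.min(value, max);
}

// Applying the same bound to the context window, with an illustrative cap:
const MODEL_CONTEXT_LIMIT = 200_000;
const contextWindow = validateBoundedIntEnvVar(
  process.env.CLAUDE_CODE_MAX_CONTEXT_TOKENS,
  1,
  MODEL_CONTEXT_LIMIT,
);
```

With this shape, an oversized `CLAUDE_CODE_MAX_CONTEXT_TOKENS` degrades to the model's real limit rather than overriding all detection, which is the failure mode the reviewer flags.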
Once the tsc error is fixed and @jatmn's two scope items are resolved, this should be a quick final pass. Genuinely useful feature for local models — appreciate the contribution.
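For the TS2345 item, the shape of the fix is just completing the inline literal. The `ProviderDraft` interface below is a guessed reconstruction (field set and `''` defaults are assumptions); only the type name, the two new fields, and `startCreateFromPreset` come from the review.

```typescript
// Hypothetical ProviderDraft with the two new fields the PR added.
interface ProviderDraft {
  name: string;
  baseUrl: string;
  model: string;
  contextWindowSize: string; // new in this PR
  maxOutputTokens: string;   // new in this PR
}

// The preset path's inline literal must include the new fields, else
// tsc --noEmit rejects it with TS2345 at the setDraft call site.
function startCreateFromPreset(preset: {
  name: string;
  baseUrl: string;
  model: string;
}): ProviderDraft {
  return {
    name: preset.name,
    baseUrl: preset.baseUrl,
    model: preset.model,
    contextWindowSize: "", // previously missing
    maxOutputTokens: "",   // previously missing
  };
}
```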
Hello, thanks @jatmn and @gnanam1990 for the detailed feedback! I have applied the following corrections:
Summary
What changed:
Why it changed:
User-facing impact:
Developer/maintainer impact:
Testing
Notes