Conversation
Review

Architecture / Unintended Consequences
At a high level this PR is well-scoped. The public type, schema, conversion logic, and regression tests all move together. The only subtle area is the inferred `fit` behavior. I do not see broader architectural problems or unintended cross-package consequences in the actual diff.

Findings
No blocking findings in the corrected PR diff.

Residual Risk
The only minor gap I see is that the tests are centered on partial-config behavior. There is no targeted test for the `fit` path itself.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5e639b1c19
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
```ts
 * where 0 means no work is distributed and 1 means all work is distributed. Can also specify the
 * string "off" to mean 0 and the string "max" to mean 1.
 *
 * When `fit` is true, this field is ignored.
```
Debate: should this be ignored, or should we throw?
I do share the concern about this kind of implicit ignoring of parameters / implicit relationship between parameters here, and wonder if there's a better way to represent the relationship (maybe not).
Potential language-enforced mechanism for this general type of thing: a discriminated union where you can't set the other params if you set `fit`. I haven't thought through what the backwards-compat impact of doing this here would be, but the relationship itself does seem discriminatory?
In order to be backwards compatible, I think we have to ignore.
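For reference, the discriminated-union idea floated above could look roughly like this. The shape and field names are illustrative only, and, as noted in the thread, adopting it would not be backwards compatible:

```typescript
// Sketch of the discriminated union suggested in the thread: choosing
// "fit" makes the manual GPU knobs unrepresentable rather than ignored.
type GpuPlacement =
  | { mode: "fit" }                          // llama.cpp -fit decides placement
  | {
      mode: "manual";
      ratio?: number;                        // 0..1 share of work on GPU
      mainGpu?: number;
      splitStrategy?: "evenly" | "favorMainGpu";
    };

function describePlacement(p: GpuPlacement): string {
  // TypeScript narrows on `mode`, so the manual fields are only
  // visible in the branch where they are legal.
  return p.mode === "fit"
    ? "fit"
    : `manual ratio=${p.ratio ?? "default"}`;
}
```

The compiler then rejects `{ mode: "fit", ratio: 0.5 }` outright, so the "ignored parameter" situation cannot even be expressed.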
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 94c807417c
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1222ccd6ff
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
```
}
`;

exports[`LLM with default model fixture Can tokenize correctly 1`] = `
```
Remove stale snapshot keys after renaming heavy tests
The snapshot file now contains both the old `LLM ...` keys and the new `LLM with default model fixture ...` keys, but `LLM.heavy.test.ts` only defines the renamed tests (see the `describe("with default model fixture")` block), so running this heavy test file will report obsolete snapshots and fail until they are cleaned up. Keeping both sets of keys makes the snapshot suite unstable for anyone running heavy tests.
Useful? React with 👍 / 👎.
```
"identifier": Any<String>,
"indexedModelIdentifier": "lmstudio-community/Qwen2.5-0.5B-Instruct-GGUF/Qwen2.5-0.5B-Instruct-Q4_K_M.gguf",
"instanceReference": Any<String>,
"lastUsedTime": 1773374807159,
```
Stop snapshotting volatile last-used timestamps
This snapshot pins `lastUsedTime` to a concrete Unix timestamp, but `lastUsedTime` is inherently time-dependent model metadata, so future heavy-test runs will drift and fail even when behavior is correct. Add a property matcher (e.g. `expect.any(Number)` / `null`) for this field instead of hardcoding a specific value.
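The idiomatic Jest fix is a property matcher passed to `toMatchSnapshot`, e.g. `expect(info).toMatchSnapshot({ lastUsedTime: expect.any(Number) })`. An equivalent framework-free sketch that scrubs volatile fields before snapshotting (a hypothetical helper, not the repo's actual code):

```typescript
// Hypothetical sanitizer: replace time-dependent fields with stable
// placeholders so snapshot runs do not drift between executions.
const VOLATILE_KEYS = ["lastUsedTime"] as const;

function sanitizeForSnapshot(
  info: Record<string, unknown>,
): Record<string, unknown> {
  const copy = { ...info };
  for (const key of VOLATILE_KEYS) {
    if (typeof copy[key] === "number") {
      copy[key] = "Any<Number>"; // mirrors the Any<String> placeholders above
    }
  }
  return copy;
}
```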
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 45725f5624
```ts
config.gpu?.ratio !== undefined
  || config.gpu?.numCpuExpertLayersRatio !== undefined
  || config.gpu?.mainGpu !== undefined
  || config.gpu?.splitStrategy !== undefined;
```
Exclude default splitStrategy from fit inference
Inferring `fit=false` from any defined `gpu.splitStrategy` causes a regression when the strategy was auto-populated as the default (`"evenly"`) rather than explicitly chosen by the caller. Fresh evidence: starting from `{ gpu: { disabledGpus: [1] } }`, `llmLoadModelConfigToKVConfig` writes a split config, `kvConfigToLLMLoadModelConfig` reconstructs `gpu.splitStrategy: "evenly"`, and this condition then forces `llama.fit=false` on the next serialization, which unexpectedly overrides higher-priority defaults where `fit` should remain enabled (disabled GPUs are documented as fit-compatible).
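A minimal sketch of the suggested fix, assuming `"evenly"` is the auto-populated default. The interface and function names here are illustrative, not the SDK's actual symbols:

```typescript
// Sketch: only infer manual placement (and thus fit=false) from fields
// the caller could not have received via round-trip defaults.
interface GpuConfigLike {
  ratio?: number;
  numCpuExpertLayersRatio?: number;
  mainGpu?: number;
  splitStrategy?: "evenly" | "favorMainGpu";
  disabledGpus?: number[];
}

function impliesManualPlacement(gpu: GpuConfigLike | undefined): boolean {
  if (gpu === undefined) return false;
  return (
    gpu.ratio !== undefined
    || gpu.numCpuExpertLayersRatio !== undefined
    || gpu.mainGpu !== undefined
    // Treat the auto-populated default ("evenly") as "not explicitly set",
    // so round-tripping a config does not silently flip fit off.
    || (gpu.splitStrategy !== undefined && gpu.splitStrategy !== "evenly")
  );
}
```

One trade-off of this sketch: an *explicit* caller choice of `"evenly"` becomes indistinguishable from the default, so a robust fix may instead need to track explicitness separately (e.g. by not reconstructing defaulted fields in `kvConfigToLLMLoadModelConfig`).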
Add `llama.acceleration.fit` to the KV Config schema. This is a new boolean that will enable / disable llama.cpp's `-fit` model loading logic. When enabled, the other GPU parameters will be ignored.

To maintain compatibility with existing SDK usage, I set the default value in the schema to `false`. This allows a call such as `load({gpu: {ratio: 0.5}})` to still work as originally intended. I plan to default the standard GUI and CLI paths to `true` though, via explicit overrides in those locations.
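The compatibility default described above can be sketched as follows. The types and the `applySchemaDefaults` helper are hypothetical shapes, not the actual schema code:

```typescript
// Sketch: `fit` defaults to false in the schema, so existing calls like
// load({ gpu: { ratio: 0.5 } }) keep their original manual-ratio meaning,
// while GUI/CLI paths can pass an explicit fit: true override.
interface AccelerationConfig {
  fit: boolean;
  ratio?: number;
}

function applySchemaDefaults(
  partial: Partial<AccelerationConfig>,
): AccelerationConfig {
  return { fit: false, ...partial }; // explicit fields win over the default
}
```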