-
Notifications
You must be signed in to change notification settings - Fork 50
Move more normalization into the model-runner backend #335
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Summary of ChangesHello @ericcurtin, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request resolves an issue within the Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request fixes an issue where model IDs (like sha256 hashes) were being incorrectly normalized when using model-cli rm. The changes in cmd/cli/commands/rm.go and cmd/cli/desktop/desktop.go correctly differentiate between model names and model IDs, ensuring that IDs are not normalized. The logic is sound. I've added one suggestion to refactor a part of the code in desktop.go to improve readability and reduce duplication.
c2d5b36 to
90163b4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was testing this PR and I faced a similar issue when packaging a model:
MODEL_RUNNER_HOST=http://localhost:13434 docker model ls
MODEL NAME PARAMETERS QUANTIZATION ARCHITECTURE MODEL ID CREATED CONTEXT SIZE
MODEL_RUNNER_HOST=http://localhost:13434 docker model pull ai/smollm2:latest
Downloaded 270.60MB of 270.60MB
Model pulled successfully
MODEL_RUNNER_HOST=http://localhost:13434 docker model ls
MODEL NAME PARAMETERS QUANTIZATION ARCHITECTURE MODEL ID CREATED CONTEXT SIZE
smollm2 361.82 M IQ2_XXS/Q4_K_M llama 354bf30d0aa3 7 months ago - 256.35 MiB
MODEL_RUNNER_HOST=http://localhost:13434 docker model package --from ai/smollm2 --context-size 2048 smollm2:2k
INFO[0000] Successfully initialized store
Reading model from store: "ai/smollm2"
INFO[0000] Getting model by reference: ai/smollm2
Creating builder from existing model
Setting context size 2048
Creating lightweight model variant...
INFO[0000] Writing lightweight model variant
Model variant created successfully
Loading model to Model Runner...
Transferred: 0.01 MB
Failed to package model
package model: failed to load packaged model: tag model: tagging failed with status 404 Not Found: tagging model: model not found
MODEL_RUNNER_HOST=http://localhost:13434 docker model ls
MODEL NAME PARAMETERS QUANTIZATION ARCHITECTURE MODEL ID CREATED CONTEXT SIZE
smollm2 361.82 M IQ2_XXS/Q4_K_M llama 354bf30d0aa3 7 months ago - 256.35 MiB
smollm2:2k 361.82 M IQ2_XXS/Q4_K_M llama 7522a58ecdbd 7 months ago 2048 256.35 MiB
At this point the store is corrupted and I can not remove the model:
MODEL_RUNNER_HOST=http://localhost:13434 docker model rm smollm2:2k
no such model: ai/smollm2:2k
Failed to remove model
MODEL_RUNNER_HOST=http://localhost:13434 docker model ls
MODEL NAME PARAMETERS QUANTIZATION ARCHITECTURE MODEL ID CREATED CONTEXT SIZE
smollm2:2k 361.82 M IQ2_XXS/Q4_K_M llama 7522a58ecdbd 7 months ago 2048 256.35 MiB
MODEL_RUNNER_HOST=http://localhost:13434 docker model rm 7522a58ecdbd
no such model: sha256:7522a58ecdbda6cdeea5480bebd456329e65fc4bec7cd9484193a2813d4b3413
Failed to remove model
90163b4 to
61e5f4e
Compare
|
@ilopezluna ready for re-review |
61e5f4e to
8e78243
Compare
|
9832fe2 to
e4338c3
Compare
|
The And something's odd with We lack tests for these. |
|
inspect seems to work with ai/ , it's a case of not enough normalisation there, the other needs to be investigated |
|
Tomorrows problem at this point |
e4338c3 to
c958a3f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New security issues found
c958a3f to
ea07b4a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New security issues found
ea07b4a to
aaf52a6
Compare
|
@sourcery-ai please re-review |
|
Sure! I'm generating a new review now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
Blocking issues:
- Detected directly writing or similar in 'http.ResponseWriter.write()'. This bypasses HTML escaping that prevents cross-site scripting vulnerabilities. Instead, use the 'html/template' package and render data using 'template.Execute()'. (link)
- Detected 'printf' or similar in 'http.ResponseWriter.write()'. This bypasses HTML escaping that prevents cross-site scripting vulnerabilities. Instead, use the 'html/template' package to render data to users. (link)
General comments:
- The partial name matching logic is duplicated in multiple handlers—extract it into a shared helper to avoid inconsistencies and reduce code churn.
- handlePackageModel only supports lightweight, config‐only repackaging; consider adding support for full rebuilds or explicitly documenting why non-config changes aren’t allowed.
- Now that the CLI stops auto-adding a ":latest" tag, verify that the backend still applies default tags for non-HF models and adjust tests or documentation to reflect this behavior.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The partial name matching logic is duplicated in multiple handlers—extract it into a shared helper to avoid inconsistencies and reduce code churn.
- handlePackageModel only supports lightweight, config‐only repackaging; consider adding support for full rebuilds or explicitly documenting why non-config changes aren’t allowed.
- Now that the CLI stops auto-adding a ":latest" tag, verify that the backend still applies default tags for non-HF models and adjust tests or documentation to reflect this behavior.
## Individual Comments
### Comment 1
<location> `pkg/inference/models/manager.go:727` </location>
<code_context>
w.Write([]byte(fmt.Sprintf("Model tagged successfully with %q", html.EscapeString(target))))
</code_context>
<issue_to_address>
**security (go.lang.security.audit.xss.no-direct-write-to-responsewriter):** Detected directly writing or similar in 'http.ResponseWriter.write()'. This bypasses HTML escaping that prevents cross-site scripting vulnerabilities. Instead, use the 'html/template' package and render data using 'template.Execute()'.
*Source: opengrep*
</issue_to_address>
### Comment 2
<location> `pkg/inference/models/manager.go:727` </location>
<code_context>
w.Write([]byte(fmt.Sprintf("Model tagged successfully with %q", html.EscapeString(target))))
</code_context>
<issue_to_address>
**security (go.lang.security.audit.xss.no-printf-in-responsewriter):** Detected 'printf' or similar in 'http.ResponseWriter.write()'. This bypasses HTML escaping that prevents cross-site scripting vulnerabilities. Instead, use the 'html/template' package to render data to users.
*Source: opengrep*
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
Hey @ericcurtin, I've posted a new review for you! |
model-runner daemon is more well equipped to identify between model name and model id. Signed-off-by: Eric Curtin <[email protected]>
aaf52a6 to
dbc3cbb
Compare
|
@doringeman @ilopezluna PTAL |
|
@sourcery-ai please re-review |
|
Sure! I'm generating a new review now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- Consolidate the partial name matching and ID fallback logic into a shared resolver to reduce duplication between handlers and simplify maintenance.
- Listing all models on every request for name/ID resolution can be expensive—consider caching the model list or delegating partial-match filtering into the distribution client.
- handlePackageModel only supports config-only (lightweight) repackaging—either extend support for full rebuilds or explicitly error out/document unsupported layer changes.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consolidate the partial name matching and ID fallback logic into a shared resolver to reduce duplication between handlers and simplify maintenance.
- Listing all models on every request for name/ID resolution can be expensive—consider caching the model list or delegating partial-match filtering into the distribution client.
- handlePackageModel only supports config-only (lightweight) repackaging—either extend support for full rebuilds or explicitly error out/document unsupported layer changes.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
Hey @ericcurtin, I've posted a new review for you! |
| // Extract the model name without tag part (e.g., from "ai/smollm2:latest" get "ai/smollm2") | ||
| tagWithoutVersion := tag | ||
| if idx := strings.LastIndex(tag, ":"); idx != -1 { | ||
| tagWithoutVersion = tag[:idx] | ||
| } | ||
|
|
||
| // Get just the name part without organization (e.g., from "ai/smollm2" get "smollm2") | ||
| namePart := tagWithoutVersion | ||
| if idx := strings.LastIndex(tagWithoutVersion, "/"); idx != -1 { | ||
| namePart = tagWithoutVersion[idx+1:] | ||
| } | ||
|
|
||
| // Check if the ID matches the name part | ||
| if namePart == id { | ||
| return m.ID, nil | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can use name.ParseReference(tag) to get same results:
| // Extract the model name without tag part (e.g., from "ai/smollm2:latest" get "ai/smollm2") | |
| tagWithoutVersion := tag | |
| if idx := strings.LastIndex(tag, ":"); idx != -1 { | |
| tagWithoutVersion = tag[:idx] | |
| } | |
| // Get just the name part without organization (e.g., from "ai/smollm2" get "smollm2") | |
| namePart := tagWithoutVersion | |
| if idx := strings.LastIndex(tagWithoutVersion, "/"); idx != -1 { | |
| namePart = tagWithoutVersion[idx+1:] | |
| } | |
| // Check if the ID matches the name part | |
| if namePart == id { | |
| return m.ID, nil | |
| } | |
| // Try to parse both the stored tag and the input ID | |
| tagRef, err1 := name.ParseReference(tag) | |
| idRef, err2 := name.ParseReference(id) | |
| if err1 != nil || err2 != nil { | |
| // fallback to string comparison if parsing fails | |
| if tag == id { | |
| return m.ID, nil | |
| } | |
| continue | |
| } | |
| // Compare normalized names (without tag or digest) | |
| if tagRef.Context().Name() == idRef.Context().Name() { | |
| return m.ID, nil | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I think this change also fixes the case were name and tag are specified but not the repository:
MODEL_RUNNER_HOST=http://localhost:13434 docker model inspect ai/smollm2:latest
{
"id": "sha256:354bf30d0aa3af413d2aa5ae4f23c66d78980072d1e07a5b0d776e9606a2f0b9",
"tags": [
"ai/smollm2:latest"
],
"created": 1742816981,
"config": {
"format": "gguf",
"quantization": "IQ2_XXS/Q4_K_M",
"parameters": "361.82 M",
"architecture": "llama",
"size": "256.35 MiB"
}
}
MODEL_RUNNER_HOST=http://localhost:13434 docker model inspect smollm2:latest
Failed to get model smollm2:latest: invalid model name: smollm2:latest
```
| } else { | ||
| apiModel, err = getLocalModel(m, modelName) | ||
| // For local lookups, first try without normalization (as ID), then with normalization | ||
| apiModel, err = getLocalModel(m, modelRef) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding these extra steps to handle different model retrieval cases introduces additional complexity and some performance impact (even if minimal).
In my opinion, the benefit of not having to specify the namespace does not justify these extra steps across all commands.
I understand that the feature is already implemented, so I’m fine with keeping it as it is for now. However, we should aim to improve this feature eventually.
model-runner daemon is more well equipped to identify between
model name and model id.