[Feature request] Never automatically unload models #749

@nomonkeynodeal

The basic function of llama-swap is to swap models, but it's more than that now. It provides a really nice web UI for loading and unloading models without copy-pasting commands. It also has a very under-appreciated pre-made Docker image, better than Ollama. I have not found anything with comparable functionality.

The automatic unloading every single time I accidentally make a call to the wrong model is a major pain point. Loading a model on Strix Halo can take 1-2 minutes.

Getting a helper model to simply load at the same time as another model is always a fight. The groups are borderline impossible to understand. (I took one look at the Matrix letter jumbles and my brain rejected the entire scheme. I know from experience this is never going to happen.)

The two-configs-in-one-file paradigm (a second list of models in a group config) is very difficult to manage. I change which models I use every day. It's hard enough getting each individual entry right without also managing a second list, scrolling up and down, and manually syncing model names.

Maybe there should just be a simple way to stop the constant unloading of models and allow simple manual operation?
