Replies: 7 comments 2 replies
What is the proper way to run multiple commands in succession? I've tried several config.yaml variants. One of them appears to concatenate the two commands and throws `curl: option --port: is unknown`. With another approach the previous model gets unloaded and the new one gets loaded, but the third iteration no longer works (loading yet another model does not unload the one that is already loaded).
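A minimal sketch of one way to do it, with placeholder paths, ports and model names (none of them from the thread): wrapping both commands in `/bin/sh -c` lets the shell split them, so llama-server's `--port` flag never bleeds into curl's arguments. If llama-swap's cmd parsing doesn't keep the quoted string intact, the same two lines can go into a small wrapper script instead (see the sketch at the end of the thread).

```yaml
models:
  "some-gguf-model":          # hypothetical model entry
    # The shell runs the two commands as separate processes; the Tabby
    # unload call here omits any auth header (see the curl example below).
    cmd: >
      /bin/sh -c "curl -s -X POST http://127.0.0.1:5000/v1/model/unload;
      exec llama-server -m /models/some_model.gguf --port 9503"
    proxy: "http://127.0.0.1:9503"
```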
Is llama-swap not able to unload the Tabby model for you?

No, llama-swap doesn't unload TabbyAPI: Tabby needs a curl POST unload call, as in my example above.
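For reference, a minimal form of that unload call, assuming TabbyAPI's default port and that the admin key travels in an `x-admin-key` header (both are assumptions, not confirmed in the thread):

```sh
# Hypothetical host/port; the admin key header name is an assumption.
curl -s -X POST http://127.0.0.1:5000/v1/model/unload \
  -H "x-admin-key: $TABBY_ADMIN_KEY"
```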
I'm confused. Are you starting TabbyAPI with llama-swap? It sounds like you're running it outside of llama-swap and trying to send a signal to stop it.

We're talking in two places; I just saw #58 (comment). Can you share your whole llama-swap config? It may shed some light on what's going on and why Tabby can't be stopped by llama-swap.

Sure, here's the current config.yaml:

Yes, I saw those discussions, and it seems that cmdStop-ing a container might work. I will probably have to consider this "nuclear option" if there's no hope of sending a clean unload API call. Is there? :-)
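For comparison, a sketch of what that "nuclear option" could look like, assuming TabbyAPI runs in a container; the image, container name and ports are placeholders:

```yaml
models:
  "tabby-exl2":                 # hypothetical model entry
    # Start TabbyAPI in a named container so it can be stopped by name.
    cmd: docker run --rm --name tabbyapi -p 5000:5000 my-tabbyapi-image
    proxy: "http://127.0.0.1:5000"
    # cmdStop runs when llama-swap swaps this model out; stopping the
    # container is the blunt alternative to a clean /v1/model/unload call.
    cmdStop: docker stop tabbyapi
```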
I use TabbyAPI for exl2 models and llama-server for GGUF. Currently I need to manually unload a Tabby model before I can load a llama-server one.

Is it possible to configure llama-swap to send a `POST /v1/model/unload` request before it sends the request that starts a new `llama-server -m some_model.gguf`?
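One way to get that behaviour, sketched here with placeholder host, port, paths and key: point the GGUF model's `cmd` at a small wrapper script that first fires the unload request at TabbyAPI and then execs llama-server.

```sh
#!/bin/sh
# start-gguf.sh -- hypothetical wrapper used as the model's cmd in
# llama-swap's config.yaml. Host, port, model path and key are placeholders.

# Ask TabbyAPI to drop whatever it has loaded; ignore failures so the
# GGUF model still starts when Tabby isn't running.
curl -s -X POST http://127.0.0.1:5000/v1/model/unload \
  -H "x-admin-key: $TABBY_ADMIN_KEY" || true

# Replace the shell with llama-server so llama-swap supervises it directly.
exec llama-server -m /models/some_model.gguf --port 9503
```

Using `exec` keeps llama-swap supervising llama-server itself rather than an intermediate shell, which makes stopping the model on the next swap straightforward.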