Fix: eviction_engine does not run nvidia-smi even on AMD configs anymore and some testing problems fixed. #2331

Open
GabrielReusRodriguez wants to merge 1 commit into lemonade-sdk:main from GabrielReusRodriguez:Fix-Fix-server_eviction.py-had-an-inexistent-model-to-pull

@GabrielReusRodriguez

Hi!

I found some bugs in the eviction engine.

First, my setup is all AMD hardware, and when I launched lemonade I saw this line repeated all the time:

sh 1: nvidia-smi not found.

Looking at the code, I saw that the function double SystemInfo::get_global_vram_usage_pct() in system_info.cpp always executed the call to nvidia-smi, so on an AMD-only config the program is not found.

I fixed it by adding a get_cuda_arch().empty() check to see whether the config has any NVIDIA architecture. The nvidia-smi code only runs if it does.
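
The idea behind the guard can be sketched in a few lines. This is written in Python purely for illustration (the actual change is in the C++ file system_info.cpp). The names vram_usage_pct and query_nvidia_smi are hypothetical, and the 0.0 fallback is an assumption, since this description does not say what the real function returns on non-NVIDIA hardware.

```python
from typing import Callable


def vram_usage_pct(cuda_arch: str, query_nvidia_smi: Callable[[], float]) -> float:
    """Return VRAM usage in percent, querying nvidia-smi only on NVIDIA setups.

    cuda_arch stands in for the result of get_cuda_arch(); an empty string
    means no NVIDIA architecture was detected.
    """
    if not cuda_arch:
        # No NVIDIA GPU detected: skip the nvidia-smi call entirely, so the
        # shell never reports that the program is missing.
        # The 0.0 fallback is an assumption made for this sketch.
        return 0.0
    # query_nvidia_smi is a hypothetical stand-in for spawning nvidia-smi
    # and parsing its output.
    return query_nvidia_smi()
```

Passing the query as a callable keeps the sketch testable without a GPU: with an empty cuda_arch the callable is never invoked, which is the behaviour the fix is after.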

Second, I tried the server_eviction.py tests, but I found some errors.

  • The script loads a test model, phi-3-mini-4k-instruct-q4, that does not exist in lemonade, so it failed. I replaced it with the equivalent Phi-4-mini-instruct-GGUF.
  • The tests usually load 2 models but did not change the default config, so the server always had max_models_loaded = 1 and failed when the second model was loaded.
  • Finally, in the test def test_weight_factor_protects_model(self): I found that it loads the first model (with weight), sleeps 2 seconds, and loads the second one. It always failed and unloaded the one with weight. Looking at the code and doing some manual testing, I found that when the second model was loaded, the first model had been on the server for about 3 seconds and the second for about 3 ms, so the weight value of 1000 made no difference. The solution I came up with was to add another sleep after the second model load, and when I ran it the test passed: it unloads the second one.
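
The timing problem in the last point can be checked with a little arithmetic. The sketch below assumes, hypothetically, that the eviction score is the time a model has been on the server divided by its weight, and that the model with the highest score is evicted. The real formula in the eviction engine is not shown in this description, and the ages and the 2 second extra sleep are illustrative values based on the rough figures above.

```python
def eviction_score(age_seconds: float, weight: float) -> float:
    """Hypothetical score: a longer age raises it, a larger weight lowers it."""
    return age_seconds / weight


def pick_victim(models: dict[str, tuple[float, float]]) -> str:
    """Return the name of the model with the highest eviction score.

    models maps a model name to (age_seconds, weight).
    """
    return max(models, key=lambda name: eviction_score(*models[name]))


# Right after the second load: the weighted model is about 3 s old and the
# new one about 3 ms old, so a weight of 1000 roughly cancels out.
before_extra_sleep = {"weighted": (3.2, 1000.0), "plain": (0.003, 1.0)}

# After an extra 2 s sleep both ages grow by 2 s, and the weight now
# clearly protects the first model.
after_extra_sleep = {"weighted": (5.2, 1000.0), "plain": (2.003, 1.0)}

print(pick_victim(before_extra_sleep))  # weighted
print(pick_victim(after_extra_sleep))   # plain
```

Under these assumptions the scores right after the second load are 0.0032 and 0.003, so the weighted model can still lose. After the extra sleep they are 0.0052 and 2.003, and the unweighted model is evicted as the test expects.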

I hope these changes help.

github-actions bot added the bug and enhancement labels on Jun 21, 2026