Replies: 2 comments
-
|
I installed Ubuntu 22.04 and ran into the same issue: the GPU is detected but not checked off in the list. However, I installed a small AI model, ran watch -n 1 nvidia-smi in a terminal, started a conversation with the AI, and watched GPU usage shoot up into the 90s, so it is obviously working. I guess the installer still has some bugs. Hope this helps others seeing the same issue. |
-
|
I have dealt with Ubuntu/Debian driver issues for the last few years, and I would be more than happy to help or at least offer suggestions.

One of the issues I had when I installed Nomad on my secondary AI machine: when the install script ran, the Docker portion forced an upgrade of the drivers. This caused my cards to stop showing up at all under nvidia-smi, because the driver was no longer in tune with the kernel. The drivers it installed were not ones that supported my system's setup, which requires CUDA compute capability 8.6 and 12.0 (RTX 3090 & RTX 5080). For your 3060 you have to make sure 8.6 is supported by the driver (570 or 580, proprietary version). 590 may work. Also, verify the version of CUDA reported by nvidia-smi: is it 12.x or 13.x?

The kernel being out of whack was the second problem, after the install script installed the new CUDA drivers. This required updating the headers and then rebuilding the kernel (this also worked on Debian 13 on my main AI PC, which has a mixture of Blackwell and Ampere architecture cards).

I AM NOT AN EXPERT, PLEASE RESEARCH THIS ONE BEFORE YOU TRY IT OR MAKE SURE YOU KNOW HOW TO GET BACK TO A PREVIOUS KERNEL IN THE GRUB MENU! Just my caveat and disclaimer lol. If there is one thing I have learned, it is that nothing in life or tech "just works" like people say, especially when it's free ;-).

In the terminal, try these. They just look for data; they don't change anything:
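The command list from this post did not survive here, so what follows is only a sketch of read-only checks of that kind, not the original commands. The numbering (1 for the running kernel, 2 for the DKMS module status) is an assumption chosen to line up with the comparison described next, and the script assumes the NVIDIA driver was installed as a DKMS package.

```shell
#!/bin/sh
# Read-only checks: nothing below installs, removes, or edits anything.
# ASSUMPTION: these are typical diagnostics, not the original post's list.

# 1. Kernel you are currently booted into.
running_kernel=$(uname -r)
echo "1. Running kernel: $running_kernel"

# 2. Kernels the NVIDIA module has been built for (DKMS installs only).
echo "2. DKMS status for nvidia:"
if command -v dkms >/dev/null 2>&1; then
    dkms status 2>/dev/null | grep -i nvidia || echo "   no nvidia module registered"
else
    echo "   dkms is not installed"
fi

# 3. Is an nvidia kernel module loaded right now?
if [ -r /proc/modules ] && grep -q '^nvidia' /proc/modules; then
    echo "3. nvidia module is loaded"
else
    echo "3. nvidia module is NOT loaded"
fi

# 4. Driver and CUDA versions as reported by the driver itself.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi || echo "4. nvidia-smi failed (a driver/kernel mismatch is one possible cause)"
else
    echo "4. nvidia-smi not found"
fi
```

If check 1 shows a kernel that does not appear anywhere in check 2, that points at the mismatch described in this comment.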
What you are looking for is whether you booted into a new kernel but the new NVIDIA modules didn't load. Compare the kernel version from 1. above with what is listed in number 2 to see if they are different. Then try 2; this will attempt to modify and install, but will most likely error (this is where most of the AIs end up going off base if it errors):
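The kernel-versus-module comparison described here can also be scripted. This is a minimal sketch, not something from the original post: the helper name nvidia_built_for is hypothetical, and it assumes dkms status prints lines of the form "nvidia/<version>, <kernel>, <arch>: installed" (older releases print "nvidia, <version>, <kernel>, ..."; both are handled).

```shell
#!/bin/sh
# Succeeds if the given DKMS status text lists an nvidia module marked
# "installed" for the given kernel version.
# ASSUMPTION: hypothetical helper; the dkms output format is as described above.
nvidia_built_for() {
    kernel=$1
    status_text=$2
    printf '%s\n' "$status_text" \
        | grep -i '^nvidia' \
        | grep -F -- ", $kernel," \
        | grep -q 'installed'
}

running_kernel=$(uname -r)
if command -v dkms >/dev/null 2>&1; then
    dkms_text=$(dkms status 2>/dev/null)
else
    dkms_text=""
fi

if nvidia_built_for "$running_kernel" "$dkms_text"; then
    echo "nvidia module is built for $running_kernel"
else
    echo "no nvidia module built for $running_kernel (mismatch, or not a DKMS install)"
fi
```

A "no nvidia module built" result on a machine that does use DKMS is the situation where the cards vanish from nvidia-smi after a kernel or driver upgrade.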
THE FIX FOR ME
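The exact commands for this fix are not preserved here. Going by the description earlier in this comment (update the headers, then rebuild), a hedged sketch for Ubuntu/Debian might look like the following. It assumes a DKMS-packaged driver; the run helper and the APPLY variable are hypothetical names, and the script only prints the commands unless APPLY=1 is set, in keeping with the warning above about knowing how to boot a previous kernel.

```shell
#!/bin/sh
# Prints each command; executes it only when APPLY=1 is set.
# ASSUMPTION: Ubuntu/Debian with the NVIDIA driver installed via DKMS.
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

kernel=$(uname -r)

# Headers for the running kernel, needed to build the module.
run sudo apt install "linux-headers-$kernel" &&
# Rebuild and install all DKMS-registered modules for the running kernel.
run sudo dkms autoinstall &&
# Refresh the initramfs so the rebuilt module is picked up at boot.
run sudo update-initramfs -u -k "$kernel"
status=$?

if [ "${APPLY:-0}" = "1" ] && [ "$status" -eq 0 ]; then
    echo "Finished. Reboot, then check nvidia-smi again."
elif [ "${APPLY:-0}" != "1" ]; then
    echo "Dry run only. Review the commands above, then rerun with APPLY=1."
fi
```

Running it once without APPLY=1 lets you read exactly what would be executed before anything on the system changes.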
I am not a fan of Docker and rarely use it, but I understand why it is used for a lot of rollouts and end users. The Docker part is what ended up impacting me: I thought I had done everything I could beforehand to pin the drivers and toolkit so the script would succeed and skip the NVIDIA install, but I must have missed something. Please feel free to reach out. Like I said, I have been wrestling with Linux, NVIDIA, and all of this for a while now, and when it works it's great; it just needs a little duct tape now and then ;-). -Kreed |
-
OS Ubuntu 24.04 LTS fresh install
System:
Ryzen 5 4600G
RTX 3060 12GB
Asrock B550 Riptide
16GB Memory
1 TB NVME SSD
After the OS install I ran the obligatory sudo apt update and sudo apt upgrade commands.
Ran the command to install Nomad from GitHub, and after this the fun began.
Got an error about 'curl' not being a recognized command, so I installed that.
Then "GPU not being detected" showed up. At this point I enlisted Gemini Pro, and after that it was an hour of: try this... nope, didn't work; now let's try this... nope, didn't work. A lot of back and forth: try something, paste the error, rinse, repeat.
At this point I am considering either an older version of Ubuntu or Debian.
Anybody else hit these snags?