I was trying to fine-tune a flux model using the sd3-flux.1 branch, which I selected by adding the KOHYA_REF=sd3-flux.1 environment variable.
The container starts, but training fails immediately with the following error: Could not load library libnvrtc.so.12. Error: libnvrtc.so.12: cannot open shared object file: No such file or directory
I checked the /usr/lib/x86_64-linux-gnu directory inside the container, and the libnvrtc.so and libnvrtc.so.12 files are missing from it for some reason.
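For anyone who wants to reproduce the check, here is a minimal sketch that looks for libnvrtc files in a few directories and asks the dynamic loader whether it can open libnvrtc.so.12. The directory list is my assumption about where CUDA libraries might live in an image (only /usr/lib/x86_64-linux-gnu is the one I actually checked), and the helper names are hypothetical.

```python
import ctypes
import glob
import os

# Assumed search locations; only the first one comes from the report above.
DEFAULT_DIRS = (
    "/usr/lib/x86_64-linux-gnu",
    "/usr/local/cuda/lib64",
)


def find_library_files(pattern, dirs=DEFAULT_DIRS):
    """Return the sorted paths under `dirs` whose file name matches `pattern`."""
    matches = []
    for directory in dirs:
        matches.extend(glob.glob(os.path.join(directory, pattern)))
    return sorted(matches)


def can_load(name):
    """Return True if the dynamic loader can open the shared object `name`."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    print("files found:", find_library_files("libnvrtc.so*"))
    print("loadable:", can_load("libnvrtc.so.12"))
```

Running this inside the container before and after any change shows whether the loader can actually resolve the library, rather than only whether a file exists in one directory.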
Then I tried mounting /usr/lib/x86_64-linux-gnu from the host as a volume, by changing my docker-compose file:
volumes:
  - aidock_workspace_dev:/workspace
  - /usr/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu  # added this line
Fine-tuning now starts, but some errors still appear during startup and training:
1.
ERROR: ld.so: object 'libtcmalloc.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
This one happens on startup and multiple times later.
2.
E0000 00:00:1731487428.843877 2536 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1731487428.847053 2536 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
This one happens when fine-tuning starts.
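The first error says that an LD_PRELOAD entry cannot be opened. A bind mount hides whatever the image had at the target path, so one possible cause (an assumption, not something I have confirmed) is that the container's own copy of libtcmalloc.so is no longer visible once the host directory is mounted over it. A small sketch to list which LD_PRELOAD entries the loader cannot open, using a hypothetical helper name:

```python
import ctypes
import os


def unloadable_preloads(ld_preload):
    """Return the LD_PRELOAD entries that the loader cannot open."""
    # ld.so accepts entries separated by spaces or colons.
    entries = ld_preload.replace(":", " ").split()
    missing = []
    for entry in entries:
        try:
            ctypes.CDLL(entry)
        except OSError:
            missing.append(entry)
    return missing


if __name__ == "__main__":
    print(unloadable_preloads(os.environ.get("LD_PRELOAD", "")))
```

Opening a library with ctypes is not identical to preloading it, so this only approximates the "cannot open shared object file" case reported above.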
It seems that mapping /usr/lib/x86_64-linux-gnu from the host is not the correct solution, and there has to be another way to fix the original error: Could not load library libnvrtc.so.12. Error: libnvrtc.so.12: cannot open shared object file: No such file or directory.
Any ideas on how to make it work?