The main issue seems to be that the main CUDA runtime library was not detected

### System Info

CUDA SETUP: Problem: The main issue seems to be that the main CUDA runtime library was not detected.
CUDA SETUP: Solution 1: To solve the issue the libcudart.so location needs to be added to the LD_LIBRARY_PATH variable
CUDA SETUP: Solution 1a): Find the cuda runtime library via: find / -name libcudart.so 2>/dev/null
CUDA SETUP: Solution 1b): Once the library is found add it to the LD_LIBRARY_PATH: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:FOUND_PATH_FROM_1a
CUDA SETUP: Solution 1c): For a permanent solution add the export from 1b into your .bashrc file, located at ~/.bashrc
CUDA SETUP: Solution 2: If no library was found in step 1a) you need to install CUDA.
CUDA SETUP: Solution 2a): Download CUDA install script: wget https://raw.githubusercontent.com/TimDettmers/bitsandbytes/main/cuda_install.sh
CUDA SETUP: Solution 2b): Install desired CUDA version to desired location. The syntax is bash cuda_install.sh CUDA_VERSION PATH_TO_INSTALL_INTO.
CUDA SETUP: Solution 2b): For example, "bash cuda_install.sh 113 ~/local/" will download CUDA 11.3 and install into the folder ~/local
CUDA SETUP: Setup Failed!
Traceback (most recent call last):
  File "/home/wangdonghua/anaconda3/envs/qwen-7b/lib/python3.8/runpy.py", line 185, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/home/wangdonghua/anaconda3/envs/qwen-7b/lib/python3.8/runpy.py", line 144, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/home/wangdonghua/anaconda3/envs/qwen-7b/lib/python3.8/runpy.py", line 111, in _get_module_details
    __import__(pkg_name)
  File "/home/wangdonghua/anaconda3/envs/qwen-7b/lib/python3.8/site-packages/bitsandbytes/__init__.py", line 6, in <module>
    from . import cuda_setup, utils, research
  File "/home/wangdonghua/anaconda3/envs/qwen-7b/lib/python3.8/site-packages/bitsandbytes/research/__init__.py", line 1, in <module>
    from . import nn
  File "/home/wangdonghua/anaconda3/envs/qwen-7b/lib/python3.8/site-packages/bitsandbytes/research/nn/__init__.py", line 1, in <module>
    from .modules import LinearFP8Mixed, LinearFP8Global
  File "/home/wangdonghua/anaconda3/envs/qwen-7b/lib/python3.8/site-packages/bitsandbytes/research/nn/modules.py", line 8, in <module>
    from bitsandbytes.optim import GlobalOptimManager
  File "/home/wangdonghua/anaconda3/envs/qwen-7b/lib/python3.8/site-packages/bitsandbytes/optim/__init__.py", line 6, in <module>
    from bitsandbytes.cextension import COMPILED_WITH_CUDA
  File "/home/wangdonghua/anaconda3/envs/qwen-7b/lib/python3.8/site-packages/bitsandbytes/cextension.py", line 20, in <module>
    raise RuntimeError('''
RuntimeError: 
        CUDA Setup failed despite GPU being available. Please run the following command to get more information:

        python -m bitsandbytes

        Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
        to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
        and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
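
The log's Solutions 1a) and 1b) (locate `libcudart.so`, then append its directory to `LD_LIBRARY_PATH`) can be sketched in Python as below. This is a minimal sketch, not part of bitsandbytes: the helper names `find_cudart_dirs` and `extend_ld_library_path` are hypothetical, and the search roots (`CONDA_PREFIX` and `/usr/local/cuda`) are assumptions about where a CUDA runtime might live on this machine. It matches versioned names such as `libcudart.so.11.0` as well, which a search for the exact name `libcudart.so` would miss.

```python
import os
from pathlib import Path


def find_cudart_dirs(roots):
    """Return sorted directories under `roots` that contain a libcudart.so* file."""
    found = set()
    for root in roots:
        root = Path(root)
        if not root.is_dir():
            continue  # skip roots that do not exist on this machine
        for dirpath, _dirnames, filenames in os.walk(root):
            # Match both libcudart.so and versioned names like libcudart.so.11.0
            if any(name.startswith("libcudart.so") for name in filenames):
                found.add(dirpath)
    return sorted(found)


def extend_ld_library_path(current, new_dirs):
    """Append new_dirs to a colon-separated path string, skipping duplicates."""
    parts = [p for p in current.split(":") if p]
    for d in new_dirs:
        if d not in parts:
            parts.append(d)
    return ":".join(parts)


if __name__ == "__main__":
    # Assumed search roots; adjust to the actual install locations.
    candidate_roots = [os.environ.get("CONDA_PREFIX", ""), "/usr/local/cuda"]
    dirs = find_cudart_dirs([r for r in candidate_roots if r])
    if dirs:
        new_value = extend_ld_library_path(os.environ.get("LD_LIBRARY_PATH", ""), dirs)
        # Print the export line from Solution 1b) instead of modifying the shell
        print(f"export LD_LIBRARY_PATH={new_value}")
    else:
        print("No libcudart.so* found under the candidate roots; see Solution 2.")
```

The printed `export` line only takes effect once it is run in the shell (or added to `~/.bashrc`, as in Solution 1c) before starting Python again.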

### Reproduction

import json
from datasets import Dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer, TrainingArguments, BitsAndBytesConfig
from peft import get_peft_model, LoraConfig, TaskType
import torch

# Step 2: 8-bit quantization config
bnb_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_threshold=6.0
)

# Step 3: Load the model
model_path = "/home/wangdonghua/Qwen/Qwen-14b/hub/qwen/Qwen2___5-14B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_path)

model = AutoModelForSequenceClassification.from_pretrained(
    model_path,
    num_labels=2,
    quantization_config=bnb_config,  # enable 8-bit
    device_map="auto",
    use_cache=False  # disable the cache to reduce GPU memory usage
)

# Step 4: LoRA config
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=4,
    lora_alpha=16,
    lora_dropout=0.1,
    use_rslora=True  # enable rsLoRA
)
model = get_peft_model(model, lora_config)

# Step 5: Tokenize the data
def preprocess_data(examples):
    return tokenizer(
        examples["text"],
        truncation=True,
        padding="max_length",
        max_length=256
    )

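# NOTE: `dataset` is not defined in this snippet; the data-loading step (Step 1) is missing from the report.
tokenized_dataset = dataset.map(preprocess_data, batched=True)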
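
Because the failure happens while importing bitsandbytes, before any of the script's own logic runs, a short environment check may help narrow things down. The sketch below is an assumption about what is useful to collect, not an official diagnostic: `environment_report` and `split_path_var` are hypothetical helpers. It relies only on `importlib.util.find_spec`, which checks for a top-level package without importing it, plus `torch.version.cuda` and `torch.cuda.is_available()` when PyTorch is present.

```python
import importlib.util
import os
import platform


def split_path_var(value):
    """Split a PATH-style variable into its non-empty entries."""
    return [p for p in value.split(os.pathsep) if p]


def environment_report(environ=None):
    """Collect basic facts relevant to a failed bitsandbytes CUDA setup."""
    env = os.environ if environ is None else environ
    report = {
        "python": platform.python_version(),
        "ld_library_path": split_path_var(env.get("LD_LIBRARY_PATH", "")),
        # find_spec does not import the package, so it cannot trigger the CUDA setup error
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "bitsandbytes_installed": importlib.util.find_spec("bitsandbytes") is not None,
    }
    if report["torch_installed"]:
        try:
            import torch

            report["torch_cuda_build"] = torch.version.cuda  # None for CPU-only builds
            report["cuda_available"] = torch.cuda.is_available()
        except Exception as exc:  # a broken install can fail in several ways
            report["torch_error"] = repr(exc)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

If `torch_cuda_build` is `None`, PyTorch itself was installed without CUDA support, which is a different problem from a missing `LD_LIBRARY_PATH` entry.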




### Expected behavior

The model should load in 8-bit without bitsandbytes failing CUDA setup. Any help would be appreciated.

The main issue seems to be that the main CUDA runtime library was not detected #1536
