[CUDA] "Warning: 'Using sparse features with CUDA is currently not supported' in LightGBM with GPU" #6725

@sgapple

Description

When attempting to use LightGBM with GPU support, I receive the warning message:

"Using sparse features with CUDA is currently not supported."

This occurs even though I am providing a dense dataset to the model. I believe this results in a fallback to CPU, which leads to slower training times.
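To rule out the input side, one quick sanity check is to measure the fraction of exact zeros per feature in the matrix being passed in. This is a minimal sketch in plain NumPy; the helper name is my own and the check does not reflect LightGBM's internal sparsity heuristics, it only confirms the data itself contains essentially no zeros:

```python
import numpy as np

def zero_fraction_per_feature(X: np.ndarray) -> np.ndarray:
    """Fraction of exactly-zero entries in each column of X."""
    return (X == 0).sum(axis=0) / X.shape[0]

# Uniform random data in [0, 1), as in the repro below: effectively no zeros.
rng = np.random.default_rng(42)
X = rng.random((1_000, 51), dtype=np.float32)
print(zero_fraction_per_feature(X).max())  # expect a value at or near 0.0
```

If every column reports a near-zero fraction, the data is genuinely dense and the warning is coming from somewhere other than the input matrix.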

Reproducible example

import numpy as np
from sklearn.model_selection import train_test_split
import lightgbm as lgb

# Set seed for reproducibility
np.random.seed(42)

n_samples = 500 * 10000  # 5,000,000 rows
n_features = 51

X = np.random.rand(n_samples, n_features).astype(np.float32)
y = np.random.rand(n_samples).astype(np.float32)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

X_train = np.ascontiguousarray(X_train, dtype=np.float32)
y_train = np.ascontiguousarray(y_train, dtype=np.float32)

new_lgb_train = lgb.Dataset(X_train, label=y_train)

cuda_params = {
    'objective': 'regression',
    'boosting_type': 'dart',
    'colsample_bytree': 0.7,
    'learning_rate': 0.01,
    'max_depth': 7,
    'subsample': 0.7,
    'n_jobs': 32,
    'num_leaves': 63,
    'verbose': 1,
    'device': 'cuda',
    'force_row_wise': True
}

gbm_cuda = lgb.train(cuda_params,
                     new_lgb_train,
                     num_boost_round=100)

gbm_cuda.predict(X_test)
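If the warning stems from LightGBM's sparse optimization being enabled by default rather than from the input format, it may be worth trying the documented `is_enable_sparse` parameter. Whether the CUDA tree learner keys this particular warning off that flag is an assumption on my part; the fragment below only shows the relevant settings, not a full params dict:

```python
# Variant of the params above: explicitly disable sparse optimization.
# 'is_enable_sparse' defaults to True in LightGBM's parameter docs.
cuda_params = {
    'objective': 'regression',
    'device': 'cuda',
    'is_enable_sparse': False,
}
```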

Environment info

Docker:

  • nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04
  • CUDA version: 12.4

Host:

  • Ubuntu 22.04
  • GPU: NVIDIA GeForce RTX 3090
  • CPU: Intel i9
  • Driver Version: 550.127.05

LightGBM version or commit hash: 4.5.0

Command(s) you used to install LightGBM

FROM nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04

RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        build-essential \
        curl \
        bzip2 \
        ca-certificates \
        libglib2.0-0 \
        libxext6 \
        libsm6 \
        libxrender1 \
        git \
        gnupg \
        swig \
        vim \
        mercurial \
        subversion \
        python3-dev \
        python3-pip \
        python3-setuptools \
        ocl-icd-opencl-dev \
        cmake \
        libboost-dev \
        libboost-system-dev \
        libboost-filesystem-dev \
        gcc \
        g++ && \
# Install Node.js 18.x
    curl -fsSL https://deb.nodesource.com/setup_18.x | bash - && \
    apt-get update && \
    apt-get install -y nodejs=18.20.4*


# Add OpenCL ICD files for LightGBM
RUN mkdir -p /etc/OpenCL/vendors && \
    echo "libnvidia-opencl.so.1" > /etc/OpenCL/vendors/nvidia.icd

RUN pip3 install --upgrade pip
RUN pip3 install --no-binary lightgbm --config-settings=cmake.define.USE_CUDA=ON lightgbm

Additional Comments

I've tried several installation methods, including building from a cloned source tree and installing via pip for both the GPU and CUDA versions. One method failed outright, while the other produced the same warning about sparse features, even with a dense dataset. I would appreciate your assistance, as I am trying to run an accurate performance comparison between different boosters. Thank you.
