Commit 98d9f4e

chore(examples): Use QuantileDMatrix for histogram tree method in XGBoost example (#3376)
Replace DMatrix with QuantileDMatrix in the distributed XGBoost training example when using the histogram ("hist") tree method. This reduces memory usage and aligns with XGBoost best practices for distributed workloads.

Fixes #3300

Signed-off-by: Suyash Nalawade <sunalawa@redhat.com>
1 parent 458ab31 commit 98d9f4e

1 file changed

Lines changed: 4 additions & 3 deletions

File tree

examples/xgboost/distributed-training/xgboost-distributed.ipynb

@@ -102,14 +102,15 @@
 " X, y, test_size=0.2, random_state=42,\n",
 " )\n",
 "\n",
-" # NOTE: DMatrix construction MUST be inside the communicator context\n",
+" # NOTE: QuantileDMatrix construction MUST be inside the communicator context\n",
 " # because it involves cross-worker synchronization for quantization.\n",
-" dtrain = xgb.DMatrix(X_train, label=y_train)\n",
-" dvalid = xgb.DMatrix(X_valid, label=y_valid)\n",
+" dtrain = xgb.QuantileDMatrix(X_train, label=y_train)\n",
+" dvalid = xgb.QuantileDMatrix(X_valid, label=y_valid, ref=dtrain)\n",
 "\n",
 " # Training parameters (for GPU training, add device=\"cuda\").\n",
 " params = {\n",
 " \"objective\": \"binary:logistic\",\n",
+" \"tree_method\": \"hist\",\n",
 " \"max_depth\": 6,\n",
 " \"eta\": 0.1,\n",
 " \"eval_metric\": \"logloss\",\n",
