This work is made available under the MIT License. To view a copy of this license, see the `LICENSE file <https://github.com/NVlabs/gbrl/blob/master/LICENSE>`__.
GBRL supports learning rate scheduling to control the learning rate throughout training. Two schedulers are available:

- **Constant** (default): Fixed learning rate throughout training
- **Linear**: Linearly interpolates between an initial and final learning rate

.. note::

   Linear scheduler on GPU is only supported for oblivious trees (``grow_policy='oblivious'``).

Constant Scheduler (Default)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   # Constant learning rate (default behavior)
   optimizer = {
       'algo': 'SGD',
       'lr': 0.1,  # Fixed learning rate
       'start_idx': 0,
       'stop_idx': out_dim
   }


Linear Scheduler
~~~~~~~~~~~~~~~~

The linear scheduler interpolates the learning rate from ``lr`` (initial) to ``stop_lr`` (final) over ``T`` trees:

.. math::

   lr_t = lr + \frac{t}{T} \times (stop\_lr - lr)

where :math:`t` is the current tree index (0-indexed, so :math:`t \in [0, T-1]`). The schedule covers trees 0 through T-1, and at tree T and beyond, the learning rate is held constant at ``stop_lr``. This means:

- At tree 0: :math:`lr_0 = lr` (initial learning rate)
- At tree T-1: :math:`lr_{T-1} = lr + \frac{T-1}{T} \times (stop\_lr - lr)` (approaching the final learning rate)
- At tree T and beyond: :math:`lr_t = stop\_lr` (held constant)

**Edge Case (T=1):** When ``T=1``, the schedule contains only tree 0, which uses ``lr`` (since :math:`lr_0 = lr + \frac{0}{1} \times (stop\_lr - lr) = lr`). The interpolation phase is skipped, so tree 1 and all subsequent trees immediately use ``stop_lr``.

**Parameter Constraints:**

- ``T`` must be a positive integer (minimum 1). It should equal the number of trees you expect to build.

- ``lr`` and ``stop_lr`` must be positive floats. ``stop_lr`` can be greater than ``lr`` for warming schedules.
- At tree T and for all subsequent trees, the scheduler holds at ``stop_lr``.

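As a sanity check, the schedule can be reproduced in plain Python (``linear_lr`` below is an illustrative helper, not part of the GBRL API):

.. code-block:: python

   def linear_lr(t: int, lr: float, stop_lr: float, T: int) -> float:
       """Learning rate applied to tree index t under the linear schedule."""
       if t >= T:
           return stop_lr  # held constant once the schedule is exhausted
       return lr + (t / T) * (stop_lr - lr)

   # Decay from 0.1 to 0.01 over T=100 trees
   print(linear_lr(0, 0.1, 0.01, 100))    # tree 0 starts at lr = 0.1
   print(linear_lr(100, 0.1, 0.01, 100))  # tree T onward holds stop_lr = 0.01

   # Edge case T=1: tree 0 uses lr, every later tree uses stop_lr
   print(linear_lr(0, 0.1, 0.01, 1))      # 0.1
   print(linear_lr(1, 0.1, 0.01, 1))      # 0.01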
.. code-block:: python

   # Linear learning rate decay from 0.1 to 0.01 over 100 trees
   optimizer = {
       'algo': 'SGD',
       'lr': 0.1,             # Initial learning rate
       'stop_lr': 0.01,       # Final learning rate
       'T': 100,              # Number of trees for the schedule
       'scheduler': 'Linear',
       'start_idx': 0,
       'stop_idx': out_dim
   }

   tree_struct = {
       'max_depth': 4,
       'n_bins': 256,
       'min_data_in_leaf': 0,
       'par_th': 2,
       'grow_policy': 'oblivious'  # Required for GPU linear scheduler
   }

   gbt_model = GBTModel(
       input_dim=input_dim,
       output_dim=out_dim,
       tree_struct=tree_struct,
       optimizers=optimizer,
       params=gbrl_params,
       verbose=1,
       device=device
   )


Monotonic Constraints
---------------------

Monotonic constraints enforce that the model output is monotonically increasing or decreasing with respect to specific input features. This is useful for incorporating domain knowledge or ensuring interpretable behavior.

.. note::

   Monotonic constraints are only supported for **oblivious trees** (``grow_policy='oblivious'``).
   Constraints apply to the output dimensions defined by ``start_idx`` to ``stop_idx-1`` in the
   optimizer configuration. For ``GBTModel``, this typically covers all outputs. For actor-critic
   models, constraints affect only the policy outputs (not value function outputs).

**How Constraints Are Enforced:**

Monotonic constraints are enforced through two mechanisms:

1. **During tree growing:** Incompatible splits are rejected or pruned to preserve monotonicity.
   The constraint-aware scoring function pools the left and right child means when a split
   would violate the monotonic ordering, effectively reducing the score of such splits.

2. **After each tree is built:** Gradient-based updates that would violate constraints are
   projected or clipped by the optimizer for the affected output indices (``start_idx`` to
   ``stop_idx-1``). A pool-adjacent-violators (PAVA) algorithm is applied to ensure leaf
   values respect the specified monotonic ordering.

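The PAVA step can be illustrated with a minimal pure-Python sketch of the standard algorithm (an illustration only, not GBRL's actual C++/CUDA implementation). Adjacent values that violate an increasing ordering are pooled and replaced by their mean:

.. code-block:: python

   def pava_increasing(values):
       """Smallest mean-pooling change that makes `values` non-decreasing."""
       blocks = []  # each block is [sum, count]
       for v in values:
           blocks.append([v, 1])
           # merge backwards while the previous block's mean exceeds this one's
           while (len(blocks) > 1 and
                  blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
               s, c = blocks.pop()
               blocks[-1][0] += s
               blocks[-1][1] += c
       out = []
       for s, c in blocks:
           out.extend([s / c] * c)
       return out

   print(pava_increasing([1.0, 3.0, 2.0, 4.0]))  # [1.0, 2.5, 2.5, 4.0]

A decreasing constraint is handled symmetrically, for example by negating the values, pooling, and negating back.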
**Practical Trade-offs:**

- Split search may be slower due to constraint checking and mean pooling during scoring
- Convergence may be affected for ``GBTModel`` and actor-critic models (policy outputs only)
  since some gradient directions are restricted
- The constraint projection ensures predictions are monotonic but may result in a suboptimal
  fit compared to unconstrained models

Setting Monotonic Constraints
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Constraints are specified as a dictionary mapping feature indices to constraint specifications:

.. code-block:: python

   constraints = {
       feature_index: (direction, output_indices),
       ...
   }

Where:

- ``feature_index``: The input feature to constrain (int or numpy integer type)
- ``direction``: The constraint direction, written in any of three equivalent forms:
  ``'increasing'``, ``'+'``, or ``1`` for increasing constraints, and
  ``'decreasing'``, ``'-'``, or ``-1`` for decreasing constraints.
- ``output_indices``: A single int or a list of output dimensions to apply the constraint to

Only specify the features you want to constrain; unlisted features are unconstrained.
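The three equivalent forms per direction can be reduced to a single sign with a small helper (``normalize_direction`` is a hypothetical illustration, not part of the GBRL API):

.. code-block:: python

   def normalize_direction(direction):
       """Map any accepted direction spec to +1 (increasing) or -1 (decreasing)."""
       if direction in ('increasing', '+', 1):
           return 1
       if direction in ('decreasing', '-', -1):
           return -1
       raise ValueError(f"unknown constraint direction: {direction!r}")

   print(normalize_direction('increasing'), normalize_direction('-'))  # 1 -1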
.. code-block:: python

   from gbrl.models import GBTModel

   input_dim = 4
   out_dim = 2

   # Feature 0: increasing for output 0
   # Feature 1: decreasing for outputs 0 and 1
   # Features 2-3: no constraints (not listed)
   monotonic_constraints = {
       0: ("increasing", 0),
       1: ("decreasing", [0, 1]),
   }

   tree_struct = {
       'max_depth': 4,
       'n_bins': 256,
       'min_data_in_leaf': 0,
       'par_th': 2,
       'grow_policy': 'oblivious'  # Required for monotonic constraints
   }

   # Specify which output dimensions to optimize
   # (constraints apply to indices start_idx to stop_idx-1)
   optimizer = {
       'algo': 'SGD',
       'lr': 0.1,
       'start_idx': 0,
       'stop_idx': out_dim
   }

   gbrl_params = {
       'split_score_func': 'Cosine',
       'generator_type': 'Quantile'
   }

   gbt_model = GBTModel(
       input_dim=input_dim,
       output_dim=out_dim,
       tree_struct=tree_struct,
       optimizers=optimizer,
       params=gbrl_params,
       monotonic_constraints=monotonic_constraints,
       verbose=1,
       device='cuda'  # GPU supported for oblivious trees
   )


Combining Schedulers and Constraints
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Monotonic constraints and linear schedulers can be used together:
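For example, merging the two configurations above (a sketch that reuses only settings shown earlier; pass these dictionaries to ``GBTModel`` exactly as in the previous examples):

.. code-block:: python

   out_dim = 2

   # One optimizer carries both the linear schedule and the constrained output range
   optimizer = {
       'algo': 'SGD',
       'lr': 0.1,             # Initial learning rate
       'stop_lr': 0.01,       # Final learning rate
       'T': 100,              # Number of trees for the schedule
       'scheduler': 'Linear',
       'start_idx': 0,
       'stop_idx': out_dim    # constraints apply to outputs 0 to stop_idx-1
   }

   tree_struct = {
       'max_depth': 4,
       'n_bins': 256,
       'min_data_in_leaf': 0,
       'par_th': 2,
       'grow_policy': 'oblivious'  # Required for both features
   }

   monotonic_constraints = {
       0: ("increasing", 0),
       1: ("decreasing", [0, 1]),
   }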