Commit 1b1e631

Monotonic Constraints (#23)
* stash changes
* added compression
* added compression
* removed mistake
* code refactoring
* adding cpu monotonic constraints
* adding cuda support
* fixing mono constraints
* finalized changes
* added serialization tests
* fixed parsing error
* fixing minor mistakes
* Fix remaining code review issues: unused variable and f-string
* Add docstrings and convert empty test to skipTest
  - Add docstrings to setUpClass/setUp methods for coverage
  - Convert test_monotonic_mixed_dataset_categorical_rejection to skipTest
  - Improve existing docstrings with more detail
* added compression
* fixing bugs
* fixed bugs
* fixed issues
* fixed CR issues
* fixing CR
* final change
* fixed latest CR issues
* fixed CR
* forgot
* fixed error in tests
* fixing more CR suggestions
1 parent 0548ac1 commit 1b1e631

78 files changed

+8568
-1324
lines changed

.coderabbit.yaml

Lines changed: 10 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,10 @@
1+
# Repository-specific override
2+
reviews:
3+
  profile: "assertive"
4+
  auto_review:
5+
    enabled: true
6+
    auto_incremental_review: true
7+
    drafts: false
8+
    ignore_title_keywords:
9+
      - "WIP"
10+
      - "DO NOT MERGE"

.gitignore

Lines changed: 2 additions & 1 deletion
@@ -18,4 +18,5 @@ docs/_build/
1818
*.whl
1919
*ncu-rep
2020
*.sh
21-
local_binaries/
21+
local_binaries/
22+
tmp/

LICENSE

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
MIT License
22

3-
Copyright (c) 2024-2025, NVIDIA Corporation. All rights reserved.
3+
Copyright (c) 2024-2026, NVIDIA Corporation. All rights reserved.
44

55
Permission is hereby granted, free of charge, to any person obtaining a
66
copy of this software and associated documentation files (the "Software"),

README.md

Lines changed: 5 additions & 2 deletions
@@ -54,12 +54,15 @@ For a detailed usage example, see `tutorial.ipynb`
5454
- MultiRMSE loss (only)
5555
- Categorical inputs
5656
- Input feature weights - (CPU/GPU)
57+
- Monotonic constraints - (CPU/GPU, policy only)
58+
5759
### GBT Inference
60+
5861
- SGD optimizer - (CPU/GPU)
5962
- ADAM optimizer - (CPU only)
6063
- Control Variates (gradient variance reduction technique) - (CPU only)
6164
- Shared Tree for policy and value function - (CPU/GPU)
62-
- Linear and constant learning rate scheduler - (CPU/GPU only constant)
65+
- Linear and constant learning rate schedulers - (CPU/GPU; on GPU, the linear scheduler is supported only for oblivious trees)
6366
- Support for up to two different optimizers (e.g., policy/value) - (CPU/GPU if both are SGD)
6467
- SHAP value calculation
6568

@@ -81,7 +84,7 @@ url={https://arxiv.org/abs/2407.08250}
8184
}
8285
```
8386
# Licenses
84-
Copyright © 2024-2025, NVIDIA Corporation. All rights reserved.
87+
Copyright © 2024-2026, NVIDIA Corporation. All rights reserved.
8588

8689
This work is made available under the MIT License. Click [here](https://github.com/NVlabs/gbrl/blob/master/LICENSE) to view a copy of this license.
8790

docs/conf.py

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ def __getattr__(cls, name):
2929

3030
sys.modules['gbrl.gbrl_cpp'] = Mock()
3131
project = 'GBRL'
32-
copyright = '2024-2025, NVIDIA Corporation'
32+
copyright = '2024-2026, NVIDIA Corporation'
3333
author = 'Benjamin Fuhrer, Chen Tessler, Gal Dalal'
3434
release = __version__
3535
version = "master (" + __version__ + " )"

docs/examples.rst

Lines changed: 221 additions & 1 deletion
@@ -338,4 +338,224 @@ SHAP values are calculated internally and can be plotted using the `SHAP library
338338
shap.plots.bar(explainable_values_action_2, ax=ax)
339339
ax.set_title("SHAP values Action 2")
340340
341-
plt.show()
341+
plt.show()
342+
343+
Learning Rate Schedulers
344+
------------------------
345+
GBRL supports learning rate scheduling to control the learning rate throughout training. Two schedulers are available:
346+
347+
- **Constant** (default): Fixed learning rate throughout training
348+
- **Linear**: Linearly interpolates between an initial and final learning rate
349+
350+
.. note::
351+
352+
Linear scheduler on GPU is only supported for oblivious trees (``grow_policy='oblivious'``).
353+
354+
Constant Scheduler (Default)
355+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
356+
357+
.. code-block:: python
358+
359+
# Constant learning rate (default behavior)
360+
optimizer = {
361+
'algo': 'SGD',
362+
'lr': 0.1, # Fixed learning rate
363+
'start_idx': 0,
364+
'stop_idx': out_dim
365+
}
366+
367+
Linear Scheduler
368+
~~~~~~~~~~~~~~~~
369+
370+
The linear scheduler interpolates the learning rate from ``lr`` (initial) to ``stop_lr`` (final) over ``T`` trees:
371+
372+
.. math::
373+
374+
lr_t = lr + \frac{t}{T} \times (stop\_lr - lr)
375+
376+
where :math:`t` is the 0-indexed tree index. The interpolation covers trees 0 through T-1 (:math:`t \in [0, T-1]`); from tree T onward, the learning rate is held constant at ``stop_lr``. This means:
377+
378+
- At tree 0: :math:`lr_0 = lr` (initial learning rate)
379+
- At tree T-1: :math:`lr_{T-1} = lr + \frac{T-1}{T} \times (stop\_lr - lr)` (approaching final learning rate)
380+
- At tree T and beyond: :math:`lr_t = stop\_lr` (held constant)
381+
382+
**Edge Case (T=1):** When ``T=1``, the schedule contains only tree 0 which uses ``lr`` (since :math:`lr_0 = lr + 0/1 \times (stop\_lr - lr) = lr`). The interpolation phase is skipped, so tree 1 and all subsequent trees immediately use ``stop_lr``.
383+
384+
**Parameter Constraints:**
385+
386+
- ``T`` must be a positive integer (minimum 1). It should equal the number of trees you expect to build.
387+
- ``lr`` and ``stop_lr`` must be positive floats. ``stop_lr`` can be greater than ``lr`` to obtain a warm-up schedule.
388+
- At tree T and for all subsequent trees, the scheduler holds at ``stop_lr``.
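
As a sanity check on the schedule described above, the rule can be written as a small standalone function. This is a minimal sketch that follows the documented formula only; ``linear_lr`` is a hypothetical helper name and is not part of the GBRL API.

```python
def linear_lr(t: int, lr: float, stop_lr: float, T: int) -> float:
    """Learning rate for 0-indexed tree t under the documented linear schedule.

    Hypothetical helper for illustration; not a GBRL function.
    """
    if T < 1:
        raise ValueError("T must be a positive integer")
    if t >= T:
        # From tree T onward the rate is held at stop_lr.
        return stop_lr
    # Interpolation phase: trees 0 .. T-1.
    return lr + (t / T) * (stop_lr - lr)


# Decay from 0.1 to 0.01 over T=100 trees, sampled at a few tree indices.
schedule = [linear_lr(t, lr=0.1, stop_lr=0.01, T=100) for t in (0, 50, 99, 100, 250)]
```

With ``T=1`` the same function returns ``lr`` for tree 0 and ``stop_lr`` for every later tree, which matches the edge case described above.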
389+
390+
.. code-block:: python
391+
392+
# Linear learning rate decay from 0.1 to 0.01 over 100 trees
393+
optimizer = {
394+
'algo': 'SGD',
395+
'lr': 0.1, # Initial learning rate
396+
'stop_lr': 0.01, # Final learning rate
397+
'T': 100, # Number of trees for the schedule
398+
'scheduler': 'Linear',
399+
'start_idx': 0,
400+
'stop_idx': out_dim
401+
}
402+
403+
tree_struct = {
404+
'max_depth': 4,
405+
'n_bins': 256,
406+
'min_data_in_leaf': 0,
407+
'par_th': 2,
408+
'grow_policy': 'oblivious' # Required for GPU linear scheduler
409+
}
410+
411+
gbt_model = GBTModel(
412+
input_dim=input_dim,
413+
output_dim=out_dim,
414+
tree_struct=tree_struct,
415+
optimizers=optimizer,
416+
params=gbrl_params,
417+
verbose=1,
418+
device=device
419+
)
420+
421+
Monotonic Constraints
422+
---------------------
423+
Monotonic constraints enforce that the model output is monotonically increasing or decreasing with respect to specific input features. This is useful for incorporating domain knowledge or ensuring interpretable behavior.
424+
425+
.. note::
426+
427+
Monotonic constraints are only supported for **oblivious trees** (``grow_policy='oblivious'``).
428+
Constraints apply to the output dimensions defined by ``start_idx`` to ``stop_idx-1`` in the
429+
optimizer configuration. For ``GBTModel``, this typically covers all outputs. For actor-critic
430+
models, constraints affect only the policy outputs (not value function outputs).
431+
432+
**How Constraints Are Enforced:**
433+
434+
Monotonic constraints are enforced through two mechanisms:
435+
436+
1. **During tree growing:** Incompatible splits are rejected or pruned to preserve monotonicity.
437+
The constraint-aware scoring function pools the left and right child means when a split
438+
would violate the monotonic ordering, effectively reducing the score of such splits.
439+
440+
2. **After each tree is built:** Gradient-based updates that would violate constraints are
441+
projected or clipped by the optimizer for the affected output indices (``start_idx`` to
442+
``stop_idx-1``). A pool-adjacent-violators (PAVA) algorithm is applied to ensure leaf
443+
values respect the specified monotonic ordering.
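
To make the post-build step concrete, here is a generic pool-adjacent-violators sketch in NumPy for a one-dimensional sequence of leaf values ordered along the constrained feature. It illustrates the textbook algorithm only. The names ``pava_increasing`` and ``pava_decreasing``, the optional weights, and the 1-D layout are assumptions made for illustration and do not describe GBRL's internal C++/CUDA implementation.

```python
import numpy as np


def pava_increasing(values, weights=None):
    """Weighted least-squares projection of `values` onto non-decreasing sequences.

    Illustrative sketch; weights are assumed to be strictly positive.
    """
    values = np.asarray(values, dtype=float)
    if values.size == 0:
        return values
    weights = np.ones_like(values) if weights is None else np.asarray(weights, dtype=float)
    # Each block stores [weighted mean, total weight, number of pooled entries].
    blocks = []
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        # Merge backwards while the last two blocks violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    # Expand pooled blocks back to one value per original entry.
    return np.concatenate([np.full(n, m) for m, _, n in blocks])


def pava_decreasing(values, weights=None):
    """Decreasing constraint: negate, project as increasing, negate back."""
    return -pava_increasing(-np.asarray(values, dtype=float), weights)


leaf_values = [1.0, 3.0, 2.0, 4.0]
projected = pava_increasing(leaf_values)  # the adjacent violators 3.0 and 2.0 are pooled
```

Pooling replaces each run of violating neighbours with their weighted mean, which is the same idea the constraint-aware split score uses when it pools the left and right child means.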
444+
445+
**Practical Trade-offs:**
446+
447+
- Split search may be slower due to constraint checking and mean pooling during scoring
448+
- Convergence may be affected for ``GBTModel`` and actor-critic models (policy outputs only)
449+
since some gradient directions are restricted
450+
- The constraint projection ensures predictions are monotonic but may result in suboptimal
451+
fit compared to unconstrained models
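
One way to check the monotonicity claim empirically is to sweep a single feature over a grid while holding the others fixed, then inspect the sign of successive output differences. The sketch below assumes only a generic ``predict`` callable that maps an ``(n, input_dim)`` array to an ``(n, out_dim)`` array. Both ``is_monotonic`` and ``toy_predict`` are hypothetical; the toy function stands in for a trained model.

```python
import numpy as np


def is_monotonic(predict, x_ref, feature, output, direction, grid, tol=1e-8):
    """Sweep `feature` over `grid` at a fixed reference point and test the ordering."""
    grid = np.sort(np.asarray(grid, dtype=float))
    # Repeat the reference point once per grid value, then vary one column.
    X = np.tile(np.asarray(x_ref, dtype=float), (len(grid), 1))
    X[:, feature] = grid
    y = np.asarray(predict(X))[:, output]
    diffs = np.diff(y)
    if direction == "increasing":
        return bool(np.all(diffs >= -tol))
    if direction == "decreasing":
        return bool(np.all(diffs <= tol))
    raise ValueError(f"unknown direction: {direction!r}")


def toy_predict(X):
    # Stand-in for a trained model: output 0 rises with feature 0,
    # output 1 falls with feature 1.
    return np.stack([np.tanh(X[:, 0]), -X[:, 1] ** 3], axis=1)


x_ref = np.zeros(4)
grid = np.linspace(-2.0, 2.0, 50)
ok_inc = is_monotonic(toy_predict, x_ref, feature=0, output=0, direction="increasing", grid=grid)
ok_dec = is_monotonic(toy_predict, x_ref, feature=1, output=1, direction="decreasing", grid=grid)
```

A single sweep only tests one reference point, so a fuller check would repeat it over several values of ``x_ref``.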
452+
453+
Setting Monotonic Constraints
454+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
455+
456+
Constraints are specified as a dictionary mapping feature indices to constraint specifications:
457+
458+
.. code-block:: python
459+
460+
constraints = {
461+
feature_index: (direction, output_indices),
462+
...
463+
}
464+
465+
Where:
466+
467+
- ``feature_index``: The input feature to constrain (int or numpy integer type)
468+
- ``direction``: The constraint direction. All three forms are accepted as valid inputs for each direction:
469+
``'increasing'``, ``'+'``, or ``1`` for increasing constraints, and
470+
``'decreasing'``, ``'-'``, or ``-1`` for decreasing constraints.
471+
- ``output_indices``: Single int or list of output dimensions to apply the constraint to
472+
473+
Only specify the features you want to constrain - unlisted features have no constraints.
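
The accepted forms can be illustrated with a small normalization helper that maps every documented spelling to ``+1`` or ``-1`` and wraps a single output index in a list. This is a sketch of the input convention only; ``normalize_constraints`` is a hypothetical name, and GBRL's own validation may differ.

```python
import numbers

# Every documented spelling of a direction, mapped to a sign.
_DIRECTIONS = {"increasing": 1, "+": 1, 1: 1, "decreasing": -1, "-": -1, -1: -1}


def normalize_constraints(constraints):
    """Return {feature_index: (sign, [output indices])} from the documented forms.

    Hypothetical helper for illustration; not a GBRL function.
    """
    normalized = {}
    for feature, (direction, outputs) in constraints.items():
        # numbers.Integral also covers NumPy integer types.
        if not isinstance(feature, numbers.Integral):
            raise TypeError(f"feature index must be an integer, got {feature!r}")
        if direction not in _DIRECTIONS:
            raise ValueError(f"unknown direction: {direction!r}")
        if isinstance(outputs, numbers.Integral):
            outputs = [outputs]  # a single int means one output dimension
        normalized[int(feature)] = (_DIRECTIONS[direction], [int(o) for o in outputs])
    return normalized


spec = normalize_constraints({0: ("increasing", 0), 1: ("-", [0, 1])})
```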
474+
475+
.. code-block:: python
476+
477+
from gbrl.models import GBTModel
478+
479+
input_dim = 4
480+
out_dim = 2
481+
482+
# Feature 0: increasing for output 0
483+
# Feature 1: decreasing for outputs 0 and 1
484+
# Features 2-3: no constraints (not listed)
485+
monotonic_constraints = {
486+
0: ("increasing", 0),
487+
1: ("decreasing", [0, 1]),
488+
}
489+
490+
tree_struct = {
491+
'max_depth': 4,
492+
'n_bins': 256,
493+
'min_data_in_leaf': 0,
494+
'par_th': 2,
495+
'grow_policy': 'oblivious' # Required for monotonic constraints
496+
}
497+
498+
# Specify which output dimensions to optimize (constraints apply to indices start_idx to stop_idx-1)
499+
optimizer = {
500+
'algo': 'SGD',
501+
'lr': 0.1,
502+
'start_idx': 0,
503+
'stop_idx': out_dim # constraints apply to output indices 0 to stop_idx-1
504+
}
505+
506+
gbrl_params = {
507+
'split_score_func': 'Cosine',
508+
'generator_type': 'Quantile'
509+
}
510+
511+
gbt_model = GBTModel(
512+
input_dim=input_dim,
513+
output_dim=out_dim,
514+
tree_struct=tree_struct,
515+
optimizers=optimizer,
516+
params=gbrl_params,
517+
monotonic_constraints=monotonic_constraints,
518+
verbose=1,
519+
device='cuda' # GPU supported for oblivious trees
520+
)
521+
522+
Combining Schedulers and Constraints
523+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
524+
525+
Monotonic constraints and linear schedulers can be used together:
526+
527+
.. code-block:: python
528+
529+
monotonic_constraints = {
530+
0: ("increasing", 0),
531+
1: ("decreasing", [0, 1]),
532+
}
533+
534+
optimizer = {
535+
'algo': 'SGD',
536+
'lr': 0.1,
537+
'stop_lr': 0.01,
538+
'T': 100,
539+
'scheduler': 'Linear',
540+
'start_idx': 0,
541+
'stop_idx': out_dim
542+
}
543+
544+
tree_struct = {
545+
'max_depth': 4,
546+
'n_bins': 256,
547+
'min_data_in_leaf': 0,
548+
'par_th': 2,
549+
'grow_policy': 'oblivious'
550+
}
551+
552+
gbt_model = GBTModel(
553+
input_dim=input_dim,
554+
output_dim=out_dim,
555+
tree_struct=tree_struct,
556+
optimizers=optimizer,
557+
params=gbrl_params,
558+
monotonic_constraints=monotonic_constraints,
559+
verbose=1,
560+
device='cuda'
561+
)

docs/index.rst

Lines changed: 23 additions & 0 deletions
@@ -7,6 +7,29 @@ Welcome to GBRL's documentation!
77
================================
88
GBRL is a Python-based Gradient Boosting Trees (GBT) library, similar to popular packages such as `XGBoost <https://xgboost.readthedocs.io/en/stable/>`__ and `CatBoost <https://catboost.ai/>`__, but specifically designed and optimized for reinforcement learning (RL). GBRL is implemented in C++/CUDA and is designed to integrate seamlessly with popular RL libraries.
99

10+
Feature Support Matrix
11+
----------------------
12+
13+
The following table summarizes feature availability by tree type and device:
14+
15+
+----------------------------+---------------------+-------------------+---------------------+-------------------+
16+
| Feature | Greedy CPU | Greedy GPU | Oblivious CPU | Oblivious GPU |
17+
+============================+=====================+===================+=====================+===================+
18+
| Tree Fitting |||||
19+
+----------------------------+---------------------+-------------------+---------------------+-------------------+
20+
| Monotonic Constraints ||| ✓ (policy only) | ✓ (policy only) |
21+
+----------------------------+---------------------+-------------------+---------------------+-------------------+
22+
| Linear LR Scheduler |||||
23+
+----------------------------+---------------------+-------------------+---------------------+-------------------+
24+
| Constant LR Scheduler |||||
25+
+----------------------------+---------------------+-------------------+---------------------+-------------------+
26+
| ADAM Optimizer |||||
27+
+----------------------------+---------------------+-------------------+---------------------+-------------------+
28+
| SGD Optimizer |||||
29+
+----------------------------+---------------------+-------------------+---------------------+-------------------+
30+
| Control Variates |||||
31+
+----------------------------+---------------------+-------------------+---------------------+-------------------+
32+
1033
.. toctree::
1134
:maxdepth: 2
1235
:caption: User Guide:

gbrl/__init__.py

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
11
##############################################################################
2-
# Copyright (c) 2024-2025, NVIDIA Corporation. All rights reserved.
2+
# Copyright (c) 2024-2026, NVIDIA Corporation. All rights reserved.
33
#
44
# Permission is hereby granted, free of charge, to any person obtaining a
55
# copy of this software and associated documentation files (the "Software"),
@@ -19,7 +19,7 @@
1919
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
2020
# DEALINGS IN THE SOFTWARE.
2121
##############################################################################
22-
__version__ = "1.1.6"
22+
__version__ = "1.1.7"
2323

2424
import importlib.util
2525
import os

gbrl/common/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
11
##############################################################################
2-
# Copyright (c) 2024-2025, NVIDIA Corporation. All rights reserved.
2+
# Copyright (c) 2024-2026, NVIDIA Corporation. All rights reserved.
33
#
44
# Permission is hereby granted, free of charge, to any person obtaining a
55
# copy of this software and associated documentation files (the "Software"),
