Production-grade TreeSHAP implementation in Rust. Computes exact Shapley values for XGBoost, LightGBM, and ONNX tree ensemble models using the tree-path-dependent algorithm (Lundberg et al., 2020).
- Exact SHAP values via tree-path-dependent and interventional modes
- Sub-millisecond latency for single-sample inference (265 us, 100 trees, depth 6)
- Format-independent --- one engine handles XGBoost, LightGBM, and ONNX models
- Parallel by default --- adaptive Rayon parallelism over samples or trees
- Built-in visualization --- waterfall, beeswarm, and feature importance plots (SVG)
- Python bindings via PyO3 with zero-copy NumPy interop
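"Exact" in the list above refers to Shapley values, which satisfy local accuracy: the attributions for one sample sum to the difference between the model output and the baseline output. The sketch below illustrates that property on a toy two-feature tree by brute-force subset enumeration. It is not the TreeSHAP algorithm and not this crate's code: `toy_tree`, `shapley_values`, and the single-baseline value function are hypothetical names and choices made only for illustration.

```python
from itertools import combinations
from math import factorial


def toy_tree(z):
    """Hypothetical depth-2 regression tree over two features."""
    if z[0] < 0.5:
        return 1.0 if z[1] < 0.5 else 3.0
    return 5.0


def shapley_values(model, x, baseline):
    """Brute-force Shapley values against a single baseline row.

    Features outside the coalition S are taken from `baseline`
    (an illustrative value function, chosen for simplicity).
    Cost is exponential in the number of features.
    """
    n = len(x)

    def value(coalition):
        z = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Classic Shapley weight for coalitions of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


x = [0.9, 0.1]
baseline = [0.1, 0.9]
phi = shapley_values(toy_tree, x, baseline)
print(phi)  # [3.0, -1.0]

# Local accuracy: attributions sum to f(x) - f(baseline).
assert abs(sum(phi) - (toy_tree(x) - toy_tree(baseline))) < 1e-12
```

A check of this kind (attributions plus base value reproduce the prediction) is a common sanity test for SHAP output; whether `treeshap verify` performs exactly this test is an assumption here.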
treeshap explain --model model.json --format xgboost --data samples.csv --output shap.json
treeshap verify --explanation shap.json
treeshap plot waterfall --explanation shap.json --output waterfall.svg

use treeshap_io::parse_xgboost_json_from_path;
use treeshap_core::Explainer;
let ensemble = parse_xgboost_json_from_path("model.json")?;
let explainer = Explainer::new(&ensemble);
let explanation = explainer.explain(data.view());
assert!(explanation.verify().is_pass());

from treeshap import TreeEnsemble, ShapExplainer
model = TreeEnsemble.from_file("model.json", "xgboost")
explainer = ShapExplainer(model)
explanation = explainer.explain(X)
print(explanation.shap_values) # numpy array
svg = explanation.plot_waterfall() # SVG bytes

Measured on Apple M3 (ARM64, 8 cores), release profile with lto = "thin". Full methodology and external comparisons in docs/benchmarks.md.
| Configuration | Total | Per-sample |
|---|---|---|
| 1 sample, 100 trees, depth 6 | 265 us | 265 us |
| 100 samples, 100 trees, depth 6 | 23 ms | 0.23 ms |
| 1,000 samples, 100 trees, depth 6 | 153 ms | 0.15 ms |
| 10,000 samples, 100 trees, depth 6 | 2.8 s | 0.28 ms |
| Implementation | Trees | Total | Per-sample-per-tree |
|---|---|---|---|
| treeshap-rs | 100 | 2.8 s | 2.8 us |
| Python SHAP + XGBoost | 1,000 | 13.0 s | 1.3 us |
| Python SHAP + LightGBM | 1,000 | 42.8 s | 4.3 us |
treeshap-rs is about 1.5x faster than Python SHAP + LightGBM per sample per tree (2.8 us vs 4.3 us), with no Python runtime, no GIL, and deterministic memory usage. XGBoost's native C++ implementation remains ~2x faster per tree (1.3 us) due to deep integration with its internal tree format.
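The per-sample-per-tree column normalizes total time by both sample count and tree count, which is what makes the 100-tree and 1,000-tree runs comparable. The following sketch reproduces that arithmetic from the table totals. It assumes every row was measured on 10,000 samples, which is consistent with the published figures but not stated explicitly in the table; `per_sample_per_tree_us` is a hypothetical helper.

```python
# Totals (seconds) and tree counts copied from the comparison table.
# ASSUMPTION: all three runs used 10,000 samples.
N_SAMPLES = 10_000
runs = {
    "treeshap-rs": (2.8, 100),
    "Python SHAP + XGBoost": (13.0, 1_000),
    "Python SHAP + LightGBM": (42.8, 1_000),
}


def per_sample_per_tree_us(total_s, trees, samples=N_SAMPLES):
    """Microseconds spent per (sample, tree) pair."""
    return total_s / (samples * trees) * 1e6


cost = {name: per_sample_per_tree_us(*run) for name, run in runs.items()}
for name, us in cost.items():
    print(f"{name}: {us:.2f} us per sample per tree")

# Relative cost between implementations.
lgbm_vs_rs = cost["Python SHAP + LightGBM"] / cost["treeshap-rs"]
rs_vs_xgb = cost["treeshap-rs"] / cost["Python SHAP + XGBoost"]
print(f"LightGBM path / treeshap-rs: {lgbm_vs_rs:.2f}x")
print(f"treeshap-rs / XGBoost path:  {rs_vs_xgb:.2f}x")
```

With the rounded totals from the table, the two ratios come out to roughly 1.5x and 2.2x.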
| Format | Parser | Versions | Notes |
|---|---|---|---|
| XGBoost JSON | parse_xgboost_json | 1.0 -- 2.x | Automatic base_score logit handling for v1.6+ |
| LightGBM text | parse_lightgbm_text | 3.x+ | Numerical splits only; categorical support planned |
| ONNX ML | parse_onnx | TreeEnsembleClassifier / Regressor | post_transform detection |
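The "base_score logit handling" note in the XGBoost row concerns logistic objectives, where the stored base_score is a probability while SHAP values for classifiers live in log-odds (margin) space. A minimal sketch of the conversion involved is shown below. The function names are hypothetical, and how `parse_xgboost_json` decides when to apply the transform is not shown here.

```python
import math


def base_score_to_margin(base_score):
    """Convert a probability-space base score to log-odds (logit)."""
    if not 0.0 < base_score < 1.0:
        raise ValueError("base_score must lie strictly between 0 and 1")
    return math.log(base_score / (1.0 - base_score))


def sigmoid(margin):
    """Map a log-odds margin back to a probability."""
    return 1.0 / (1.0 + math.exp(-margin))


# A base score of 0.5 corresponds to a margin of exactly 0.
print(base_score_to_margin(0.5))

# Hypothetical sample: base score 0.2 plus SHAP values in margin space.
base_margin = base_score_to_margin(0.2)
shap_values = [0.7, -0.1, 0.3]
probability = sigmoid(base_margin + sum(shap_values))
print(round(probability, 4))
```

Skipping the logit step would add a probability to log-odds contributions, so the reconstructed prediction would no longer match the model output.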
treeshap-cli Application layer (CLI binary)
|-- treeshap-core SHAP engine, tree IR, validation
|-- treeshap-io Model parsers (XGBoost, LightGBM, ONNX)
|-- treeshap-viz SVG visualization (waterfall, beeswarm, importance)
treeshap-py Python bindings (PyO3 + maturin)
Each library crate is independently publishable. treeshap-io and treeshap-viz depend on treeshap-core but never on each other. See docs/architecture.md for design rationale.
cargo run --example xgboost_json # XGBoost regression with plots
cargo run --example lgbm_regression # LightGBM regression
cargo run --example binary_classification # Binary classification (log-odds)
cargo run --example missing_values # NaN routing demonstration
cargo run --example linfa_native # Programmatic ensemble construction
cargo run --example onnx # ONNX model (in-memory protobuf)

cargo build --workspace --release # Build all crates
cargo test --workspace # Run all tests
cargo bench --bench shap_bench # Criterion benchmarks
cargo clippy --workspace -- -D warnings # Lint
# Python bindings
cd treeshap-py && pip install maturin && maturin develop --release

- Categorical splits are rejected with UnsupportedSplitType (planned for a future release).
- ONNX feature count is inferred from split indices; may undercount for models using a feature subset.
- Interventional mode is functional but not yet validated against Python SHAP golden files.
- Python bindings require maturin for building.
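The ONNX limitation above can be illustrated with a short sketch. It assumes the feature count is taken as the largest split feature index plus one, which is one plausible reading of "inferred from split indices" and not necessarily what `parse_onnx` does; `infer_num_features` is a hypothetical helper.

```python
def infer_num_features(split_feature_ids):
    """Infer feature count as (largest split index + 1); 0 if no splits."""
    if not split_feature_ids:
        return 0
    return max(split_feature_ids) + 1


# Hypothetical model trained on 10 features whose trees only ever
# split on features 0, 2, and 5.
trained_on = 10
split_ids = [0, 2, 5, 2, 0]

inferred = infer_num_features(split_ids)
print(inferred)  # 6

# The inferred width is smaller than the training width, so a
# 10-column input would not match the inferred feature count.
print(inferred < trained_on)  # True
```

Under this assumption, trailing features that never appear in a split are invisible to the parser, which is the undercount the limitation describes.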
Dual-licensed under MIT and Apache 2.0. Choose whichever you prefer.
- Lundberg, S.M., Erion, G., Chen, H. et al. "From local explanations to global understanding with explainable AI for trees." Nature Machine Intelligence 2, 56--67 (2020).
- Lundberg, S.M. & Lee, S.I. "A Unified Approach to Interpreting Model Predictions." NeurIPS (2017).
- Yang, J. "Fast TreeSHAP: Accelerating SHAP Value Computation for Trees." arXiv:2109.09847 (2021).