15% Random Forest and 30% Sparse Oblique speedup by restructuring Split Search in Random Histogram #356

Open
ariellubonja wants to merge 1 commit into google:main from ariellubonja:faster-select-threshold-histogram

@ariellubonja ariellubonja commented May 7, 2026

Contributor

Hi again Richard & Mathieu!

A more significant and more interesting speedup this time: hoisting loop invariants out of the Random Histogram's "scan candidate thresholds" loop gives a 15% speedup for axis-aligned RF (Random Histogram) and 30% for Sparse Oblique across many datasets. Accuracy is identical apart from floating-point differences (checked on 35 datasets x 10 different seeds).
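To illustrate the general idea of hoisting invariants out of a threshold scan, here is a minimal Python sketch. It is not the code in this PR and not YDF's API: the function names, the entropy-based gain, and the toy histograms are all assumptions made for illustration. Both versions pick the same threshold; the hoisted one computes the parent statistics once and replaces a division with a multiplication, which is the kind of change that can produce tiny floating-point differences in the score.

```python
import math
from typing import List, Sequence, Tuple


def entropy(counts: Sequence[int]) -> float:
    """Shannon entropy (in nats) of a class-count histogram."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def scan_naive(left_hists: List[List[int]], parent: List[int]) -> Tuple[int, float]:
    """Scans candidate thresholds, recomputing parent statistics every iteration."""
    best_idx, best_gain = -1, -math.inf
    for idx, left in enumerate(left_hists):
        n = sum(parent)                   # invariant: depends only on parent
        parent_entropy = entropy(parent)  # invariant: depends only on parent
        right = [p - l for p, l in zip(parent, left)]
        n_left = sum(left)
        gain = (parent_entropy
                - (n_left / n) * entropy(left)
                - ((n - n_left) / n) * entropy(right))
        if gain > best_gain:
            best_idx, best_gain = idx, gain
    return best_idx, best_gain


def scan_hoisted(left_hists: List[List[int]], parent: List[int]) -> Tuple[int, float]:
    """Same scan, with the parent-only quantities computed once before the loop."""
    n = sum(parent)
    parent_entropy = entropy(parent)
    inv_n = 1.0 / n  # one division up front, multiplications inside the loop
    best_idx, best_gain = -1, -math.inf
    for idx, left in enumerate(left_hists):
        right = [p - l for p, l in zip(parent, left)]
        n_left = sum(left)
        gain = (parent_entropy
                - n_left * inv_n * entropy(left)
                - (n - n_left) * inv_n * entropy(right))
        if gain > best_gain:
            best_idx, best_gain = idx, gain
    return best_idx, best_gain


# Hypothetical toy data: per-class counts in the parent node, and the
# cumulative per-class counts falling left of each candidate threshold.
PARENT = [60, 40]
LEFT_HISTS = [[10, 0], [30, 5], [45, 20], [55, 38]]

naive_result = scan_naive(LEFT_HISTS, PARENT)
hoisted_result = scan_hoisted(LEFT_HISTS, PARENT)
print(naive_result[0], hoisted_result[0])
```

The two functions agree on the selected candidate, and their gains match up to floating-point rounding.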

[image]

Per-function speedup:

In my CHRONO times below, Scanning Thresholds accounts for only 20% of the overall time, yet the end-to-end speedup is 30%+. This suggests that the CHRONO instrumentation itself adds overhead that distorts the relative weight of each function.

[image]
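The mismatch between the 20% share and the 30%+ end-to-end gain can be made concrete with Amdahl's law. The sketch below is only a back-of-the-envelope check; the helper name is hypothetical, and it assumes the 20% figure is the true share of runtime. Under that assumption, even removing the scanning step entirely caps the end-to-end gain at 1.25x, which is a 20% reduction in runtime, so a 30%+ gain implies the real share is larger than the profile shows.

```python
import math


def amdahl_speedup(fraction: float, local_speedup: float = math.inf) -> float:
    """End-to-end speedup when `fraction` of the runtime is sped up by `local_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# Assumed profile share of the threshold scan, taken from the comment above.
scan_fraction = 0.20

# Upper bound: the scan becomes free (infinite local speedup).
best_case = amdahl_speedup(scan_fraction)
runtime_reduction = 1.0 - 1.0 / best_case

print(f"max end-to-end speedup: {best_case:.2f}x")
print(f"max runtime reduction:  {runtime_reduction:.0%}")
```

Whether "30% speedup" is read as a throughput gain or as a runtime reduction, both bounds computed here fall below it.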

Full runtime tests + Accuracy

rstz commented May 8, 2026

Collaborator

Hi Ariel, thank you very much, this looks very interesting.

I quickly ran a very limited set of end-to-end benchmarks and haven't seen an improvement yet:

name                                       cpu/op      cpu/op    vs base
BM_Train_GBDT_Oblique_Adult                 9.733 ± 1%   9.771 ± 1%       ~ (p=0.240 n=6)
BM_Train_RF_Oblique_Adult                   3.019 ± 1%   3.036 ± 1%       ~ (p=0.180 n=6)
BM_Train_GBDT_Oblique_Synthetic/10k/20/2    3.310 ± 2%   3.326 ± 0%       ~ (p=0.093 n=6)
BM_Train_GBDT_Oblique_Synthetic/98k/20/2    35.23 ± 1%   35.42 ± 0%  +0.54% (p=0.041 n=6)
geomean                                    7.651        7.689       +0.49%

name                                       time/op       time/op     vs base
BM_Train_GBDT_Oblique_Adult                 2.877 ± 1%    2.890 ± 1%       ~ (p=0.394 n=6)
BM_Train_RF_Oblique_Adult                  618.7m ± 3%   623.5m ± 1%       ~ (p=0.240 n=6)
BM_Train_GBDT_Oblique_Synthetic/10k/20/2   592.8m ± 1%   591.4m ± 0%       ~ (p=0.589 n=6)
BM_Train_GBDT_Oblique_Synthetic/98k/20/2    6.185 ± 1%    6.224 ± 1%  +0.64% (p=0.026 n=6)
geomean                                    1.598         1.605       +0.41%

name                                       INSTRUCTIONS/op  INSTRUCTIONS/op  vs base
BM_Train_GBDT_Oblique_Adult                98.42G ± 0%       98.42G ± 0%       ~ (p=0.818 n=6)
BM_Train_RF_Oblique_Adult                  31.97G ± 0%       31.97G ± 0%       ~ (p=0.394 n=6)
BM_Train_GBDT_Oblique_Synthetic/10k/20/2   37.99G ± 0%       37.99G ± 0%       ~ (p=0.589 n=6)
BM_Train_GBDT_Oblique_Synthetic/98k/20/2   407.9G ± 0%       407.9G ± 0%       ~ (p=0.093 n=6)
geomean                                    83.56G            83.56G       -0.00%

name                                       CYCLES/op    CYCLES/op   vs base
BM_Train_GBDT_Oblique_Adult                32.43G ± 0%   32.63G ± 1%       ~ (p=0.132 n=6)
BM_Train_RF_Oblique_Adult                  10.63G ± 0%   10.64G ± 1%       ~ (p=0.310 n=6)
BM_Train_GBDT_Oblique_Synthetic/10k/20/2   11.51G ± 0%   11.56G ± 0%  +0.44% (p=0.004 n=6)
BM_Train_GBDT_Oblique_Synthetic/98k/20/2   123.6G ± 0%   124.2G ± 0%  +0.45% (p=0.002 n=6)
geomean                                    26.46G        26.57G       +0.41%

I'll re-run on your datasets and share what I see. Maybe I can also finally open-source some of the code we have for end-to-end benchmarks to make comparisons easier.
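For readers comparing numbers by hand, the geomean rows in the tables above can be recomputed from the per-benchmark values. The short sketch below does this for the cpu/op table; it assumes the left column is the baseline and the right column is this PR (the table header only says "vs base"), and the variable names are made up for the example. Because the inputs are the rounded values printed in the table, the relative change it prints may differ slightly in the last digit from the reported one.

```python
import math
from typing import Sequence


def geomean(values: Sequence[float]) -> float:
    """Geometric mean, computed in log space for numerical stability."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# cpu/op values copied from the table above, in row order.
# Assumption: first column = base, second column = PR.
base_cpu = [9.733, 3.019, 3.310, 35.23]
pr_cpu = [9.771, 3.036, 3.326, 35.42]

geo_base = geomean(base_cpu)
geo_pr = geomean(pr_cpu)
delta_pct = (geo_pr / geo_base - 1.0) * 100.0

print(f"geomean base: {geo_base:.3f}")
print(f"geomean PR:   {geo_pr:.3f}")
print(f"relative change: {delta_pct:+.2f}%")
```

A geometric mean is used here because it weights each benchmark's relative change equally, regardless of how long that benchmark runs.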

@ariellubonja

Contributor Author

Hi Richard! I figured it out! Sparse Oblique with Random Histograms is not in your YDF: I had added it myself and forgot. It seems your YDF's Sparse Oblique supports only Exact. Good news: I re-ran with axis-aligned splits and the benefit is there! I have edited the PR.

@ariellubonja changed the title from "30% end-to-end speedup by restructuring Split Search in Sparse Oblique Random" to "15% Random Forest and 30% Sparse Oblique speedup by restructuring Split Search in Random Histogram" on May 10, 2026

@ariellubonja

Contributor Author

It would also be interesting to check whether the Regression branch would benefit from this or a similar approach.

@ariellubonja

Contributor Author

Hi Richard! Can you see the speedup for this PR now?

@ariellubonja ariellubonja deleted the faster-select-threshold-histogram branch May 27, 2026 18:09
@ariellubonja ariellubonja restored the faster-select-threshold-histogram branch May 30, 2026 16:21
@ariellubonja ariellubonja reopened this Jun 11, 2026