
Implement Feistel-based permutation for LHS and convert samplers to stateless RNG#32833

Open
zachmprince wants to merge 19 commits into idaholab:next from zachmprince:new_latin_sampling
Conversation

@zachmprince
Contributor

@zachmprince zachmprince commented Apr 24, 2026

Closes #32775, refs #32194

Reason

Latin Hypercube Sampling (LHS) previously relied on a stateful shuffle() call
inside sampleSetUp() to permute bin assignments. This coupling between the
generator state-machine and the sample-matrix loop made LHS incompatible with
the stateless RNG direction, prevented parallel or out-of-order sample access,
and required save/restore of generator state around every call to
sampleSetUp/sampleTearDown. The same stateful pattern was shared by the
MCMC and active-learning samplers, making the entire Sampler hierarchy harder
to reason about and test.

Design

MooseRandomPerturbation (new framework utility)

A header-only class implementing a keyed pseudo-random permutation of the
integers [0, n) using a balanced Feistel network:

  • The 64-bit seed is split into two 32-bit subkeys; each round mixes the
    half-block with both subkeys and a golden-ratio-derived constant, then applies
    the Murmur3/degski avalanche hash for bit diffusion.
  • Because n need not be a power of two, the network operates on the smallest
    padded domain 2^(2*half_bits) >= n and uses cycle-walking to reject
    out-of-range outputs.
  • The permutation is bijective (every input maps to a unique output) and
    invertible (invert(permute(x)) == x).
  • Verified by five new unit tests in unit/src/MooseRandomPerturbationTest.C
    (bijection, invertibility, bit-width range, seed uniqueness, reproducibility).
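The scheme above can be sketched in Python. This is an illustrative stand-in, not the framework's actual implementation: the round count, subkey mixing, and half-width computation are assumptions, though the avalanche function uses the standard Murmur3 32-bit finalizer constants.

```python
def _avalanche(x: int) -> int:
    """Murmur3-style 32-bit finalizer for bit diffusion."""
    x &= 0xFFFFFFFF
    x ^= x >> 16
    x = (x * 0x85EBCA6B) & 0xFFFFFFFF
    x ^= x >> 13
    x = (x * 0xC2B2AE35) & 0xFFFFFFFF
    x ^= x >> 16
    return x


class FeistelPermutation:
    """Keyed bijection on [0, n) via a balanced Feistel network with cycle-walking."""

    GOLDEN = 0x9E3779B9  # golden-ratio-derived constant
    ROUNDS = 4           # illustrative round count

    def __init__(self, n: int, seed: int):
        self.n = n
        # Split the 64-bit seed into two 32-bit subkeys.
        self.k0 = seed & 0xFFFFFFFF
        self.k1 = (seed >> 32) & 0xFFFFFFFF
        # Smallest balanced padded domain 2^(2*half_bits) >= n.
        self.half_bits = max(1, (max(n - 1, 1).bit_length() + 1) // 2)
        self.mask = (1 << self.half_bits) - 1

    def _round(self, half: int, r: int) -> int:
        # Mix the half-block with both subkeys and the golden-ratio constant.
        key = (self.k1 + r * self.GOLDEN) & 0xFFFFFFFF
        return _avalanche(half ^ self.k0 ^ key) & self.mask

    def _encrypt_once(self, x: int) -> int:
        left, right = x >> self.half_bits, x & self.mask
        for r in range(self.ROUNDS):
            left, right = right, left ^ self._round(right, r)
        return (left << self.half_bits) | right

    def _decrypt_once(self, x: int) -> int:
        left, right = x >> self.half_bits, x & self.mask
        for r in reversed(range(self.ROUNDS)):
            left, right = right ^ self._round(left, r), left
        return (left << self.half_bits) | right

    def permute(self, x: int) -> int:
        # Cycle-walk: re-encrypt until the output falls back inside [0, n).
        y = self._encrypt_once(x)
        while y >= self.n:
            y = self._encrypt_once(y)
        return y

    def invert(self, y: int) -> int:
        # Walk the cycle backwards; undoes permute() exactly.
        x = self._decrypt_once(y)
        while x >= self.n:
            x = self._decrypt_once(x)
        return x
```

Because each Feistel round is its own inverse given the round key, decryption simply replays the rounds in reverse, and cycle-walking preserves the bijection on [0, n) since the padded-domain cipher is itself a permutation.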

Redesigned LatinHypercubeSampler

The sampler now uses two stateless generators:

  • Generator 0 — draws the uniform point inside the selected bin.
  • Generator 1 — seeds one MooseRandomPerturbation permuter per column.

In computeSampleRow(row, col) the bin for sample row in column col is
permuter[col].permute(row), so the full LHS sample matrix is determined
entirely by the two generator seeds. No state needs to be saved, restored, or
advanced around a setup callback. Permuters are initialised once in
executeTearDown() after the generator has been advanced to the correct offset.
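A minimal Python sketch of this structure follows. All names here are illustrative: a seeded Fisher-Yates shuffle stands in for the per-column MooseRandomPerturbation permuter, and seeded `random.Random` instances stand in for the framework's two stateless generators.

```python
import random


def lhs_matrix(nrow: int, ncol: int, seed: int) -> list:
    """Sketch of stateless LHS: every entry is a pure function of the seed."""
    # "Generator 1" role: one bin permutation per column, fixed up front.
    # (A seeded shuffle stands in for the per-column Feistel permuter.)
    permuters = []
    for col in range(ncol):
        perm = list(range(nrow))
        random.Random((seed << 20) | col).shuffle(perm)
        permuters.append(perm)

    # "Generator 0" role: a uniform draw keyed by (row, col), so any entry
    # can be computed independently, in any order, on any rank.
    def compute_sample(row: int, col: int) -> float:
        bin_index = permuters[col][row]  # bin = permuter[col].permute(row)
        u = random.Random((seed << 40) | (row * ncol + col)).random()
        return (bin_index + u) / nrow    # uniform point inside the selected bin

    return [[compute_sample(r, c) for c in range(ncol)] for r in range(nrow)]
```

The key property is that `compute_sample(row, col)` has no side effects, so out-of-order and parallel access need no generator save/restore.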

Sampler base-class cleanup

sampleSetUp() and sampleTearDown() (per-row callbacks that ran inside the
sample-matrix loop and required generator save/restore) have been removed. The
simpler executeSetUp() / executeTearDown() pair (called once before/after
the entire execute()) is sufficient for all remaining use cases.

The CommMethod enum, shuffle() template, and saveGeneratorState() /
restoreGeneratorState() methods have been removed along with them.

MCMC and active-learning samplers

PMCMCBase and every derived class that previously relied on sampleSetUp to
seed proposals now use executeSetUp() instead. The proposeSamples() interface
no longer takes a seed_value argument; three new helper methods (random(),
randomIndex(), randomIndexPair()) encapsulate stateless index-based draws so
each sampler can advance its own _rand_index counter without touching generator
state directly.
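The helper pattern can be illustrated in Python (hypothetical sketch: the names mirror the PR's random(), randomIndex(), and randomIndexPair(), but the real implementation is C++ against the framework's stateless generator, not Python's `random`):

```python
import random


class StatelessProposalRNG:
    """Each draw is keyed by (seed, _rand_index); only the counter advances,
    so no generator state is ever saved or restored."""

    def __init__(self, seed: int):
        self._seed = seed
        self._rand_index = 0

    def _draw(self) -> random.Random:
        rng = random.Random((self._seed << 32) | self._rand_index)
        self._rand_index += 1
        return rng

    def random(self) -> float:
        # Uniform draw in [0, 1).
        return self._draw().random()

    def randomIndex(self, n: int) -> int:
        # Integer in [0, n).
        return self._draw().randrange(n)

    def randomIndexPair(self, n: int) -> tuple:
        # Two distinct indices in [0, n): draw the second from [0, n-1)
        # and shift it past the first to guarantee distinctness.
        i = self.randomIndex(n)
        j = self.randomIndex(n - 1)
        return (i, j) if j < i else (i, j + 1)
```

Two instances with the same seed replay the same sequence of draws, which is what makes the MCMC proposal chain reproducible without touching generator state.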

AdaptiveImportanceSampler had an off-by-one in its sample index that is also
corrected here.

Stateless conversion for remaining samplers

MorrisSampler and NestedMonteCarloSampler were also converted from the
stateful shuffle()/sampleSetUp() pattern to the stateless generator API.

Impact

Framework API changes

| Removed | Replacement |
| --- | --- |
| `Sampler::sampleSetUp(SampleMode)` | `Sampler::executeSetUp()` |
| `Sampler::sampleTearDown(SampleMode)` | `Sampler::executeTearDown()` |
| `Sampler::shuffle(begin, end, generator_index)` | `MooseRandomPerturbation` |
| `Sampler::saveGeneratorState()` / `restoreGeneratorState()` | not needed with stateless RNG |
| `Sampler::CommMethod` enum (`LOCAL`, `SEMI_LOCAL`, `NONE`) | removed |
| `PMCMCBase::proposeSamples(seed_value, ...)` | `proposeSamples(...)` (no seed argument) |

New framework utility: framework/include/utils/MooseRandomPerturbation.h
(header-only, no new .C file, no registration macro required).

Gold files for all LHS tests, several MCMC tests, and surrogate-training tests
that pass through an LHS sampler have been regolded because the new Feistel
permutation and stateless generator advancement produce a different (but equally
valid and reproducible) sample sequence.

To-Do

Things to do once reviewers are satisfied with changes.

Update figures, tables, etc. in documentation regarding LHS sampling changes:

  • modules/stochastic_tools/examples/parameter_study.md
  • modules/stochastic_tools/examples/nonlin_parameter_study.md
  • modules/stochastic_tools/examples/sobol.md
  • modules/stochastic_tools/examples/poly_regression_surrogate.md
  • modules/stochastic_tools/examples/pod_rb_surrogate.md
  • modules/stochastic_tools/examples/combined_example_2d_trans_diff.md
  • modules/stochastic_tools/examples/cross_validation.md
  • modules/combined/examples/stm_thermomechanics.md

Update applications:

  • Grizzly

This works and is consistent in parallel, but fails most tests due to the algorithmic change.

Integration with MooseRandomStateless will probably require more regolding, so I'm saving this part for later.

Refs idaholab#32194
@zachmprince
Contributor Author

LHS quality experiment: Feistel permutation vs. Fisher-Yates shuffle

modules/stochastic_tools/examples/lhs/lhs_experiment.{py,ipynb} contains a
numerical comparison of four samplers run against a parametric test function
family.

Test function. Each trial draws an nrow x ndim sample matrix and evaluates

$$ f(\mathbf{x}; \lambda) = \sum_c \frac{e^{x_c}}{2^{c+1}} + \lambda \, \frac{\sum_{i<j} L_2(x_i)\, L_2(x_j)}{\sqrt{d(d-1)/2}} $$

where $L_2(x) = \sqrt{12}(x - 0.5)$ is the centered Legendre polynomial. The
additive term ($\lambda = 0$) is well-suited to LHS; larger $\lambda$ values add
pairwise interaction that LHS cannot fully stratify, stress-testing each
sampler's behavior when the function is less cooperative. The exact mean of $f$
over $[0,1]^d$ is known analytically, so RMSE is computed directly.

Parameters. 4 samplers x 4 dimensions (4, 8, 16, 32) x 10 sample counts
(8 -- 4096, log-spaced) x 6 $\lambda$ values (0, 0.1, 0.3, 1, 3, 10) x 100
independent trials = 96 000 runs. The Feistel LHS calls the actual
stochastic_tools-opt binary; the others use scipy.stats.qmc.

Result. Across all 240 $(\lambda, d, N)$ combinations the ratio of Feistel
RMSE to Fisher-Yates RMSE has a mean of 0.99 and a standard deviation of
0.10, with a range of 0.81 -- 1.28. The extreme ratios appear only at the
smallest sample counts (nrow = 8) and highest dimensions (ndim = 32), where
both methods have large absolute RMSE and the ratio is dominated by Monte Carlo
noise across the 100 trials. At $N \ge 32$ the two LHS variants are
indistinguishable. The table below shows the $\lambda = 0$, $d = 8$ slice as a
representative example:

| nrow | Monte Carlo | LHS (Fisher-Yates) | LHS (Feistel) | Sobol |
| --- | --- | --- | --- | --- |
| 8 | 0.110472 | 0.012540 | 0.013309 | 0.011746 |
| 16 | 0.064504 | 0.004484 | 0.004344 | 0.004860 |
| 32 | 0.049537 | 0.001694 | 0.001605 | 0.001547 |
| 64 | 0.029627 | 0.000548 | 0.000574 | 0.000454 |
| 128 | 0.025555 | 0.000186 | 0.000216 | 0.000186 |
| 256 | 0.015685 | 0.000075 | 0.000070 | 0.000053 |
| 512 | 0.012246 | 0.000027 | 0.000027 | 0.000022 |
| 1024 | 0.009555 | 0.000007 | 0.000009 | 0.000011 |
| 2048 | 0.006514 | 0.000003 | 0.000003 | 0.000005 |
| 4096 | 0.003950 | 0.000001 | 0.000001 | 0.000000 |

Both LHS variants produce the correct $O(N^{-1})$ convergence rate and nearly
identical absolute errors, confirming that the Feistel-network permutation is a
drop-in replacement for the Fisher-Yates shuffle used by scipy (and by the
previous stochastic_tools implementation) with no measurable quality penalty.

@zachmprince
Contributor Author

Here is a plot of the experiment results. The LHS lines (blue and red) are difficult to see since they are right on top of each other.
[plot of experiment results]

@moosebuild
Contributor

Job Precheck, step Python: black format on af7f601 wanted to post the following:

Python black formatting

Your code requires style changes.

A patch was generated and copied here.

You can directly apply the patch by running the following at the top level of your repository:

curl -s https://mooseframework.inl.gov/docs/PRs/32833/black/black.patch | git apply -v

Alternatively, you can run the following at the top level of your repository:

black --config pyproject.toml --workers 1 .

@moosebuild
Contributor

Job Precheck, step Clang format on f298501 wanted to post the following:

Your code requires style changes.

A patch was auto-generated and copied here.

You can directly apply the patch by running the following at the top level of your repository:

curl -s https://mooseframework.inl.gov/docs/PRs/32833/clang_format/style.patch | git apply -v

Alternatively, with your repository up to date and in the top level of your repository:

git clang-format 162e78c71709df5be85663d216c89383cf162192

@zachmprince zachmprince force-pushed the new_latin_sampling branch 3 times, most recently from 5ba8c63 to 9bff5a0 Compare April 24, 2026 21:08
@moosebuild
Contributor

moosebuild commented Apr 24, 2026

Job Documentation, step Docs: sync website on 79ac28e wanted to post the following:

View the site here

This comment will be updated on new commits.

@moosebuild
Contributor

moosebuild commented Apr 27, 2026

Job Coverage, step Generate coverage on 79ac28e wanted to post the following:

Framework coverage

| | 4e53b3 (Total) | #32833 79ac28 (Total) | +/- | New |
| --- | --- | --- | --- | --- |
| Rate | 85.87% | 85.87% | -0.00% | 100.00% |
| Hits | 132516 | 132544 | +28 | 71 |
| Misses | 21798 | 21809 | +11 | 0 |

Diff coverage report

Full coverage report

Modules coverage

Stochastic tools

| | 4e53b3 (Total) | #32833 79ac28 (Total) | +/- | New |
| --- | --- | --- | --- | --- |
| Rate | 90.68% | 90.69% | +0.02% | 100.00% |
| Hits | 8636 | 8615 | -21 | 136 |
| Misses | 888 | 884 | -4 | 0 |

Diff coverage report

Full coverage report

Full coverage reports

Reports

This comment will be updated on new commits.

@zachmprince zachmprince marked this pull request as ready for review April 28, 2026 19:45
Member

@lindsayad lindsayad left a comment


Framework portion looks good. I'll let @grmnptr handle the stochastic tools review

Comment thread framework/include/samplers/Sampler.h Outdated
Comment thread framework/include/utils/MooseRandomPerturbation.h
@moosebuild
Contributor

Job Test, step Results summary on 79ac28e wanted to post the following:

Framework test summary

Compared against 4e53b3b in job civet.inl.gov/job/3786076.

No change

Modules test summary

Compared against 4e53b3b in job civet.inl.gov/job/3786076.

No added tests

Run time changes

| Test | Base (s) | Head (s) | +/- | Base (MB) | Head (MB) |
| --- | --- | --- | --- | --- | --- |
| stochastic_tools/test:reporters/BFActiveLearning.sampling_bf/MultipleProc_MultipleRow_Ufunction | 9.56 | 4.64 | -51.47% | 1295.51 | 310.88 |



Development

Successfully merging this pull request may close these issues.

Sampler::execute() is not called for EXEC_INITIAL, leaving executeSetUp() unrun on the first timestep

3 participants