Use standard NumPy random number generators in metric spaces to limit RAM usage #179

@rhugonnet

Right now, NumPy's standard random number generators (rng = np.random.default_rng(seed=)) do not work when passed to ProbabilisticMetricSpace or RasterMetricSpace. Only the legacy ones do (the equivalent of np.random.seed(), now defined as np.random.RandomState), but those are probably not that useful in our case, since we don't need to exactly reproduce random sampling from old scripts. On top of that, the legacy versions consume a lot of memory when drawing a random choice without replacement, which is exactly what we do: numpy/numpy#14169.

So, for instance, if we only want to use 10,000 samples out of 1 billion for the variogram estimation, the legacy version will still create an array of 1 billion points in the background, using tons of RAM 😅.
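
To make the difference concrete, here is a minimal sketch of how subsampling could accept either kind of generator. The helper name subsample_indices and its parameters are hypothetical and only for illustration, not the existing API of the metric space classes. The only NumPy calls used are np.random.default_rng, Generator.choice and RandomState.choice.

```python
import numpy as np


def subsample_indices(n_total, n_samples, random_state=None):
    """Draw n_samples distinct indices from range(n_total).

    Hypothetical helper for illustration only; it is not part of any
    existing library API.
    """
    if isinstance(random_state, np.random.RandomState):
        # Legacy path: RandomState.choice(replace=False) permutes the whole
        # population internally, so memory grows with n_total.
        return random_state.choice(n_total, size=n_samples, replace=False)

    # default_rng accepts None, an integer seed, or an existing Generator
    # (which it returns unchanged). Generator.choice does not need to
    # materialize the full population when the sample is small.
    rng = np.random.default_rng(random_state)
    return rng.choice(n_total, size=n_samples, replace=False)


# Small sizes here so the example runs quickly on either path.
idx_new = subsample_indices(1_000_000, 100, random_state=42)
idx_legacy = subsample_indices(1_000, 100, random_state=np.random.RandomState(42))
```

Passing an integer seed or a Generator goes through the memory-friendly path, while an explicit RandomState keeps the legacy behaviour for anyone who still relies on it.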

Will try to fix this at the same time as #178!
