performance improvements #17

@d-v-b

Running this script: https://github.com/zarr-developers/cast-value.py/blob/main/examples/benchmarks/bench_numpy_vs_rust.py

compares a Python / NumPy implementation of cast_value against the Rust implementation defined in this repo. On my AMD machine, I see numbers like this:

uv run examples/benchmarks/bench_numpy_vs_rust.py
Array size: 1,000,000 elements

Configuration                                    Impl   Throughput     Memory
-----------------------------------------------------------------------------
float64 -> float32 (simple narrowing)           numpy       6.4G/s     3.8 MB
float64 -> float32 (simple narrowing)            rust     972.0M/s     3.8 MB

float64 -> int32 (round nearest-even)           numpy     354.3M/s    16.2 MB
float64 -> int32 (round nearest-even)            rust     141.1M/s     3.8 MB

float64 -> float32 (round towards-zero)         numpy      18.8M/s    72.5 MB
float64 -> float32 (round towards-zero)          rust     119.7M/s     3.8 MB

float64 -> uint8 (clamp, SIMD path)             numpy     318.0M/s    17.2 MB
float64 -> uint8 (clamp, SIMD path)              rust       5.3G/s   976.9 KB

float64 -> int32 (scalar_map: NaN/Inf/-Inf)     numpy     228.9M/s    16.2 MB
float64 -> int32 (scalar_map: NaN/Inf/-Inf)      rust     172.5M/s     3.8 MB

Rust is slower than NumPy in three of the five configurations above: simple narrowing, round nearest-even, and scalar_map. In each of those cases we should check for optimization opportunities. Memory usage looks good across the board, though.
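
For anyone who wants to sanity-check the NumPy side without the full benchmark, here is a minimal sketch of two of the slower configurations. It is not taken from bench_numpy_vs_rust.py: the helper names `cast_round_nearest_even` and `throughput` are hypothetical, and the timing approach (best of a few runs with `time.perf_counter`) is an assumption about what a reasonable measurement looks like, not a description of how the script measures.

```python
import time

import numpy as np


def cast_round_nearest_even(arr: np.ndarray, dtype) -> np.ndarray:
    """Hypothetical baseline: round half to even, then cast to an integer dtype.

    np.rint rounds to the nearest integer with ties going to the even value.
    """
    return np.rint(arr).astype(dtype)


def throughput(fn, arr: np.ndarray, repeats: int = 5) -> float:
    """Hypothetical helper: best-of-N throughput in elements per second."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(arr)
        best = min(best, time.perf_counter() - start)
    return arr.size / best


# Same array size as the report above; the value range is an arbitrary choice
# that keeps every element representable in int32.
rng = np.random.default_rng(0)
data = rng.uniform(-1e6, 1e6, size=1_000_000)

narrowing = throughput(lambda a: a.astype(np.float32), data)
rounding = throughput(lambda a: cast_round_nearest_even(a, np.int32), data)

print(f"float64 -> float32 (simple narrowing): {narrowing / 1e6:.1f}M elements/s")
print(f"float64 -> int32 (round nearest-even): {rounding / 1e6:.1f}M elements/s")
```

Absolute numbers will differ by machine and NumPy build, so this is only useful for comparing against the Rust path on the same hardware.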
