@inbounds not propagating correctly

`@inbounds` applied to a kernel function definition has no effect.

Additionally, `@inbounds` does not propagate through function calls made within a kernel, for example a call to `zip()`.

The following benchmarks, from https://github.com/torrance/AMDGPU-MWE/blob/main/inbounds.jl, demonstrate the performance penalty. Note that the third benchmark is likely penalised twice: once by the bounds checks, and again because the call to `zip()` isn't inlined.

`function @inbounds` => `@inbounds` annotated at function definition
`internal @inbounds` => `@inbounds` annotated at lines with indexing operations
`using zip()` => using a `zip()` to iterate and index into arrays
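
For reference, here is a rough sketch of where the annotation sits in each of the three variants. These are plain CPU functions with made-up names, not the actual kernels from the linked MWE (which I'm assuming follow the same shape), so they only show the placement of `@inbounds` and don't reproduce the timings.

```julia
# Hypothetical stand-ins for the three benchmarked variants.
# Plain Julia functions, NOT the AMDGPU kernels from the linked MWE.

# "Function @inbounds": the macro wraps the whole function definition.
@inbounds function add_function_inbounds!(out, a, b)
    for i in eachindex(out, a, b)
        out[i] = a[i] + b[i]
    end
    return out
end

# "Internal @inbounds": the macro sits on the line that does the indexing.
function add_internal_inbounds!(out, a, b)
    for i in eachindex(out, a, b)
        @inbounds out[i] = a[i] + b[i]
    end
    return out
end

# "Using zip()": the inputs are iterated through zip(), so element access
# happens inside zip's iterate methods rather than in this function body.
function add_zip!(out, a, b)
    @inbounds for (i, (x, y)) in enumerate(zip(a, b))
        out[i] = x + y
    end
    return out
end
```

All three compute the same result; they differ only in whether bounds checks end up being elided.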

```text
Function @inbounds
BenchmarkTools.Trial: 18 samples with 1 evaluation.
 Range (min … max):  283.219 ms … 287.235 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     283.964 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   284.278 ms ± 874.447 μs  ┊ GC (mean ± σ):  0.10% ± 0.29%

            ▁█                                                   
  ▄▁▁▁▁▁▁▁▄▄██▁▁▁▁▄▄▁▁▄▁▁▁▄▁▁▁▁▁▁▁▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄ ▁
  283 ms           Histogram: frequency by time          287 ms <

 Memory estimate: 6.21 MiB, allocs estimate: 406760.

Internal @inbounds
BenchmarkTools.Trial: 36 samples with 1 evaluation.
 Range (min … max):  141.340 ms … 141.616 ms  ┊ GC (min … max): 1.78% … 0.00%
 Time  (median):     141.471 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   141.469 ms ±  69.181 μs  ┊ GC (mean ± σ):  0.10% ± 0.42%

           ▃      ▃▃ ▃        ▃▃ █        ▃                      
  ▇▁▁▁▁▁▁▇▇█▁▇▇▁▁▁██▇█▁▁▇▇▁▁▁▁██▇█▇▁▁▁▁▇▁▁█▇▁▇▇▁▁▁▇▁▇▁▇▁▁▁▁▇▁▁▇ ▁
  141 ms           Histogram: frequency by time          142 ms <

 Memory estimate: 3.06 MiB, allocs estimate: 200490.

Using zip()
BenchmarkTools.Trial: 16 samples with 1 evaluation.
 Range (min … max):  318.848 ms … 319.049 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     318.942 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   318.950 ms ±  61.016 μs  ┊ GC (mean ± σ):  0.10% ± 0.28%

  ▁       ▁    ▁   █▁   ▁ ▁       ▁  ▁       ▁█        ▁     ▁▁  
  █▁▁▁▁▁▁▁█▁▁▁▁█▁▁▁██▁▁▁█▁█▁▁▁▁▁▁▁█▁▁█▁▁▁▁▁▁▁██▁▁▁▁▁▁▁▁█▁▁▁▁▁██ ▁
  319 ms           Histogram: frequency by time          319 ms <
```

@inbounds not propagating correctly #342

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

@inbounds not propagating correctly #342

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions