perf: bindings implementation results in costly custom function calls on hot path 

Calling custom functions in a hot path (read: thousands of iterations) allocates a surprisingly large amount of memory for something that looks trivial and cheap to policy authors. I previously added a [benchmark test case](https://github.com/open-policy-agent/opa/blob/682fbc1d589678873b0de6c2d69d284ccc0e3466/v1/rego/rego_bench_test.go#L74) to OPA, and while more factors are at play here, the most obvious memory hog is the implementation of the bindings "array hashmap". Ironically, this implementation is itself an optimization: an array is used for the first 16 items before switching over to a map. I have no reason to doubt that this helps in cases where many bindings need to be handled. But for a rule like the one in the example below, where only a single binding is actually used, each iteration allocates an array with room for 16 bindings, and only 1 of those slots is ever used!

```rego
refs contains value if {
    walk(input, [_, value]) # thousands of items
    is_ref(value)
}

is_ref(value) if value.type == "ref"
is_ref(value) if value[0].type == "ref"
```
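To make the allocation pattern concrete, here is a minimal Go sketch of an "array first, map later" container. It is not OPA's actual code: the names `arrayHashmap`, `entry`, `maxLinearItems` and `reservedVsUsed` are made up for illustration, string keys and values stand in for the real terms, and allocating the array on the first `Put` is an assumption based on the profile in this issue.

```go
package main

import (
	"fmt"
	"unsafe"
)

// maxLinearItems mirrors the threshold of 16 items described in this issue.
const maxLinearItems = 16

// entry is a hypothetical key/value pair; real bindings hold terms, not strings.
type entry struct {
	key, value string
}

// arrayHashmap is an illustrative stand-in, not OPA's implementation.
type arrayHashmap struct {
	n int                    // number of array slots in use
	a *[maxLinearItems]entry // fixed-size array used for the first 16 items
	m map[string]string      // used once the array is full
}

// Put stores a key. The first call allocates all 16 array slots at once.
func (b *arrayHashmap) Put(key, value string) {
	if b.m != nil {
		b.m[key] = value
		return
	}
	if b.a == nil {
		b.a = new([maxLinearItems]entry) // room for 16, even if 1 is used
	}
	for i := 0; i < b.n; i++ {
		if b.a[i].key == key {
			b.a[i].value = value
			return
		}
	}
	if b.n < maxLinearItems {
		b.a[b.n] = entry{key, value}
		b.n++
		return
	}
	// Array is full: move everything into a map and drop the array.
	b.m = make(map[string]string, maxLinearItems+1)
	for _, e := range b.a {
		b.m[e.key] = e.value
	}
	b.m[key] = value
	b.a = nil
}

// Get does a linear scan of the array, or a map lookup after the switch.
func (b *arrayHashmap) Get(key string) (string, bool) {
	if b.m != nil {
		v, ok := b.m[key]
		return v, ok
	}
	for i := 0; i < b.n; i++ {
		if b.a[i].key == key {
			return b.a[i].value, true
		}
	}
	return "", false
}

// reservedVsUsed returns the bytes reserved by the array and the bytes
// that `used` entries would actually need.
func reservedVsUsed(used int) (reserved, needed uintptr) {
	return unsafe.Sizeof([maxLinearItems]entry{}), uintptr(used) * unsafe.Sizeof(entry{})
}

func main() {
	b := &arrayHashmap{}
	b.Put("value", "ref") // a single binding, as in is_ref(value)
	reserved, needed := reservedVsUsed(b.n)
	fmt.Printf("reserved %d bytes, needed %d bytes\n", reserved, needed)
}
```

With one binding per call, the reserved array is 16 times larger than what is needed, whatever the size of a single entry turns out to be.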

Running `BenchmarkCustomFunctionInHotPath` with pprof's memory profiler enabled leaves no doubt as to where most of the cost is incurred. 

```
      flat  flat%   sum%        cum   cum%
    0.65GB 37.79% 37.79%     0.65GB 37.79%  github.com/open-policy-agent/opa/v1/topdown.(*bindingsArrayHashmap).Put
    0.27GB 15.69% 53.48%     1.48GB 85.73%  github.com/open-policy-agent/opa/v1/topdown.evalFunc.evalOneRule
    0.22GB 12.87% 66.35%     0.30GB 17.50%  github.com/open-policy-agent/opa/v1/topdown.evalFunc.evalOneRule.func1
    0.18GB 10.19% 76.53%     1.13GB 65.47%  github.com/open-policy-agent/opa/v1/topdown.(*eval).biunifyTermsRec
    0.13GB  7.76% 84.29%     1.70GB 98.09%  github.com/open-policy-agent/opa/v1/topdown.(*eval).biunifyArraysRec
    0.08GB  4.54% 88.83%     0.08GB  4.54%  github.com/open-policy-agent/opa/v1/topdown.newBindings (inline)
    0.06GB  3.50% 92.33%     1.71GB 98.60%  github.com/open-policy-agent/opa/v1/topdown.(*eval).evalStep
    0.05GB  2.91% 95.24%     1.71GB 98.60%  github.com/open-policy-agent/opa/v1/topdown.(*eval).evalExpr
    0.02GB   0.9% 96.14%     0.02GB   0.9%  github.com/open-policy-agent/opa/v1/topdown.evalFunc.evalCache
    0.01GB  0.68% 96.82%     0.01GB  0.68%  github.com/open-policy-agent/opa/v1/ast.(*trieTraversalResult).Add
```

In other words, we're allocating about 650 MB for bindings when only about 40 MB (one slot out of 16) is actually needed. While this may be an extreme case, it's not a contrived one: it was originally observed in Regal, using quite real Rego :)

We should look into alternative implementations for bindings. Ideally one where we allocate exactly what we need upfront (could the compiler tell us?), but if that's not possible, at least one with a much better ratio of used to allocated space than the current one.
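As a rough illustration of the "allocate exactly what we need" idea, here is a Go sketch of a slice-backed container sized from a hint. Everything in it is hypothetical: `sizedBindings`, `binding` and `newSizedBindings` are invented names, and the `hint` argument assumes the caller (for example the compiler) could supply the number of variables a rule binds, which is exactly the open question above.

```go
package main

import "fmt"

// binding is a hypothetical key/value pair; real bindings hold terms, not strings.
type binding struct {
	key, value string
}

// sizedBindings is an illustrative alternative, not a proposal for the final design.
type sizedBindings struct {
	items []binding
}

// newSizedBindings reserves room for exactly `hint` bindings.
// hint must be non-negative; where it would come from is an assumption.
func newSizedBindings(hint int) *sizedBindings {
	return &sizedBindings{items: make([]binding, 0, hint)}
}

// Put overwrites an existing key or appends a new one. If the hint was
// too small, append grows the slice, so a wrong hint costs memory, not correctness.
func (b *sizedBindings) Put(key, value string) {
	for i := range b.items {
		if b.items[i].key == key {
			b.items[i].value = value
			return
		}
	}
	b.items = append(b.items, binding{key, value})
}

// Get does a linear scan, which is cheap for a handful of bindings.
func (b *sizedBindings) Get(key string) (string, bool) {
	for i := range b.items {
		if b.items[i].key == key {
			return b.items[i].value, true
		}
	}
	return "", false
}

func main() {
	b := newSizedBindings(1) // is_ref(value) binds a single variable
	b.Put("value", "ref")
	fmt.Println(len(b.items), cap(b.items)) // prints "1 1"
}
```

This sketch leaves out the switch to a map for large binding sets. A real replacement would likely still need that, and would have to be measured against `BenchmarkCustomFunctionInHotPath`.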

perf: bindings implementation results in costly custom function calls on hot path #7266

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

perf: bindings implementation results in costly custom function calls on hot path #7266

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions