The Brilirs memory extension implementation is very slow #190

@Pat-Lafon

Description

@Pat-Lafon

As was shown in #189 (comment), the current memory extension implementation for brilirs is surprisingly slow.

Currently, there are no good benchmarks that are memory-intensive enough, which makes it difficult to know what to optimize for. The closest program seems to be test/mem/alloc_large.bril, which only allocates and frees same-sized chunks of memory in a loop.

The current implementation of Heap which supports the memory extension in brilirs is ported from brili and could be greatly improved.

A couple of possible pain points include:

  • The Heap is implemented as a map from the "base" of the pointer to a Vec<Value>. This is nice in that it grows and shrinks with the amount of memory in use at any given time. However, the use of maps like HashMap<usize, Vec<Value>> has historically been a performance hit in brilirs due to needing to hash the key and perform the lookup.
    • Ideally we would compress the Heap into either a Vec<Vec<Value>> or a Vec<Value>, with the "base" of the pointer just indexing into the start of its allocation in the array. In either case we will want the ability to reuse indexes as allocations are freed up.
    • The implementation using Vec<Vec<Value>> is probably slower than the alternative because it requires two array accesses to read or write a value. However, it's probably easier to implement.
    • Vec<Value> is more along the lines of implementing something like malloc which gives all of the performance benefits / fragmentation complexities involved.
      • Unlike malloc however, one could implement a form of garbage collection and reassign the indexes of pointers since this implementation detail is not exposed to the user. It's not clear to me how this would affect performance.
    • Regardless of the implementation, the Heap still needs to detect buffer over/underflows, use-after-frees, use of uninitialized memory, and memory leaks.
  • The current implementation allocates a new Vec<Value> on every call to alloc. Ideally, brilirs should be able to reuse previously freed allocations. This would especially target alloc_large.bril and potentially provide a significant improvement over brili.
    • Two possible solutions for this are either the previously mentioned Heap as just one large Vec<Value> or using something like a slab allocator and delegating this reuse to an external library.
  • Unlike with stack allocation, where it is statically known how many variables are in scope, memory allocation is dynamic, which limits the interpreter's ability to pre-allocate capacity for the Heap to use during the course of interpretation.
    • brilirs already employs some static analysis for type checking, and it would be interesting if it could use a data flow analysis to get a very rough estimate of whether it should start out with a small or large Heap size. Some programs don't use the memory extension, or use it sparingly, which is the default case. On the other hand, we also want to optimize for programs which make intense use of memory.