Bring benchmarks up to paper quality:

- [ ] Multiple trials with average and stddev
- [ ] Warmup trials
- [ ] Report memory using profiler instead of (unreliable) `process.memoryUsage().heapUsed`
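
The first two items could be covered by a small harness along these lines. This is only a sketch: `summarize` and `runBenchmark` are hypothetical names, not existing functions in the benchmark code, and the defaults (5 warmup runs, 30 timed trials) are placeholder assumptions. It handles synchronous functions only and does not address the memory-profiling item.

```javascript
// Hypothetical harness: `summarize` and `runBenchmark` are illustrative names,
// not part of the existing benchmark code.
const { performance } = require('node:perf_hooks');

// Mean and sample standard deviation (n - 1 denominator) of an array of numbers.
function summarize(samples) {
  const n = samples.length;
  const mean = samples.reduce((sum, x) => sum + x, 0) / n;
  const variance =
    n > 1 ? samples.reduce((sum, x) => sum + (x - mean) ** 2, 0) / (n - 1) : 0;
  return { n, mean, stddev: Math.sqrt(variance) };
}

// Run `fn` `warmup` times without timing (to let the JIT settle), then time
// `trials` further runs and summarize the wall-clock durations in milliseconds.
// The default counts are assumptions; tune them per benchmark.
function runBenchmark(fn, { warmup = 5, trials = 30 } = {}) {
  for (let i = 0; i < warmup; i++) fn();
  const timesMs = [];
  for (let i = 0; i < trials; i++) {
    const start = performance.now();
    fn();
    timesMs.push(performance.now() - start);
  }
  return summarize(timesMs);
}
```

A call such as `runBenchmark(() => work(), { warmup: 10, trials: 50 })` would return `{ n, mean, stddev }`, which can be reported as mean ± stddev. The sample (n - 1) standard deviation is used because the trials are a sample of possible runs, not the whole population.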