@dmitry123
Member

No description provided.

bfdays and others added 24 commits October 6, 2025 17:54
…port stack values; perf graphs (flamegraph) to measure performance; generic impl for reusable pools (for ValueStack and CallStack); improved hashmap perf by changing the hash builder to a faster one; rewrote RwasmStore.(tables|global_variables) onto Vec for performance; unix-specific memory pool optimization (#53)

* feat(segment-caching): simple BitVecInlined wrapper for BitVec

* feat(segment-caching): fix std error

* feat(segment-caching): fix bug, extend unit-test

* feat(segment-caching): implemented stack-based BitVec replacement (BitVecInlined) and integrated it into RwasmStore; measured performance of Fibonacci app execution on rwasm with perf and visualized it with flamegraph; implemented benchmarks for BitVecInlined and Fibonacci app execution using

* feat(segment-caching): renamings for clarity

* feat(segment-caching): added load performance graph

* feat(segment-caching): fix std problem

* feat(segment-caching): updated flamegraphs

* feat(rwasm): implemented and integrated reusable pool for ValueStack

* feat(rwasm): implemented generic reusable pool and integrated for ValueStack and CallStack

* feat(rwasm): cleanup

* feat(rwasm): cleanup

* feat(rwasm): move reusable pools from RwasmStore to ExecutionEngineInner

* feat(rwasm): fix no-std issues + cleanup

* feat(rwasm): optimise performance of ValueStackPtr by making it repr(transparent)

* feat(rwasm): improve hashmap perf by replacing the hash builder with fnv::FnvBuildHasher; do not create elements with capacity in TableEntity for performance

* feat(rwasm): rewrote RwasmStore.tables onto Vec instead of hashmap for performance

* feat(rwasm): fix no-std issue

* feat(rwasm): rewrote RwasmStore.global_variables onto Vec instead of hashmap for performance

* feat(rwasm): cleanup

* feat(rwasm): optimised work with store.global_variables and store.tables for performance; added benches for Hashmap and Vec to compare perf

* feat(rwasm): fix bench

* feat(rwasm): perf fix

* feat(rwasm): cleanup

* feat(rwasm): added make test

* feat(rwasm): uncommented bitvec_inlined; cleanup

* feat(rwasm): fixes

* feat(rwasm): cover bitvec_inlined with feature

* feat(rwasm): fixed bug for RwasmStore.global_variables processing

* feat(rwasm): fixes to benches

* feat(rwasm): cleanup

* feat(rwasm): cleanup

* chore: add evm benchmarks

* feat(rwasm): removed 1 flame graph

* chore: add benchmarks for fib32/fib64

* feat(rwasm): added perf graph for bench_evm

* chore: fix running fib32 bench

* feat(rwasm): added perf graph for fib64

* feat(rwasm): added perf for fib64

* feat(rwasm): perf graph for fib64

* feat(rwasm): fix perf graph

* feat(rwasm): inline for some new alu methods; fix bench collision

* fix: fix running tests, optimized reusable stacks, put test-related functionality from value stack under a flag

* feat(rwasm): fix bench group name

* chore: tiny fixes

* fix: add fib256 benches, fix missing virtual stack alloc for benches, fix project compilation in tracer mode, fix typo with global memory init with max possible capacity, fixed module serialization, add evm machine executor, add trace extractor for fib32, fib64, fib64 tests

* feat(rwasm): global memory implementation using unix low level optimisations

* feat(rwasm): renamings

* feat(rwasm): fix feature name

* feat(rwasm): file renaming

* feat(rwasm): added fib32 test as regular test for trials

---------

Co-authored-by: Dmitry Savonin <[email protected]>
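
For readers skimming the commit list above, here is a minimal sketch of what a generic reusable pool for stack-like buffers could look like. It is not the code from this PR: `ReusablePool`, `acquire`, `release`, and `max_retained` are hypothetical names, and the value stack is modelled as a plain `Vec<u64>` for illustration only.

```rust
/// Hypothetical generic pool: hands out reusable items and takes them back.
/// Names and shape are assumptions, not the PR's actual API.
pub struct ReusablePool<T> {
    free: Vec<T>,
    max_retained: usize,
}

impl<T> ReusablePool<T> {
    pub fn new(max_retained: usize) -> Self {
        Self { free: Vec::new(), max_retained }
    }

    /// Reuse a retained item if one exists, otherwise build a fresh one.
    pub fn acquire(&mut self, make: impl FnOnce() -> T) -> T {
        self.free.pop().unwrap_or_else(make)
    }

    /// Reset the item and keep it for later, unless the pool is already full
    /// (in which case the item is simply dropped).
    pub fn release(&mut self, mut item: T, reset: impl FnOnce(&mut T)) {
        if self.free.len() < self.max_retained {
            reset(&mut item);
            self.free.push(item);
        }
    }

    /// Number of items currently waiting to be reused.
    pub fn retained(&self) -> usize {
        self.free.len()
    }
}

/// Acquire a stack, use it, release it, then acquire again.
/// Returns the length and capacity of the second acquisition.
pub fn demo_pool_reuse() -> (usize, usize) {
    let mut pool: ReusablePool<Vec<u64>> = ReusablePool::new(2);
    let mut stack = pool.acquire(|| Vec::with_capacity(1024));
    stack.extend([1, 2, 3]);
    // Clearing keeps the allocation, so the next user skips the allocator.
    pool.release(stack, |s| s.clear());
    let reused = pool.acquire(Vec::new);
    (reused.len(), reused.capacity())
}
```

In this sketch the second `acquire` returns the same allocation, emptied but with its capacity intact, which is the saving a pool of this kind is after. The same type can hold call stacks or any other buffer with a cheap reset.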
…58)

* feat(rwasm-optimisation): global memory reusability

* feat(rwasm-optimisation): recycling global memory v1 (RwasmStore)

* feat(rwasm-optimisation): warmup for recycling global memory

* feat(rwasm-optimisation): small fixes

* feat(rwasm-optimisation): support fallback to BytesMut-based GlobalMemory if the memory pool is exhausted

* feat(rwasm-optimisation): uncomment benches

* feat(rwasm-optimisation): renamings+ordering; fix problem with fallback strategy

* feat(rwasm-optimisation): improved reusable pool; forked setjmp crate and removed check

* fix(pooling-memory): enabled pooling memory by default, remove guard memory pages

* bench: add new benches for module parsing and compilation

---------

Co-authored-by: Dmitry Savonin <[email protected]>
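
The fallback strategy mentioned in the commits above (use pooled memory while slots are free, fall back to a heap-backed buffer once the pool is exhausted) can be sketched roughly as follows. This is an illustration under stated assumptions: `MemoryPool`, `Memory`, and their methods are hypothetical names, and a `Vec<u8>` stands in for both the mmap-backed slot and the BytesMut-based fallback so the example needs only the standard library.

```rust
/// Hypothetical memory handle: a slot borrowed from a fixed pool, or a
/// one-off heap buffer used when no slot is free.
#[derive(Debug)]
pub enum Memory {
    Pooled { slot: usize, bytes: Vec<u8> },
    Heap(Vec<u8>),
}

/// Hypothetical fixed-size pool; `None` marks a slot that is handed out.
pub struct MemoryPool {
    slots: Vec<Option<Vec<u8>>>,
    slot_size: usize,
}

impl MemoryPool {
    pub fn new(slot_count: usize, slot_size: usize) -> Self {
        let slots = (0..slot_count).map(|_| Some(vec![0u8; slot_size])).collect();
        Self { slots, slot_size }
    }

    /// Take the first free slot; if none is free, allocate on the heap.
    pub fn allocate(&mut self) -> Memory {
        for (slot, entry) in self.slots.iter_mut().enumerate() {
            if let Some(bytes) = entry.take() {
                return Memory::Pooled { slot, bytes };
            }
        }
        Memory::Heap(vec![0u8; self.slot_size])
    }

    /// Zero a pooled buffer and return it to its slot; heap buffers are dropped.
    pub fn recycle(&mut self, memory: Memory) {
        if let Memory::Pooled { slot, mut bytes } = memory {
            bytes.fill(0);
            self.slots[slot] = Some(bytes);
        }
    }
}

/// With a single slot: the first allocation is pooled, the second falls back
/// to the heap, and after recycling the slot is handed out again.
pub fn demo_fallback() -> (bool, bool, bool) {
    let mut pool = MemoryPool::new(1, 64);
    let first = pool.allocate();
    let second = pool.allocate();
    let first_pooled = matches!(first, Memory::Pooled { .. });
    let second_heap = matches!(second, Memory::Heap(_));
    pool.recycle(first);
    pool.recycle(second);
    let third_pooled = matches!(pool.allocate(), Memory::Pooled { .. });
    (first_pooled, second_heap, third_pooled)
}
```

The point of the enum is that callers never see an allocation failure when the pool runs dry; they only lose the reuse benefit for that one instance.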
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](actions/checkout@v4...v5)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [dawidd6/action-download-artifact](https://github.com/dawidd6/action-download-artifact) from 3 to 11.
- [Release notes](https://github.com/dawidd6/action-download-artifact/releases)
- [Commits](dawidd6/action-download-artifact@v3...v11)

---
updated-dependencies:
- dependency-name: dawidd6/action-download-artifact
  dependency-version: '11'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…ry only for unix; rewrote some benches using iter_batched strategy (#60)

* feat(rwasm-optimisation): turn on pooling-allocator; enable unix-memory only for unix

* feat(rwasm-optimisation): reuse iter_batched for all possible benches

* feat(rwasm-optimisation): perf graph for fib32 on wasmtime

* feat(rwasm-optimisation): rollback iter_batched to iter for bench_strategy

* chore: tiny dep fix

---------

Co-authored-by: Dmitry Savonin <[email protected]>
* fix: don't store acquired call/value stack inside engine

* feat(vm): implement reusable memories/stack, fix bug with clearing on-demand memory, add engine config with memory allocator config, tiny engine refactoring
* fix: default config for shared engine

* chore: warning fixes
- Update all workspace members to reference fluent-rwasm via package alias
- Update Makefile test commands to use new package name
- Add trailing newlines to Cargo.toml files for consistency
chore: rename to fluent-rwasm and add crates.io publishing
@codecov

codecov bot commented Nov 27, 2025

Codecov Report

❌ Patch coverage is 80.35427% with 122 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/vm/memory/simple.rs | 0.00% | 61 Missing ⚠️ |
| src/types/bitvec_inlined.rs | 81.86% | 33 Missing ⚠️ |
| src/vm/memory/mmap.rs | 91.39% | 13 Missing ⚠️ |
| src/vm/value_stack.rs | 50.00% | 7 Missing ⚠️ |
| src/vm/store.rs | 63.63% | 4 Missing ⚠️ |
| src/vm/engine.rs | 97.18% | 2 Missing ⚠️ |
| src/vm/engine/memories.rs | 95.34% | 2 Missing ⚠️ |


@github-actions

Criterion results (vs baseline)


running 2 tests
test tests::fib32_rwasm_test ... ignored
test tests::fib32_wasmtime_test ... ignored

test result: ok. 0 passed; 0 failed; 2 ignored; 0 measured; 0 filtered out; finished in 0.00s

Comparisons/bitvec      time:   [18.315 µs 18.323 µs 18.332 µs]
Found 22 outliers among 200 measurements (11.00%)
  2 (1.00%) low severe
  4 (2.00%) low mild
  12 (6.00%) high mild
  4 (2.00%) high severe
Comparisons/bitvec_inlined
                        time:   [3.4245 µs 3.4254 µs 3.4263 µs]
Found 31 outliers among 200 measurements (15.50%)
  8 (4.00%) low severe
  3 (1.50%) low mild
  8 (4.00%) high mild
  12 (6.00%) high severe
Comparisons/bitvec_inlined (half of inline store)
                        time:   [1.8640 µs 1.8642 µs 1.8645 µs]
Found 27 outliers among 200 measurements (13.50%)
  9 (4.50%) low severe
  3 (1.50%) low mild
  7 (3.50%) high mild
  8 (4.00%) high severe
RwasmModule {
 .function_begin_0 (#0)
  0000: StackCheck(5)
  0001: I32Const(1)
  0002: MemoryGrow
  0003: Drop
  0004: I32Const(0)
  0005: I32Const(0)
  0006: I32Const(80)
  0007: MemoryInit(0)
  0008: DataDrop(1)
  0009: ReturnCallInternal(10)
 .function_end

 .function_begin_10 (#1)
  0010: SignatureCheck(0)
  0011: ConsumeFuel(4)
  0012: StackCheck(4)
  0013: LocalGet(1)
  0014: LocalGet(1)
  0015: I32Load(4)
  0016: LocalGet(2)
  0017: I32Load(0)
  0018: LocalSet(2)
  0019: LocalGet(2)
  0020: LocalSet(3)
  0021: LocalGet(1)
  0022: LocalSet(2)
  0023: Drop
  0024: Return
 .function_end

 .ro_data: [61, 62, 63, 64, 65, 66, 67, 68, 69, 6a, 6b, 6c, 6d, 6e, 6f, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 7a, 61, 62, 63, 64, 65, 66, 67, 68, 69, 6a, 6b, 6c, 6d, 6e, 6f, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 7a, 61, 62, 63, 64, 65, 66, 67, 68, 69, 6a, 6b, 6c, 6d, 6e, 6f, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 7a, 61, 62],
 .ro_elem: [],
}

Comparisons/bitvec_inlined (through ExecutionEngine)
                        time:   [3.3689 ms 3.3697 ms 3.3706 ms]
Found 17 outliers among 200 measurements (8.50%)
  10 (5.00%) low mild
  5 (2.50%) high mild
  2 (1.00%) high severe

Comparisons Module Compilation/bench_evm
                        time:   [346.33 ns 346.62 ns 346.95 ns]
Found 9 outliers among 200 measurements (4.50%)
  1 (0.50%) low mild
  5 (2.50%) high mild
  3 (1.50%) high severe
Comparisons Module Compilation/bench_wasmtime
                        time:   [4.7918 ms 4.8189 ms 4.8537 ms]
Found 8 outliers among 200 measurements (4.00%)
  8 (4.00%) high severe
Comparisons Module Compilation/bench_wasmi
                        time:   [19.282 µs 19.297 µs 19.315 µs]
Found 19 outliers among 200 measurements (9.50%)
  1 (0.50%) low mild
  7 (3.50%) high mild
  11 (5.50%) high severe
Comparisons Module Compilation/bench_rwasm
                        time:   [26.406 µs 26.421 µs 26.439 µs]
Found 28 outliers among 200 measurements (14.00%)
  6 (3.00%) low severe
  4 (2.00%) low mild
  4 (2.00%) high mild
  14 (7.00%) high severe

Comparisons Fib256/bench_native
                        time:   [41.294 ns 41.301 ns 41.309 ns]
Found 32 outliers among 200 measurements (16.00%)
  14 (7.00%) low severe
  4 (2.00%) low mild
  4 (2.00%) high mild
  10 (5.00%) high severe
Comparisons Fib256/bench_evm
                        time:   [23.560 µs 23.597 µs 23.635 µs]
Found 2 outliers among 200 measurements (1.00%)
  2 (1.00%) high mild
Comparisons Fib256/bench_wasmtime
                        time:   [3.3238 µs 3.3265 µs 3.3292 µs]
Found 23 outliers among 200 measurements (11.50%)
  8 (4.00%) low severe
  2 (1.00%) low mild
  7 (3.50%) high mild
  6 (3.00%) high severe
Comparisons Fib256/bench_wasmi
                        time:   [6.5637 µs 6.5647 µs 6.5659 µs]
Found 15 outliers among 200 measurements (7.50%)
  5 (2.50%) low mild
  5 (2.50%) high mild
  5 (2.50%) high severe
Comparisons Fib256/bench_rwasm
                        time:   [5.2846 µs 5.2862 µs 5.2878 µs]
Found 17 outliers among 200 measurements (8.50%)
  4 (2.00%) low severe
  4 (2.00%) low mild
  4 (2.00%) high mild
  5 (2.50%) high severe

Comparisons fib32/bench_native
                        time:   [6.8837 ns 6.8855 ns 6.8874 ns]
Found 20 outliers among 200 measurements (10.00%)
  1 (0.50%) low severe
  8 (4.00%) low mild
  7 (3.50%) high mild
  4 (2.00%) high severe
Comparisons fib32/bench_evm
                        time:   [24.684 µs 24.700 µs 24.716 µs]
Found 18 outliers among 200 measurements (9.00%)
  2 (1.00%) low mild
  10 (5.00%) high mild
  6 (3.00%) high severe
Comparisons fib32/bench_wasmtime
                        time:   [3.2753 µs 3.2782 µs 3.2812 µs]
Found 24 outliers among 200 measurements (12.00%)
  8 (4.00%) low severe
  5 (2.50%) low mild
  4 (2.00%) high mild
  7 (3.50%) high severe
Comparisons fib32/bench_wasmi
                        time:   [2.0604 µs 2.0609 µs 2.0614 µs]
Found 18 outliers among 200 measurements (9.00%)
  5 (2.50%) low severe
  3 (1.50%) low mild
  4 (2.00%) high mild
  6 (3.00%) high severe
Comparisons fib32/bench_rwasm
                        time:   [1.4618 µs 1.4625 µs 1.4633 µs]
Found 21 outliers among 200 measurements (10.50%)
  2 (1.00%) low severe
  2 (1.00%) low mild
  15 (7.50%) high mild
  2 (1.00%) high severe

Comparisons fib64/bench_native
                        time:   [6.9738 ns 6.9753 ns 6.9768 ns]
Found 22 outliers among 200 measurements (11.00%)
  8 (4.00%) low severe
  3 (1.50%) low mild
  8 (4.00%) high mild
  3 (1.50%) high severe
Comparisons fib64/bench_evm
                        time:   [22.944 µs 22.961 µs 22.978 µs]
Found 5 outliers among 200 measurements (2.50%)
  5 (2.50%) high mild
Comparisons fib64/bench_wasmtime
                        time:   [3.2674 µs 3.2703 µs 3.2732 µs]
Found 27 outliers among 200 measurements (13.50%)
  8 (4.00%) low severe
  4 (2.00%) low mild
  7 (3.50%) high mild
  8 (4.00%) high severe
Comparisons fib64/bench_wasmi
                        time:   [2.0991 µs 2.0995 µs 2.1000 µs]
Found 20 outliers among 200 measurements (10.00%)
  4 (2.00%) low severe
  1 (0.50%) low mild
  10 (5.00%) high mild
  5 (2.50%) high severe
Comparisons fib64/bench_rwasm
                        time:   [4.3704 µs 4.3717 µs 4.3732 µs]
Found 19 outliers among 200 measurements (9.50%)
  3 (1.50%) low severe
  6 (3.00%) low mild
  2 (1.00%) high mild
  8 (4.00%) high severe

Comparisons Module Parsing/bench_evm
                        time:   [310.69 ps 310.80 ps 310.94 ps]
Found 26 outliers among 200 measurements (13.00%)
  9 (4.50%) low severe
  1 (0.50%) low mild
  8 (4.00%) high mild

Heads-up: runner perf is noisy; treat deltas as a smoke check.
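
To read the numbers above as relative figures, one can divide the midpoint estimates reported by Criterion. The helper below is a small sketch of that arithmetic. `speedup` and `reported_ratios` are illustrative names, the inputs are the middle values copied from the output above (in µs), and given the noise caveat the ratios are indicative only.

```rust
/// Ratio of two midpoint estimates expressed in the same unit.
/// A value above 1.0 means the candidate is faster than the baseline.
pub fn speedup(baseline: f64, candidate: f64) -> f64 {
    baseline / candidate
}

/// Ratios derived from the midpoints reported in this run (all in µs).
pub fn reported_ratios() -> [(&'static str, f64); 3] {
    [
        // bitvec: 18.323 µs, bitvec_inlined: 3.4254 µs
        ("bitvec -> bitvec_inlined", speedup(18.323, 3.4254)),
        // fib32: wasmi 2.0609 µs, rwasm 1.4625 µs
        ("fib32: wasmi -> rwasm", speedup(2.0609, 1.4625)),
        // fib64: wasmi 2.0995 µs, rwasm 4.3717 µs
        ("fib64: wasmi -> rwasm", speedup(2.0995, 4.3717)),
    ]
}
```

On these midpoints, bitvec_inlined comes out a little over 5x faster than bitvec, rwasm is about 1.4x faster than wasmi on fib32, and rwasm takes roughly twice as long as wasmi on fib64.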

5 participants