Skip to content

feat(wasix): Improve module loading performance and API correctness #5526

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

theduke
Copy link
Contributor

@theduke theduke commented Apr 23, 2025

This commit contains two notable changes:

  • The VirtualFile trait is extended with a new as_owned_buffer() method.
    This allows cheap copying of file data if the file is already fully in
    memory, and enables optimisations for code that is aware of this
    functionality by allowing zero-copy cloning.
    This is especially useful for data that is already mmaped.

    The new method has an empty default impl, but is implemented for
    appropriate files (ReadOnlyFile, webc volume files).

  • Modifies the Runtime::load_module method to take a new
    HashedModuleData struct

    This has multiple benefits:

    • HashedModuleData is safe by construction, thanks to private fields,
      so the hash it contains will always be valid for the module data.
      This removes a source of bugs due to passing a wrong hash for module
      data to module loaders etc.
    • HashedModuleData can be cheaply creating from a
      BinaryPackageCommand, preventing redundant cloning of data
    • The load_module method now receives owned data that is cheaply
      clonable.
      Previously it received a slice, which introduces the neccessity of
      cloning the data if it needs to be passed on to a different thread.

In combination this PR can significantly reduce memory usage for
high-concurrency scenarios.

Closes #5527

@theduke theduke requested a review from Arshia001 April 23, 2025 17:13
Copy link

promptless bot commented Apr 23, 2025

✅ No documentation updates required.

@theduke theduke force-pushed the optimise-wasm-compilation branch 2 times, most recently from 56c99ba to 7831689 Compare July 1, 2025 20:54
This commit contains two notable changes:

* The VirtualFile trait is extended with a new as_owned_buffer() method.
  This allows cheap copying of file data if the file is already fully in
  memory, and enables optimisations for code that is aware of this
  functionality by allowing zero-copy cloning.
  This is especially useful for data that is already mmaped.

  The new method has an empty default impl, but is implemented for
  appropriate files (ReadOnlyFile, webc volume files).

* Modifies the Runtime::load_module method to take a new
  `HashedModuleData` struct

  This has multiple benefits:
  - HashedModuleData is safe by construction, thanks to private fields,
    so the hash it contains will always be valid for the module data.
    This removes a source of bugs due to passing a wrong hash for module
    data to module loaders etc.
  - HashedModuleData can be cheaply creating from a
    BinaryPackageCommand, preventing redundant cloning of data
  - The load_module method now receives owned data that is cheaply
    clonable.
    Previously it received a slice, which introduces the neccessity of
    cloning the data if it needs to be passed on to a different thread.

In combination this PR can significantly reduce memory usage for
high-concurrency scenarios.
@theduke theduke force-pushed the optimise-wasm-compilation branch from 7831689 to 901db2b Compare July 2, 2025 08:52
@theduke theduke requested a review from Arshia001 July 2, 2025 08:52
Comment on lines +46 to +47
let data = wasmer_wasix::runtime::module_cache::HashedModuleData::new_sha256(wasm_bytes);
let module = runtime.load_module_sync(data)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not break the API like this.

We can simply create one function load_module_sync_from_hash and let load_module_sync call HashedModuleData::new_sha256 and then load_module_sync_from_hash


Box::pin(task)
}

/// Load a a Webassembly module, trying to use a pre-compiled version if possible.
///
/// Non-async version of [`Self::load_module`].
fn load_module_sync(&self, wasm: &[u8]) -> Result<Module, SpawnError> {
fn load_module_sync(&self, wasm: HashedModuleData) -> Result<Module, SpawnError> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create new method, don't break this one

Copy link
Contributor Author

@theduke theduke Jul 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no real point in keeping API stability in WASIX, we often make breaking changes when it makes sense.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is for the basic usage of external API.
SWC is using this API as well, we are leaking internals of how things are implemented rather than having the API designed by "how things should be used".

Fixing this is a quite easy thing, so I don't see the point of breaking API when we can simply choose to not do it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

WASIX: Improve module loading API correctness and performance
3 participants