CosteGieF/ort-cloudflare-workers

onnxruntime-web in Cloudflare Workers

Run ONNX models in Cloudflare Workers with onnxruntime-web by working around two fundamental workerd limitations.

The Problem

onnxruntime-web fails silently in Cloudflare Workers with no error output. Two root causes:

1. WebAssembly.compile() is blocked

workerd (the Cloudflare Workers runtime) disallows runtime code generation, so WebAssembly.compile() and WebAssembly.instantiate() both fail when they are given raw bytes:

Error: Wasm code generation disallowed by embedder

ORT's WASM loading path calls WebAssembly.compile() internally, so it always fails in workerd.
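One way to confirm the restriction in a given runtime is to try compiling the smallest valid module from raw bytes. This is a minimal sketch: `canCompileAtRuntime` and `EMPTY_MODULE` are hypothetical names, not part of ORT or workerd. In Node or a browser the probe should resolve to true; in workerd, going by the error above, the compile call is expected to reject and the probe to resolve to false.

```typescript
// Smallest valid WebAssembly binary: the "\0asm" magic followed by version 1.
const EMPTY_MODULE = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

// Hypothetical helper: resolves to true when the host allows compiling
// WebAssembly from raw bytes at runtime, false when the embedder blocks it.
async function canCompileAtRuntime(): Promise<boolean> {
  try {
    await WebAssembly.compile(EMPTY_MODULE);
    return true;
  } catch {
    return false;
  }
}
```

Logging the result once at startup makes the failure visible instead of silent.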

2. import.meta.url is empty

ORT uses new URL("ort-wasm-simd-threaded.wasm", import.meta.url) to find the .wasm file.
In workerd, import.meta.url is "", so ORT throws:

Error: Failed to init WASM binary: cannot determine the script source URL.

Setting ort.env.wasm.wasmPaths avoids this error but doesn't fix the WebAssembly.compile() block.
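The URL failure can be reproduced without ORT. The sketch below mirrors the `new URL(file, base)` pattern quoted above; `resolveWasmUrl` is a hypothetical helper and `example.com` a placeholder. It only illustrates why an empty base cannot be resolved. The exact error text ORT reports comes from its own code.

```typescript
// Hypothetical helper mirroring the `new URL(file, base)` lookup.
// Returns the resolved URL, or null when `base` cannot serve as a base URL.
function resolveWasmUrl(base: string, file = "ort-wasm-simd-threaded.wasm"): string | null {
  try {
    return new URL(file, base).href;
  } catch {
    // An empty base makes the URL constructor throw a TypeError.
    return null;
  }
}

resolveWasmUrl("");                                  // null
resolveWasmUrl("https://example.com/dist/index.js"); // "https://example.com/dist/ort-wasm-simd-threaded.wasm"
```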

The Solution

Cloudflare Workers has a [[rules]] system that pre-compiles .wasm files at deploy time rather than at runtime. Importing a .wasm file gives you a WebAssembly.Module that is already compiled, so no runtime WebAssembly.compile() call is needed.

ORT's Emscripten-compiled factory checks for config.instantiateWasm before calling WebAssembly.compile(). Set this callback to use the pre-compiled module:

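config.instantiateWasm = (imports, cb) => {
  // Instantiate synchronously from the module compiled at deploy time.
  const inst = new WebAssembly.Instance(preCompiledModule, imports);
  // Hand the instance and module back to the Emscripten runtime.
  cb(inst, preCompiledModule);
  // Emscripten expects the exports as the return value for synchronous instantiation.
  return inst.exports;
};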

This is injected via a post-build patch in build.mjs.
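To see the callback contract in isolation, the sketch below exercises it outside ORT. It assumes a Node or browser environment, where a tiny hand-assembled module exporting `add` can be compiled synchronously. That module stands in for the one a Workers `.wasm` import would provide. `ADD_WASM` and `makeInstantiateWasm` are illustrative names, not ORT or Emscripten APIs.

```typescript
// Hand-assembled module exporting add(a: i32, b: i32): i32.
const ADD_WASM = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,       // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                                     // one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,       // export "add"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // body: a + b
]);

// Stand-in for `import mod from "./file.wasm"` in a Worker. This synchronous
// compile works in Node but would itself be blocked in workerd.
const preCompiledModule = new WebAssembly.Module(ADD_WASM);

type ReceiveInstance = (instance: WebAssembly.Instance, module: WebAssembly.Module) => void;

// Builds a callback with the same shape as the instantiateWasm hook above.
function makeInstantiateWasm(module: WebAssembly.Module) {
  return (imports: WebAssembly.Imports, receiveInstance: ReceiveInstance): WebAssembly.Exports => {
    const instance = new WebAssembly.Instance(module, imports); // no compile step
    receiveInstance(instance, module);
    return instance.exports;
  };
}

const wasmExports = makeInstantiateWasm(preCompiledModule)({}, () => {});
const add = wasmExports.add as (a: number, b: number) => number;
add(2, 3); // 5
```

The only step that needs bytes is creating the WebAssembly.Module. Once a module exists, instantiation involves no code generation, which is why the hook avoids the restriction.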

How It Works

wrangler.toml
  [[rules]] type = "CompiledWasm" globs = ["**/*.wasm"]  ← .wasm → WebAssembly.Module
  [[rules]] type = "Data"         globs = ["**/*.onnx"]  ← .onnx → ArrayBuffer

build.mjs (esbuild + 3 patches):
  Patch 1: inject preamble
    import __ORT_WASM__ from "./ort-wasm-simd-threaded.wasm"   ← WebAssembly.Module
    import __MODEL_BYTES__ from "./model.onnx"                  ← ArrayBuffer

  Patch 2: inject instantiateWasm on Emscripten config
    Find: let CONFIG = { numThreads: N }
          ...
          FACTORY(CONFIG).then(
    Inject: CONFIG.instantiateWasm = (imports, cb) => {
              var inst = new WebAssembly.Instance(__ORT_WASM__, imports);
              cb(inst, __ORT_WASM__); return inst.exports;
            };

  Patch 3: kill dynamic import()
    workerd rejects dynamic import(variable) at module *analysis* time (script won't load).
    Replace: await import(variable) → await Promise.reject(...)
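
Patch 3 can be sketched as a plain string rewrite over the bundled output. This is an illustration under assumptions, not the actual contents of build.mjs: it assumes the bundler emits the call as `await import(<identifier>)`, and `killDynamicImport` is a hypothetical name. Imports of string literals are left untouched.

```typescript
// Matches `await import(someIdentifier)` but not `import("./literal.js")`.
const DYNAMIC_IMPORT = /await\s+import\(\s*[A-Za-z_$][\w$]*\s*\)/g;

// Hypothetical post-build step: replace each variable-based dynamic import
// with a rejected promise so the script no longer contains the construct.
function killDynamicImport(source: string): string {
  return source.replace(
    DYNAMIC_IMPORT,
    'await Promise.reject(new Error("dynamic import() is unavailable in workerd"))',
  );
}
```

A regex keeps the build script small, but it depends on the exact shape of the bundler output, so it is worth re-checking after upgrading onnxruntime-web or esbuild.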

Performance

Measured with Silero VAD (2.3 MB ONNX model, 12 MB WASM):

| Metric | Value |
| --- | --- |
| Session create (cold start) | ~424 ms |
| First inference | ~36 ms |
| Average inference | ~0.7 ms |
| Bundle size (gzip) | 5.26 MB |

The WASM module is compiled once per isolate instantiation. Re-use is instant.

Size Budget

| Asset | Raw | Gzip |
| --- | --- | --- |
| ort-wasm-simd-threaded.wasm | 12.3 MB | 2.98 MB |
| model.onnx (Silero VAD) | 2.3 MB | 1.85 MB |
| index.js (ORT + app) | 2.2 MB | 0.43 MB |
| Total | 16.8 MB | 5.26 MB |

Cloudflare Workers limit: 10 MB after gzip compression (paid plan). ✓
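
When swapping in a different model, the same arithmetic can be rerun as a quick check. The sketch below uses the gzip sizes from the table and the 10 MB limit quoted above; `Asset`, `budget`, and `GZIP_LIMIT_MB` are illustrative names.

```typescript
interface Asset {
  name: string;
  gzipMB: number;
}

const GZIP_LIMIT_MB = 10; // limit quoted in this README

// Gzip sizes copied from the Size Budget table.
const assets: Asset[] = [
  { name: "ort-wasm-simd-threaded.wasm", gzipMB: 2.98 },
  { name: "model.onnx", gzipMB: 1.85 },
  { name: "index.js", gzipMB: 0.43 },
];

// Sums the compressed sizes and reports the remaining headroom.
function budget(items: Asset[], limitMB: number) {
  const totalMB = items.reduce((sum, a) => sum + a.gzipMB, 0);
  return { totalMB, headroomMB: limitMB - totalMB, fits: totalMB <= limitMB };
}

const report = budget(assets, GZIP_LIMIT_MB);
console.log(`${report.totalMB.toFixed(2)} MB used, fits: ${report.fits}`);
```

With these figures, roughly 4.7 MB of compressed headroom remains for a larger model.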

Usage

npm install
# Add your model.onnx to the project root
node build.mjs
wrangler deploy .worker/index.js --no-bundle

Files

| File | Purpose |
| --- | --- |
| build.mjs | esbuild + 3 post-process patches |
| wrangler.toml | [[rules]] for .wasm and .onnx |
| src/index.ts | Worker using ORT |

Compatibility

Tested with:

  • onnxruntime-web 1.24.1
  • wrangler 4.x
  • compatibility_date = "2026-01-01"

Why not ort.env.wasm.wasmBinary?

With wasmBinary set, ORT still calls WebAssembly.compile(bytes) internally, which produces the same error.
The instantiateWasm callback is the only escape hatch that bypasses compilation entirely.
