Running ONNX models in Cloudflare Workers with onnxruntime-web — works around two fundamental workerd limitations.
onnxruntime-web fails silently in Cloudflare Workers with no error output. Two root causes:
workerd (the Cloudflare Workers runtime) blocks WebAssembly.compile() and WebAssembly.instantiate() with dynamic bytes:
Error: Wasm code generation disallowed by embedder
ORT's WASM loading path calls WebAssembly.compile() internally — this always fails.
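One way to confirm this restriction in a given runtime is to try compiling the smallest valid module from raw bytes. The helper below is a hypothetical probe (`canCompileWasmFromBytes` is not part of ORT or workerd). It should return `true` in Node.js or a browser; given the error above, it is expected to return `false` inside workerd.

```javascript
// Hypothetical probe: can this runtime compile WebAssembly from raw bytes?
// The 8 bytes are the "\0asm" magic number plus version 1: an empty but valid module.
const EMPTY_WASM = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

function canCompileWasmFromBytes() {
  try {
    // Synchronous compile from bytes; a runtime that disallows
    // code generation is expected to throw here.
    new WebAssembly.Module(EMPTY_WASM);
    return true;
  } catch (err) {
    return false;
  }
}
```

Logging the result once at startup makes it obvious whether a failure comes from this restriction or from something else.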
ORT uses new URL("ort-wasm-simd-threaded.wasm", import.meta.url) to find the .wasm file.
In workerd, import.meta.url is "", so ORT throws:
Error: Failed to init WASM binary: cannot determine the script source URL.
Setting ort.env.wasm.wasmPaths avoids this error but doesn't fix the WebAssembly.compile() block.
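The URL failure can be reproduced in isolation. The sketch below only mirrors the `new URL(file, import.meta.url)` pattern described above; `resolveWasmUrl` is a hypothetical helper, not ORT code, and ORT's real lookup logic may differ.

```javascript
// Illustrative only: resolve the .wasm file relative to the script URL,
// the way new URL("ort-wasm-simd-threaded.wasm", import.meta.url) would.
function resolveWasmUrl(scriptUrl) {
  try {
    return new URL("ort-wasm-simd-threaded.wasm", scriptUrl).href;
  } catch (err) {
    // new URL() throws a TypeError when a relative path has no valid base,
    // which is the situation when the script URL is an empty string.
    return null;
  }
}
```

With a real base such as `https://example.com/dist/ort.mjs` this resolves to `https://example.com/dist/ort-wasm-simd-threaded.wasm`; with `""` it returns `null`.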
Cloudflare Workers has a [[rules]] system that pre-compiles .wasm files at deploy time (not runtime). Importing a .wasm file gives you a WebAssembly.Module — already compiled, no runtime WebAssembly.compile() call needed.
ORT's Emscripten-compiled factory checks for config.instantiateWasm before calling WebAssembly.compile(). Set this callback to use the pre-compiled module:
config.instantiateWasm = (imports, cb) => {
  const inst = new WebAssembly.Instance(preCompiledModule, imports);
  cb(inst, preCompiledModule);
  return inst.exports;
};

This is injected via a post-build patch in build.mjs.
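The callback contract can be tried outside workerd with a tiny hand-assembled module. In the sketch below, `ADD_WASM` stands in for ort-wasm-simd-threaded.wasm and `fakeFactory` is a hypothetical stand-in for the Emscripten factory; neither comes from ORT. Compiling from bytes is used here only because the example runs in Node.js. In a Worker the module would come from the `.wasm` import instead.

```javascript
// Minimal module exporting add(a, b) -> a + b (i32).
const ADD_WASM = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,       // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                                     // one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,       // export "add"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // body: a + b
]);

// Stand-in for the pre-compiled WebAssembly.Module that a .wasm import provides.
const preCompiledModule = new WebAssembly.Module(ADD_WASM);

// Hypothetical factory: calls config.instantiateWasm instead of compiling bytes.
function fakeFactory(config) {
  if (typeof config.instantiateWasm !== "function") {
    throw new Error("this sketch only covers the instantiateWasm path");
  }
  let received = null;
  const receiveInstance = (instance, mod) => {
    received = { instance, mod };
  };
  const wasmExports = config.instantiateWasm({}, receiveInstance);
  return { wasmExports, received };
}

const config = {
  instantiateWasm(imports, cb) {
    // Synchronous instantiation from an already-compiled module: no compile step.
    const inst = new WebAssembly.Instance(preCompiledModule, imports);
    cb(inst, preCompiledModule);
    return inst.exports;
  },
};

const result = fakeFactory(config);
```

Here `result.wasmExports.add(2, 3)` evaluates to 5, and the factory receives the same module object it was handed, so nothing is compiled a second time.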
wrangler.toml
[[rules]] type = "CompiledWasm" globs = ["**/*.wasm"] ← .wasm → WebAssembly.Module
[[rules]] type = "Data" globs = ["**/*.onnx"] ← .onnx → ArrayBuffer
build.mjs (esbuild + 3 patches):
Patch 1: inject preamble
import __ORT_WASM__ from "./ort-wasm-simd-threaded.wasm" ← WebAssembly.Module
import __MODEL_BYTES__ from "./model.onnx" ← ArrayBuffer
Patch 2: inject instantiateWasm on Emscripten config
Find: let CONFIG = { numThreads: N }
...
FACTORY(CONFIG).then(
Inject: CONFIG.instantiateWasm = (imports, cb) => {
var inst = new WebAssembly.Instance(__ORT_WASM__, imports);
cb(inst, __ORT_WASM__); return inst.exports;
};
Patch 3: kill dynamic import()
workerd rejects dynamic import(variable) at module *analysis* time (script won't load).
Replace: await import(variable) → await Promise.reject(...)
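The three patches amount to plain string rewriting of the bundled output. The helpers below are a hypothetical sketch of that idea, not the contents of build.mjs: the function names, the matching logic, and the rejection message are all assumptions. In a real bundle the factory and config identifiers are minified and would have to be located first.

```javascript
// Patch 1: prepend static imports that the [[rules]] in wrangler.toml resolve.
function injectPreamble(source) {
  return (
    'import __ORT_WASM__ from "./ort-wasm-simd-threaded.wasm";\n' +
    'import __MODEL_BYTES__ from "./model.onnx";\n' +
    source
  );
}

// Patch 2: insert the hook just before `<factory>(<config>).then(`.
// Assumes that call begins a statement and that the caller already knows
// the factory and config identifiers.
function injectInstantiateWasm(source, factoryName, configName) {
  const call = `${factoryName}(${configName}).then(`;
  const at = source.indexOf(call);
  if (at === -1) throw new Error(`factory call not found: ${call}`);
  const hook =
    `${configName}.instantiateWasm = (imports, cb) => { ` +
    `var inst = new WebAssembly.Instance(__ORT_WASM__, imports); ` +
    `cb(inst, __ORT_WASM__); return inst.exports; };\n`;
  return source.slice(0, at) + hook + source.slice(at);
}

// Patch 3: replace `await import(x)` when x is not a string literal.
// Assumes the argument contains no nested parentheses.
function killDynamicImport(source) {
  return source.replace(
    /await\s+import\(\s*(?!["'`])[^)]*\)/g,
    'await Promise.reject(new Error("dynamic import() removed at build time"))'
  );
}
```

Throwing when the factory call is not found makes the build fail loudly if a new ORT release changes the bundle shape, rather than shipping an unpatched Worker.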
Measured with Silero VAD (2.3 MB ONNX model, 12 MB WASM):
| Metric | Value |
|---|---|
| Session create (cold start) | ~424 ms |
| First inference | ~36 ms |
| Average inference | ~0.7 ms |
| Bundle size (gzip) | 5.26 MB |
The WASM module is compiled once per isolate instantiation. Re-use is instant.
| Asset | Raw | Gzip |
|---|---|---|
| ort-wasm-simd-threaded.wasm | 12.3 MB | 2.98 MB |
| model.onnx (Silero VAD) | 2.3 MB | 1.85 MB |
| index.js (ORT + app) | 2.2 MB | 0.43 MB |
| Total | 16.8 MB | 5.26 MB |
Cloudflare Workers limit: 10 MB (gzip). ✓
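The size check is simple arithmetic and could be automated as a build step. The sketch below hard-codes the gzip figures from the table and the 10 MB limit stated above; `checkBundleBudget` is a hypothetical helper, and a real check would measure the built files rather than use constants.

```javascript
// Gzip sizes in MB, copied from the asset table.
const gzipSizesMb = {
  "ort-wasm-simd-threaded.wasm": 2.98,
  "model.onnx": 1.85,
  "index.js": 0.43,
};

// Limit in MB (gzip), as stated in this README.
const WORKER_LIMIT_MB = 10;

// Sum the asset sizes and compare the total against the limit.
function checkBundleBudget(sizesMb, limitMb) {
  const totalMb = Object.values(sizesMb).reduce((sum, mb) => sum + mb, 0);
  return { totalMb, headroomMb: limitMb - totalMb, fits: totalMb <= limitMb };
}

const budget = checkBundleBudget(gzipSizesMb, WORKER_LIMIT_MB);
```

With these numbers the total is about 5.26 MB, leaving roughly 4.74 MB of headroom for a larger model.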
npm install
# Add your model.onnx to the project root
node build.mjs
wrangler deploy .worker/index.js --no-bundle

| File | Purpose |
|---|---|
| build.mjs | esbuild + 3 post-process patches |
| wrangler.toml | [[rules]] for .wasm and .onnx |
| src/index.ts | Worker using ORT |
Tested with:
- onnxruntime-web 1.24.1
- wrangler 4.x
- compatibility_date = "2026-01-01"
Supplying the binary through the wasmBinary option does not help: ORT still calls WebAssembly.compile(bytes) internally and fails with the same error.
The instantiateWasm callback is the only escape hatch that bypasses compilation entirely.