I just ran the current main branch locally in Chrome and looked at the DevTools -> Network tab, since the first iteration took quite a while to run (>10 minutes).
In total, we are transferring >2GB of data. See screenshot from the status bar of the network tab:
[Screenshot: status bar of the Network tab]
Clearly, the biggest contributor is the large ONNX Runtime model weights, as the Network tab's list of requests sorted by size shows. However, some Wasm model (I think the Transformers.js one) is also downloaded repeatedly.
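To reproduce these numbers without screenshots, something like the following could be pasted into the page or a test harness. It is only a sketch: `summarizeTransfers` and `findDuplicateDownloads` are hypothetical helper names, and it assumes the entries come from the Resource Timing API (`performance.getEntriesByType("resource")`). Note that `transferSize` is reported as 0 for cache hits and for cross-origin responses that do not send `Timing-Allow-Origin`, so the totals may undercount.

```typescript
// Minimal shape of the fields read from PerformanceResourceTiming.
interface TransferEntry {
  name: string;         // request URL
  transferSize: number; // bytes sent over the network (0 for cache hits)
}

interface UrlSummary {
  url: string;
  requests: number;       // all requests for this URL
  networkFetches: number; // requests that actually transferred bytes
  bytes: number;          // total bytes transferred for this URL
}

// Groups resource entries by URL and sorts them by bytes transferred, largest first.
function summarizeTransfers(entries: TransferEntry[]): { totalBytes: number; byUrl: UrlSummary[] } {
  const map = new Map<string, UrlSummary>();
  let totalBytes = 0;
  for (const e of entries) {
    let s = map.get(e.name);
    if (!s) {
      s = { url: e.name, requests: 0, networkFetches: 0, bytes: 0 };
      map.set(e.name, s);
    }
    s.requests += 1;
    if (e.transferSize > 0) s.networkFetches += 1;
    s.bytes += e.transferSize;
    totalBytes += e.transferSize;
  }
  const byUrl = Array.from(map.values()).sort((a, b) => b.bytes - a.bytes);
  return { totalBytes, byUrl };
}

// URLs that were downloaded over the network more than once.
function findDuplicateDownloads(byUrl: UrlSummary[]): UrlSummary[] {
  return byUrl.filter((s) => s.networkFetches > 1);
}

// Possible usage in the browser console after a benchmark run:
//   const { totalBytes, byUrl } = summarizeTransfers(
//     performance.getEntriesByType("resource") as PerformanceResourceTiming[]);
//   console.table(findDuplicateDownloads(byUrl));
```

The browser keeps only a limited number of resource timing entries by default, so a long run may need `performance.setResourceTimingBufferSize` called early to capture everything.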

This is going to be an issue for a couple of reasons:
- first iteration is super slow and shows no indication of progress or what it's doing
- users might be surprised or angry about the data usage on metered connections
- users might run out of memory or disk space
- possibly hosting costs for us
In terms of solutions, I think we should (roughly in order of urgency / ease of implementation):
- Before starting the benchmark / on the status page, show a warning sentence, something like "Warning: Running this benchmark downloads large models and incurs network transfers on the order of 2 GB of data. Only run with a fast network and not over a metered connection."
- Fix the duplicate download of Wasm model files (possibly related: Fix caching issue with Transformers.js workloads in incognito mode #10)
- Select smaller models, in particular for Transformers.js
- Can we download the models on the client ahead-of-time, before any iteration?
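As a rough illustration of how the warning, the de-duplication, and the ahead-of-time download could fit together, here is a minimal sketch. Everything in it is an assumption rather than existing code: `ModelFile`, `planDownloads`, `prefetchAll`, and the cache name `models-v1` are made up for this example, and the file sizes would have to come from some manifest we do not have yet.

```typescript
// Hypothetical manifest entry; URLs and sizes are placeholders, not our real models.
interface ModelFile {
  url: string;
  bytes: number; // expected download size, e.g. from a build-time manifest
}

interface DownloadPlan {
  urls: string[];     // unique URLs, in first-seen order
  totalBytes: number; // summed over unique URLs only
  warning: string;    // text to show before the benchmark starts
}

// De-duplicates the file list and builds the warning text from the total size.
function planDownloads(files: ModelFile[]): DownloadPlan {
  const seen = new Set<string>();
  const urls: string[] = [];
  let totalBytes = 0;
  for (const f of files) {
    if (seen.has(f.url)) continue; // never plan the same file twice
    seen.add(f.url);
    urls.push(f.url);
    totalBytes += f.bytes;
  }
  const gb = (totalBytes / 1e9).toFixed(1);
  const warning =
    `Warning: Running this benchmark downloads about ${gb} GB of model data. ` +
    `Only run it with a fast network and not over a metered connection.`;
  return { urls, totalBytes, warning };
}

// Stores one URL somewhere persistent and resolves once it is stored.
type Downloader = (url: string) => Promise<void>;

// Downloads every planned URL before any iteration runs, reporting progress after each file.
async function prefetchAll(
  plan: DownloadPlan,
  download: Downloader,
  onProgress: (done: number, total: number, url: string) => void,
): Promise<void> {
  let done = 0;
  for (const url of plan.urls) {
    await download(url); // sequential, so progress is simple to report
    done += 1;
    onProgress(done, plan.urls.length, url);
  }
}

// One possible browser Downloader, using the standard Cache API:
//   const viaCache: Downloader = async (url) => {
//     const cache = await caches.open("models-v1"); // hypothetical cache name
//     if (!(await cache.match(url))) await cache.add(url);
//   };
```

Whether Transformers.js and ONNX Runtime would actually read from a cache filled this way is untested here; they may need to be pointed at it, or their own caching may need to be used instead.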