Huge total network transfers (due to model weights mostly, but also some caching issue) #33

@danleh

I just ran the current main branch locally in Chrome and looked at the DevTools -> Network tab, since the first iteration took quite a while to run (>10 minutes).

In total, we are transferring >2GB of data. See screenshot from the status bar of the network tab:
[Screenshot: status bar of the DevTools Network tab]

Clearly, the biggest contributor is the large ONNX Runtime model weights; see this screenshot of the requests sorted by size. However, some Wasm model (I think the Transformers.js one) is also downloaded repeatedly:
[Screenshot: DevTools Network tab requests sorted by size]
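
To confirm which files are downloaded more than once without reading the Network tab by hand, something like the following could help. It is a minimal sketch, not code from this repository: `findDuplicateDownloads` and the two interfaces are hypothetical names, and the only fields assumed are `name` and `transferSize`, which exist on the browser's `PerformanceResourceTiming` entries.

```typescript
// Minimal shape of the fields read from PerformanceResourceTiming entries.
interface ResourceEntryLike {
  name: string;         // request URL
  transferSize: number; // bytes sent over the network; 0 for local cache hits
}

interface DuplicateDownload {
  url: string;
  count: number;      // number of network transfers for this URL
  totalBytes: number; // sum of transferSize across those transfers
}

// Hypothetical helper: group entries by URL and report every URL that was
// transferred over the network more than once, largest total first.
function findDuplicateDownloads(entries: ResourceEntryLike[]): DuplicateDownload[] {
  const byUrl = new Map<string, DuplicateDownload>();
  for (const entry of entries) {
    // Skip cache hits and entries whose size is hidden (reported as 0).
    if (entry.transferSize <= 0) continue;
    const record = byUrl.get(entry.name) ?? { url: entry.name, count: 0, totalBytes: 0 };
    record.count += 1;
    record.totalBytes += entry.transferSize;
    byUrl.set(entry.name, record);
  }
  return Array.from(byUrl.values())
    .filter((record) => record.count > 1)
    .sort((a, b) => b.totalBytes - a.totalBytes);
}
```

In the browser console this could be called as `findDuplicateDownloads(performance.getEntriesByType("resource") as PerformanceResourceTiming[])`. Note that cross-origin responses without a `Timing-Allow-Origin` header report a `transferSize` of 0, and requests made inside workers are recorded on the worker's own timeline, so the result may undercount.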

This is going to be an issue for several reasons:

  • first iteration is super slow and shows no indication of progress or what it's doing
  • users might be surprised or angry about the data usage over metered connections
  • users might run out of memory (OOM) or disk space
  • possibly hosting costs for us

In terms of solutions, I think we should (roughly in order of urgency / ease of implementation):

  • Before starting the benchmark / on the status page, show a warning sentence, something like "Warning: Running this benchmark downloads large models and incurs network transfers on the order of 2GB of data. Only run with a fast network and not over a metered connection."
  • Fix the duplicate download of Wasm model files (possibly related: Fix caching issue with Transformers.js workloads in incognito mode #10)
  • Select smaller models, in particular for Transformers.js
  • Can we download the models on the client ahead-of-time, before any iteration?
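
As a rough illustration of the duplicate-download and ahead-of-time points above, here is one possible shape. This is a sketch under stated assumptions, not how the benchmark or Transformers.js currently loads files: `Fetcher`, `createDedupingFetcher`, and `prefetchAll` are hypothetical names, and the download function is injected so the example does not depend on any particular loader.

```typescript
// A function that downloads one URL and resolves to its bytes.
// In a browser this could be: async (url) => (await fetch(url)).arrayBuffer()
type Fetcher = (url: string) => Promise<ArrayBuffer>;

// Hypothetical wrapper: each URL is downloaded at most once per page session.
// Repeated and concurrent requests for the same URL share a single promise.
function createDedupingFetcher(download: Fetcher): Fetcher {
  const pendingByUrl = new Map<string, Promise<ArrayBuffer>>();
  return (url) => {
    let pending = pendingByUrl.get(url);
    if (pending === undefined) {
      pending = download(url);
      pendingByUrl.set(url, pending);
      // Forget failed downloads so that a later call can retry.
      pending.catch(() => pendingByUrl.delete(url));
    }
    return pending;
  };
}

// Hypothetical helper: download all files before the first iteration and
// report progress after each one, so the status page can show what is happening.
async function prefetchAll(
  urls: string[],
  fetcher: Fetcher,
  onProgress: (done: number, total: number) => void,
): Promise<void> {
  let done = 0;
  // Sequential on purpose, to keep peak bandwidth and memory use predictable.
  for (const url of urls) {
    await fetcher(url);
    done += 1;
    onProgress(done, urls.length);
  }
}
```

One caveat: this keeps every downloaded buffer in memory for the lifetime of the page, which could make the OOM concern worse for multi-hundred-megabyte weights. A persistent store such as the browser Cache API might be a better fit there; the in-memory map is only meant to show the deduplication idea.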
