fsspec integration! #6

@martindurant

Hello aiointerpreters!

fsspec is an IO library for local, archive and remote byte storage, and is used by a great number of other packages in the pydata ecosystem. Currently, many remote (high-latency) stores are implemented using asyncio on a single dedicated thread, which works great for many concurrent small-payload requests.

When the bandwidth to the data is very high, workloads become compute-bound on decryption and decompression of the HTTP stream. In such cases, compiled IO packages like obstore (Rust) and Apache Arrow's filesystem (C++) can show much better performance, mostly from thread parallelism.

Since py3.14 finally has top-level support for concurrent.interpreters, AND memoryview can be passed zero-copy between interpreters, I am thinking that ~4x improvements are available. I was thinking we could build something together, starting with the code in this repo!

The wrinkle is that thread parallelism alone is not enough (because of latency): either each worker must also run an event loop taking batches of tasks, OR we need to decide which tasks should run concurrently in asyncio (because they are small) and which need offloading.
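
To make the first option concrete, here is a rough sketch of "each worker runs its own event loop over a batch of tasks". It is only an illustration under some assumptions: `fetch`, `run_batch`, `worker` and `fetch_all` are hypothetical names, `fetch` just sleeps in place of a real remote read, and ordinary threads stand in for interpreters so that it runs on any recent Python. With the GIL, threads like this would not give the compute parallelism described above; the idea would be to swap the thread pool for interpreter-backed workers on py3.14.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


async def fetch(key: str) -> bytes:
    # Hypothetical stand-in for a high-latency remote read.
    await asyncio.sleep(0.01)
    return key.encode()


async def run_batch(keys: list[str]) -> list[bytes]:
    # Inside one worker, the requests of a batch overlap their latency.
    return await asyncio.gather(*(fetch(k) for k in keys))


def worker(keys: list[str]) -> list[bytes]:
    # Each worker owns a private event loop for the duration of its batch.
    return asyncio.run(run_batch(keys))


def fetch_all(keys: list[str], n_workers: int = 4) -> list[bytes]:
    # Round-robin split: one batch per worker.
    batches = [keys[i::n_workers] for i in range(n_workers)]
    # Assumption: threads here; interpreter-backed workers would replace this pool.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(worker, batches))
    # Undo the round-robin split so output order matches input order.
    out = [b""] * len(keys)
    for i, batch_result in enumerate(results):
        out[i::n_workers] = batch_result
    return out
```

Calling `fetch_all(["a", "b", "c"], n_workers=2)` returns the payloads in input order. The second option (routing by task size) would instead keep small requests on the single asyncio thread and send only the large, compute-heavy ones to a worker.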
