Description
Is your feature request related to a problem or challenge?
DataFusion currently performs CPU-bound work on the tokio runtime, which can lead to IO tasks being starved and unable to adequately service the IO. Crucially, this occurs long before the actual resources of the threadpool are exhausted: it is expected that IO will be starved when no CPU is available; what we want to avoid is IO getting throttled while resources are still available.
The accepted wisdom has been to run DataFusion in its own threadpool and spawn the IO off onto a separate runtime, as described here. #12393 seeks to document this, and by extension #13424 and #13690 seek to provide examples of how to achieve it.
Unfortunately this approach comes with a lot of explicit and implicit complexity, making it hard to integrate DataFusion into existing codebases that use tokio.
Describe the solution you'd like
The basic idea would be, rather than moving the IO off to a separate runtime away from the CPU-bound tasks, to instead wrap the CPU-bound tasks so that they can't starve the runtime.
Ultimately DF has three broad classes of task:
1. Fully synchronous code - e.g. optimisation, function evaluation, planning, most physical operators
2. Mixed IO and CPU-bound - e.g. file IO, schema inference, etc.
3. Pure IO - e.g. catalog
The vast majority of code in DF falls into the fully synchronous bucket (1.), and so it is a natural response to want to special case the less common IO (2. / 3.) instead of the more common CPU-bound code. However, I'd like to posit that conceptually and practically it is much simpler to dispatch the synchronous code than the asynchronous code, and that the items falling into the second bucket are the most complex operators in DataFusion.
This would yield some quite compelling advantages:
- DataFusion could easily be integrated with existing tokio-based workloads
- Avoids the many footguns of CPU-bound async tasks
This could be achieved using tokio's native primitives. One way would be to use block_in_place, which, unlike spawn_blocking, does not require a 'static lifetime and can therefore borrow data from the invoking context.
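As a rough illustration (assuming tokio's multi-threaded runtime; decode_batch is a hypothetical stand-in for any synchronous kernel, not an existing DataFusion API), the wrapping might look something like:

```rust
use tokio::task::block_in_place;

// Hypothetical stand-in for an expensive, fully synchronous kernel.
fn decode_batch(bytes: &[u8]) -> usize {
    bytes.iter().map(|&b| b as usize).sum()
}

// block_in_place moves any other tasks off this worker thread before
// running the closure, so the CPU-bound work cannot starve unrelated IO
// tasks. Unlike spawn_blocking, the closure may borrow `bytes` from the
// enclosing scope. (Note: block_in_place panics on the current-thread
// runtime flavour.)
async fn process(bytes: Vec<u8>) -> usize {
    block_in_place(|| decode_batch(&bytes))
}
```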
That being said, this does come with some risks:
- It might become hard to determine what is sufficiently CPU-bound to warrant spawning
- There would potentially be a lot of callsites that would need wrapping
- It would be important to use block_in_place only at the granularity of a data batch, to ensure overheads are amortized
- Applications relying on thread count to limit concurrency might need to put in place some additional concurrency controls (e.g. a semaphore, as sketched below)
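A minimal sketch of such a control, assuming tokio's Semaphore and a hypothetical bounded_cpu_work wrapper:

```rust
use std::sync::Arc;

use tokio::sync::Semaphore;
use tokio::task::block_in_place;

// Cap how many CPU-bound sections run at once; previously the worker
// thread count provided this limit implicitly.
async fn bounded_cpu_work(sem: Arc<Semaphore>, bytes: Vec<u8>) -> usize {
    // Waits asynchronously (without blocking a worker thread) when the
    // configured number of CPU-bound sections are already running.
    let _permit = sem.acquire().await.expect("semaphore closed");
    block_in_place(|| bytes.iter().map(|&b| b as usize).sum())
}
```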
An arguably better solution would be to dispatch the work to a dedicated CPU threadpool such as rayon.
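A rough sketch of that approach, assuming the rayon crate and handing the result back over a tokio oneshot channel (heavy_work is again a hypothetical stand-in):

```rust
use tokio::sync::oneshot;

// Hypothetical stand-in for a CPU-bound kernel.
fn heavy_work(bytes: &[u8]) -> usize {
    bytes.iter().map(|&b| b as usize).sum()
}

// Dispatch the work to rayon's CPU pool and yield the tokio task until the
// result arrives, so no tokio worker thread is ever blocked.
async fn dispatch_to_rayon(input: Vec<u8>) -> usize {
    let (tx, rx) = oneshot::channel();
    rayon::spawn(move || {
        // Runs on a rayon worker thread, not a tokio worker thread. Note
        // rayon::spawn requires 'static, so `input` is moved in.
        let _ = tx.send(heavy_work(&input));
    });
    rx.await.expect("rayon task panicked or sender was dropped")
}
```

This keeps the tokio workers free to service IO while rayon's work-stealing pool handles the CPU-bound batches.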
Describe alternatives you've considered
We could instead move the IO off to a separate runtime, as described above.
Additional context
No response