Async disk access: thaw dependencies ahead of execute #7643

Open
@crusaderky

This is a follow-up to #4424.

One issue with unspilling (asynchronous or not) is that whenever a task starts executing while its inputs still need unspilling, the CPU is under-utilized: the task is in the executing state but is actually busy doing I/O.
Once unspilling becomes asynchronous, we gain the option to preload the dependencies from disk ahead of time.

A fully general solution would make the worker state machine unspill-aware. That would require a new "unspilling-inputs" task state between ready and executing, a new asynchronous instruction to match, and a wealth of new transitions. I would like to suggest a simpler, greedy design instead.

Proposed design

When a task reaches the top of the ready or constrained heap but can't transition immediately to executing, the worker state machine fires an async_get command (#4424 (comment)) to the SpillBuffer with the task's list of dependencies. This moves all inputs the task needs to the top of the LRU and loads them off disk.

The output of the command is discarded.
When Worker.execute finally runs, it will call async_get again with exactly the same keys. If enough time has passed, all keys are now in fast. If they are still in the middle of unspilling, the SpillBuffer will simply return references to the already-existing Futures, so no key is read from disk twice.
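
To make the idea concrete, here is a minimal sketch of the "fire early, call again later" pattern. It is not the real SpillBuffer: `ToySpillBuffer`, its `_inflight` dict, the `disk_reads` counter and the simulated disk delay are all hypothetical names invented for illustration, and the actual `async_get` signature discussed in #4424 may well differ. The sketch only shows how a second call with the same keys can share the in-flight Futures instead of reading from disk again.

```python
import asyncio


class ToySpillBuffer:
    """Hypothetical stand-in for SpillBuffer; NOT the distributed API."""

    def __init__(self, slow, delay=0.01):
        self.fast = {}          # in-memory values; dict order doubles as LRU order
        self.slow = dict(slow)  # values that are currently "on disk"
        self.delay = delay      # simulated disk latency, in seconds
        self._inflight = {}     # key -> Task that is currently unspilling it
        self.disk_reads = 0     # counts simulated disk reads

    async def _unspill(self, key):
        self.disk_reads += 1
        await asyncio.sleep(self.delay)  # pretend to read from disk
        value = self.slow.pop(key)
        self.fast[key] = value
        del self._inflight[key]
        return value

    async def async_get(self, keys):
        out, pending = {}, {}
        for key in keys:
            if key in self.fast:
                # Re-insert to mark the key as most recently used
                self.fast[key] = self.fast.pop(key)
                out[key] = self.fast[key]
            else:
                # Start unspilling only if nobody else already did
                if key not in self._inflight:
                    self._inflight[key] = asyncio.ensure_future(self._unspill(key))
                pending[key] = self._inflight[key]
        for key, fut in pending.items():
            out[key] = await fut
        return out


async def demo():
    buf = ToySpillBuffer({"x": 1, "y": 2})
    # Task reached the top of the heap: prefetch, result is discarded
    prefetch = asyncio.create_task(buf.async_get(["x", "y"]))
    await asyncio.sleep(0)  # let the prefetch start
    # Later, what Worker.execute would do: same keys again
    result = await buf.async_get(["x", "y"])
    await prefetch
    return result, buf.disk_reads
```

In this toy version, running `asyncio.run(demo())` performs one simulated disk read per key even though `async_get` is called twice, because the second call awaits the Futures created by the first.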
