Skip to content

fix: Don't block heartbeat when all slots acquired#26

Merged
peasee merged 3 commits into
spiceai-52.5from
peasee/260414-dont-block-heartbeat
Apr 14, 2026
Merged

fix: Don't block heartbeat when all slots acquired#26
peasee merged 3 commits into
spiceai-52.5from
peasee/260414-dont-block-heartbeat

Conversation

@peasee
Copy link
Copy Markdown

@peasee peasee commented Apr 14, 2026

Problem

When all slots are acquired, the work loop awaits forever until a slot becomes free. If the current tasks take longer than the configured heartbeat interval, the scheduler will consider the executor dead.

This is only applicable for PullStaged task mode, as PushStaged has a separate heartbeater loop.

To fix this for PullStaged, add a short-circuit timeout if we spend too long waiting for a free slot. This ensures the executor connects to the scheduler at least every 15 seconds.

@peasee peasee self-assigned this Apr 14, 2026
@peasee peasee added the bug Something isn't working label Apr 14, 2026
Copilot AI review requested due to automatic review settings April 14, 2026 02:22
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR prevents pull-based executors (PullStaged) from starving scheduler heartbeats when all task slots are occupied by ensuring poll_work is still called periodically even if no semaphore permits are available.

Changes:

  • Add a bounded wait (SLOT_WAIT_TIMEOUT) around acquiring task slots so the loop can still call poll_work for heartbeat updates.
  • Refactor the polling loop into a helper (poll_loop_with_slots) to support injecting a semaphore (used by tests).
  • Add a regression test that holds all task-slot permits and asserts poll_work is still called.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread ballista/executor/src/execution_loop.rs
Comment thread ballista/executor/src/execution_loop.rs Outdated
Comment thread ballista/executor/src/execution_loop.rs
Comment thread ballista/executor/src/execution_loop.rs
Comment thread ballista/executor/src/execution_loop.rs
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 14, 2026 02:34
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread ballista/executor/src/execution_loop.rs
Comment thread ballista/executor/src/execution_loop.rs
Comment thread ballista/executor/src/execution_loop.rs
Comment thread ballista/executor/src/execution_loop.rs
@peasee peasee merged commit 729428c into spiceai-52.5 Apr 14, 2026
33 checks passed
@peasee peasee deleted the peasee/260414-dont-block-heartbeat branch April 14, 2026 02:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working needs-upstream

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants