Tep Scheduled server: replace poll(2) O(N)/tick with epoll/kqueue + persistent registration
The live-updates work (#41 / #44) needs a worker to hold a large number of long-lived WebSocket connections (each turbo_stream_from opens one). Tep::Server::Scheduled already has the right architecture for this — fiber-per-connection over a cooperative scheduler, prefork + SO_REUSEPORT for multicore — and it's the server the blog runs. The blocker to scale is the I/O multiplexer underneath it.
The bottleneck
Tep::Scheduler.poll_round (runtime/spinel/tep/scheduler.rb:123) is poll(2)-shaped: every tick it calls sphttp_poll_reset, loops over all parked fibers re-adding each fd (sphttp_poll_add per fiber, :135), then sphttp_poll_run. That's O(total connections) per scheduler pass, regardless of how many are actually readable. This is the classic c10k wall: poll/select are O(N); epoll/kqueue are O(ready). Phoenix/BEAM, Go's netpoller, and AnyCable-Go all use epoll/kqueue with persistent registration.
As written, the per-tick pollset rebuild dominates somewhere in the low thousands of connections per worker — well short of the tens-to-hundreds of thousands ("AnyCable-Go class") this needs.
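To make the O(total) versus O(ready) difference concrete, here is a toy cost model. It is a sketch, not Tep code: `RebuildPollset`, `PersistentPollset`, and `simulate` are hypothetical names, and an "op" stands for one fd registration or one ready-event visit. It assumes 10,000 parked connections with 10 readable per pass.

```ruby
# Toy cost model, not Tep code. All class and method names are hypothetical.

# poll(2)-shaped: the interest set is rebuilt from every parked fiber each pass.
class RebuildPollset
  attr_reader :ops

  def initialize
    @parked = {} # fd => fiber id
    @ops = 0
  end

  def park(fd, fiber)
    @parked[fd] = fiber
  end

  def unpark(fd)
    @parked.delete(fd)
  end

  # One pass: start from an empty pollset, re-add every parked fd,
  # then return the fds that are both parked and ready.
  def tick(ready_fds)
    pollset = []
    @parked.each_key do |fd|
      pollset << fd
      @ops += 1
    end
    pollset & ready_fds
  end
end

# epoll/kqueue-shaped: registration happens once, at park/unpark time,
# and a pass only visits the fds reported as ready.
class PersistentPollset
  attr_reader :ops

  def initialize
    @registered = {}
    @ops = 0
  end

  def park(fd, fiber)
    @registered[fd] = fiber # stands in for EPOLL_CTL_ADD
    @ops += 1
  end

  def unpark(fd)
    @registered.delete(fd) # stands in for EPOLL_CTL_DEL
    @ops += 1
  end

  def tick(ready_fds)
    ready_fds.select do |fd|
      @ops += 1
      @registered.key?(fd)
    end
  end
end

# Parks `connections` fds, runs `ticks` passes with `ready_per_tick`
# readable fds each, and returns the total op count.
def simulate(pollset, connections:, ticks:, ready_per_tick:)
  connections.times { |fd| pollset.park(fd, fd) }
  ticks.times do |t|
    ready = Array.new(ready_per_tick) { |i| (t + i) % connections }
    pollset.tick(ready)
  end
  pollset.ops
end

rebuild_ops = simulate(RebuildPollset.new, connections: 10_000, ticks: 100, ready_per_tick: 10)
persistent_ops = simulate(PersistentPollset.new, connections: 10_000, ticks: 100, ready_per_tick: 10)
puts "rebuild: #{rebuild_ops} ops, persistent: #{persistent_ops} ops"
# prints "rebuild: 1000000 ops, persistent: 11000 ops"
```

The model ignores syscall and kernel-side costs, so the absolute numbers mean nothing; only the shape matters. The rebuild cost grows with connections × passes, while the persistent cost grows with connections + ready events.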
Two secondary issues in the same path
- Tail-only dead-slot reclamation (scheduler.rb:71-89) is tuned for FIFO request lifecycles. A large WebSocket population closes in arbitrary order, leaving dead holes that every O(N) scan (tick, poll_round, any_io_waiter) still walks.
- No preemption (scheduler.rb:32-35: Spinel has no implicit-yield Fiber::SchedulerInterface hook; yields are explicit). A long synchronous handler, e.g. an in-process live re-render/diff, stalls every other connection on that worker until it yields. This is the price of running the render in-process (the AnyCable-split-collapse win) and wants bounded-work / yield points in heavy handlers.
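One way to reclaim holes left by arbitrary-order closes without moving live entries is a free list. The sketch below is an assumption about what such a structure could look like, not a description of Tep::Scheduler's actual layout; `SlotTable` and its methods are hypothetical names.

```ruby
# Hypothetical slot table with a free list. Not Tep::Scheduler's layout.
class SlotTable
  attr_reader :live

  def initialize
    @slots = [] # index => fiber, or nil for a dead slot
    @free = []  # stack of dead indices available for reuse
    @live = 0
  end

  # Stores a fiber and returns its slot index. The index does not change
  # while the fiber is live, so it can be captured across a yield.
  def add(fiber)
    idx = @free.pop || @slots.length
    @slots[idx] = fiber
    @live += 1
    idx
  end

  # Frees a slot in O(1) wherever it sits. Nothing is compacted,
  # so every other fiber keeps its index.
  def remove(idx)
    return if @slots[idx].nil?

    @slots[idx] = nil
    @free.push(idx)
    @live -= 1
  end

  def [](idx)
    @slots[idx]
  end

  def capacity
    @slots.length
  end
end

table = SlotTable.new
a = table.add(:conn_a)
b = table.add(:conn_b)
c = table.add(:conn_c)
table.remove(b)           # hole in the middle, not at the tail
d = table.add(:conn_d)    # reuses the hole
puts [a, b, c, d].inspect # prints [0, 1, 2, 1]
puts table.capacity       # prints 3
```

With reuse, the table's capacity tracks the peak live count rather than the total number of connections ever opened. Any scan over the table is still O(capacity), so scans that must be O(live) would need a separate live list on top of this.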
Proposed direction
- Add an epoll/kqueue backend in sp_net, exposed behind the existing Sock.sphttp_poll_* façade (runtime/spinel/tep/net.rb:35-38) so call sites are unchanged.
- Persistent registration: poll_round should EPOLL_CTL_ADD/DEL on park/unpark, not reset-and-rebuild every tick. Per-pass cost drops from O(total) to O(ready).
- Replace tail-only dead-slot reclamation with a scheme that reuses dead slots anywhere in the table (scheduler.rb deliberately keeps slot indices stable for captures held across Fiber.yield; the replacement must preserve that).
Scope note
Realistic target is AnyCable-Go class (tens-to-hundreds of thousands/node). BEAM-class millions-on-one-node is out of reach without per-process-heap GC isolation, which Ruby semantics don't provide, and isn't needed here. The model is already Phoenix-shaped; only the I/O multiplexer is c10k-era.
Refs: runtime/spinel/tep/scheduler.rb, server_scheduled.rb, websocket/connection.rb (one fiber/conn recv loop), net.rb (sp_net_poll_*). Background: the live-updates transport discussion on #44.
🤖 Generated with Claude Code