Remove WorkerGroup abstraction from scheduler #594

Draft
gxuu wants to merge 1 commit into finos:main from gxuu:mar-remove-wg

gxuu commented Mar 12, 2026

DO NOT REVIEW. THIS IS A WORK IN PROGRESS.

Replace WorkerGroupID with direct WorkerID tracking throughout the scheduler. Workers now carry worker_manager_id in heartbeats. The scheduler sees individual workers; only the ECS adapter maintains an internal translation layer.

Key changes:

  • Protocol: WorkerManagerHeartbeat uses max_workers instead of max_worker_groups/workers_per_group. Commands renamed to StartWorkers/ShutdownWorkers with worker_ids list.
  • Scaling policies: operate on managed_worker_ids (List[WorkerID]) and managed_worker_capabilities (Dict[str, int]) instead of WorkerGroupState/WorkerGroupCapabilities.
  • WorkerManagerController: tracks workers per manager source via _manager_worker_ids (Dict[bytes, Set[WorkerID]]).
  • WorkerController: tracks worker→manager mapping from heartbeats.
  • Adapters: native/symphony flattened to Dict[WorkerID, Worker].
  • Removed: WorkerGroupID, WorkerGroupInfo, WorkerGroupState, WorkerGroupCapabilities types.
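To illustrate the bookkeeping described in the WorkerManagerController and WorkerController bullets, here is a minimal sketch of how per-manager worker sets might be derived from heartbeats. Only `worker_manager_id`, `_manager_worker_ids`, and `WorkerID` come from the description above; `WorkerHeartbeat`, `ManagerWorkerIndex`, and every method name are hypothetical stand-ins and are not taken from the diff.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, NewType, Optional, Set

# Stand-in for the scheduler's WorkerID type (assumed here to wrap bytes).
WorkerID = NewType("WorkerID", bytes)


@dataclass(frozen=True)
class WorkerHeartbeat:
    """Hypothetical heartbeat shape; only worker_manager_id is named in the PR."""

    worker_id: WorkerID
    worker_manager_id: bytes


class ManagerWorkerIndex:
    """Hypothetical helper that keeps both directions of the worker/manager mapping."""

    def __init__(self) -> None:
        # manager source -> workers it currently owns (name taken from the PR text)
        self._manager_worker_ids: Dict[bytes, Set[WorkerID]] = defaultdict(set)
        # worker -> manager, learned from heartbeats
        self._worker_to_manager: Dict[WorkerID, bytes] = {}

    def on_heartbeat(self, heartbeat: WorkerHeartbeat) -> None:
        previous = self._worker_to_manager.get(heartbeat.worker_id)
        if previous is not None and previous != heartbeat.worker_manager_id:
            # The worker now reports a different manager: drop the stale entry.
            self._discard(previous, heartbeat.worker_id)
        self._worker_to_manager[heartbeat.worker_id] = heartbeat.worker_manager_id
        self._manager_worker_ids[heartbeat.worker_manager_id].add(heartbeat.worker_id)

    def on_disconnect(self, worker_id: WorkerID) -> None:
        manager_id: Optional[bytes] = self._worker_to_manager.pop(worker_id, None)
        if manager_id is not None:
            self._discard(manager_id, worker_id)

    def workers_of(self, manager_id: bytes) -> Set[WorkerID]:
        # Return a copy so callers cannot mutate the internal state.
        return set(self._manager_worker_ids.get(manager_id, set()))

    def _discard(self, manager_id: bytes, worker_id: WorkerID) -> None:
        workers = self._manager_worker_ids.get(manager_id)
        if workers is None:
            return
        workers.discard(worker_id)
        if not workers:
            # Remove empty sets so managers without workers do not linger.
            del self._manager_worker_ids[manager_id]
```

Keeping the reverse map alongside the per-manager sets makes disconnect handling a constant-time lookup; whether the actual controllers split this state the same way is not stated in the description.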

Replace WorkerGroupID with direct WorkerID tracking throughout the
scheduler. Workers now carry worker_manager_id in heartbeats. The
scheduler sees individual workers; only the ECS adapter maintains an
internal translation layer (worker groups → ECS task ARNs).

Key changes:
- Protocol: WorkerManagerHeartbeat uses max_workers instead of
  max_worker_groups/workers_per_group. Commands renamed to
  StartWorkers/ShutdownWorkers with worker_ids list.
- Scaling policies: operate on managed_worker_ids (List[WorkerID])
  and managed_worker_capabilities (Dict[str, int]) instead of
  WorkerGroupState/WorkerGroupCapabilities.
- WorkerManagerController: tracks workers per manager source via
  _manager_worker_ids (Dict[bytes, Set[WorkerID]]).
- WorkerController: tracks worker→manager mapping from heartbeats.
- Adapters: native/symphony flattened to Dict[WorkerID, Worker].
  ECS keeps internal group→task_arn mapping invisible to scheduler.
- Removed: WorkerGroupID, WorkerGroupInfo, WorkerGroupState,
  WorkerGroupCapabilities types.

Signed-off-by: gxu <georgexu420@163.com>
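
The ECS note in the commit message (an internal group→task_arn mapping that the scheduler never sees) could look roughly like the sketch below. Everything here is an assumption made for illustration: the class name, the injected `run_task` and `stop_task` callables (stand-ins for whatever ECS client calls the adapter really makes), the choice to launch one task per `StartWorkers` batch, and the choice to stop a whole task when any of its workers is shut down.

```python
from typing import Callable, Dict, List, Optional, Set


class EcsTranslationLayer:
    """Hypothetical adapter-side mapping between worker IDs and ECS task ARNs."""

    def __init__(
        self,
        run_task: Callable[[List[bytes]], str],  # assumed: launches a task, returns its ARN
        stop_task: Callable[[str], None],  # assumed: stops the task with the given ARN
    ) -> None:
        self._run_task = run_task
        self._stop_task = stop_task
        self._task_arn_to_worker_ids: Dict[str, Set[bytes]] = {}
        self._worker_id_to_task_arn: Dict[bytes, str] = {}

    def start_workers(self, worker_ids: List[bytes]) -> None:
        # Assumption: each StartWorkers batch is hosted by a single ECS task.
        task_arn = self._run_task(worker_ids)
        self._task_arn_to_worker_ids[task_arn] = set(worker_ids)
        for worker_id in worker_ids:
            self._worker_id_to_task_arn[worker_id] = task_arn

    def shutdown_workers(self, worker_ids: List[bytes]) -> List[bytes]:
        """Stop every task hosting a requested worker; return all worker IDs that went away."""
        removed: List[bytes] = []
        for worker_id in worker_ids:
            task_arn = self._worker_id_to_task_arn.get(worker_id)
            if task_arn is None:
                continue  # unknown worker, or its task was already stopped in this call
            self._stop_task(task_arn)
            # Stopping a task removes all workers it hosts, not only the requested one.
            for hosted in self._task_arn_to_worker_ids.pop(task_arn):
                del self._worker_id_to_task_arn[hosted]
                removed.append(hosted)
        return removed

    def task_arn_of(self, worker_id: bytes) -> Optional[str]:
        return self._worker_id_to_task_arn.get(worker_id)
```

The scheduler-facing methods take and return worker IDs only, so task ARNs stay inside the adapter, which is the property the commit message describes.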
@gxuu gxuu marked this pull request as draft March 12, 2026 01:05