Open
Conversation
Cluster was a multiprocessing.Process subclass that wrapped FixedNativeWorkerAdapter in a subprocess with an asyncio event loop solely to handle signals. This intermediate process layer is removed; workers are now direct children of SchedulerClusterCombo. - Delete Cluster, ClusterConfig, and the cluster entry point module - Redirect scaler_cluster and run_cluster.py to the fixed native worker adapter entry point (with --num-of-workers alias for compat) - Add SIGINT/SIGTERM handling to the fixed native adapter entry point - Update SchedulerClusterCombo, tests, and examples to use FixedNativeWorkerAdapter directly - Update ECS adapter to use scaler_cluster with updated parameters (--max-workers replaces --num-of-workers/--worker-names; worker IDs are no longer pre-announced since workers self-assign UUIDs)
a9f8f10 to
b4ebe01
Compare
Refactor fixed native entry point to accept a configurable section name, so scaler_cluster reads [cluster] while scaler_worker_adapter_fixed_native continues to read [fixed_native_worker_adapter].
- Remove ClusterProcess/Worker startup console output from quickstart (no longer produced by FixedNativeWorkerAdapter) - Update worker_adapters/index: Fixed Native is used by SchedulerClusterCombo, not Cluster - Replace --num-of-workers/num_of_workers with canonical --max-workers/max_workers in configuration examples
fb59c21 to
575f361
Compare
Collaborator
|
|
Contributor
Author
|
@sharpener6, the worker names aren't actually used for anything other than computing the # of workers. I'm not sure if that was intended, but this PR doesn't lose any behavior: https://github.com/search?q=repo%3Afinos%2Fopengris-scaler+worker_names+path%3A%2F%5Esrc%5C%2Fscaler%5C%2Fcluster%5C%2Fcluster.py%2F&type=code |
Entry points were renamed from worker_adapter_* to worker_manager_*, and run_cluster.py now imports from the cluster entry point directly.
…cy/scaler into remove-cluster-class
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Clusterclass (multiprocessing.ProcesswrappingFixedNativeWorkerAdapter) and its associatedClusterConfig, eliminating an unnecessary intermediate subprocess layerSchedulerClusterCombo, tests, and examples updated to useFixedNativeWorkerAdapterdirectly; workers are now direct children of the combo process instead of grandchildrenscaler_clusterentry point andrun_cluster.pypreserved for backwards compatibility, redirecting to the fixed native worker adapter;--num-of-workersaccepted as an alias for--max-workersscaler_clusterwith corrected parameters (--max-workersreplaces--num-of-workers/--worker-names; worker IDs are no longer pre-announced since workers self-assign UUIDs)ClusterProcess/Worker[N]startup log output from quickstart, updated--num-of-workersreferences to canonical--max-workers, and corrected the Fixed Native adapter descriptionBreaking Changes
--worker-namesCLI flag dropped: Thescaler_clusterentry point no longer accepts--worker-names. Passing it will now result in an unrecognized argument error. This flag was previously accepted but had no effect on actual worker naming (workers were identified by generated IDs regardless). Users relying on it should simply remove it from their invocations.ClusterandClusterConfigremoved from public API:from scaler import Clusterandfrom scaler.config.section.cluster import ClusterConfigwill no longer work. UseFixedNativeWorkerAdapterandFixedNativeWorkerAdapterConfigdirectly.ECS adapter:
worker_idsremovalThe old ECS adapter pre-announced worker IDs in the
StartWorkerGroupresponse by generating random names (ECS|{uuid}) and computing IDs from them. However, ECS workers self-assign their own UUIDs on connect and are never told what names to use, so the pre-announced IDs never matched any real worker in the scheduler'sinformation_snapshot. As a result, the scheduler's load-based group selection on shutdown (which sumsqueued_tasksacross the group's worker IDs) always saw 0 tasks per group, making group selection arbitrary regardless.Sending
worker_ids=[]is functionally equivalent for all three scaling policies (vanilla,fixed_elastic,capability_scaling). ForCapabilityScalingControllerit is slightly more correct:workers_in_group = 0accurately reflects that no phantom workers are subtracted fromremaining_worker_countwhen evaluating shutdown safety guards. The underlying limitation — that the ECS adapter cannot do per-group load-informed shutdown selection — is inherent to the architecture and pre-dates this PR.Fixed native vs native
The fixed native (FN) and native adapters are quite similar, except that the FN adapter always spawns a fixed number of workers and does not support any API for scaling. I think that this makes the FN adapter incompatible with scaling controllers (managers?), and it also means that the FN entrypoint needs extra signal handling logic.
We could refactor the FN adapter to be more similar to the native adapter, however we have already discussed merging the two into one, and so I will refrain from this knowing that the FN adapter is likely to be replaced soon