Description
We need an example that explains what the total count defined in the policy object means.
Does the following example help? Suppose that there is one LauncherPopulationPolicy object, containing one CountForLauncher. We restrict our attention to the LauncherConfig and Node mentioned there. Suppose that the CountForLauncher's LauncherCount is 2. Suppose that before any server-requesting Pods are created, the population policy controller creates the 2 launcher Pods. Next, suppose that some user creates a server-requesting Pod that refers to the same LauncherConfig and gets scheduled on the same Node. This does not call for the creation of another launcher Pod; the semantics in
llm-d-fast-model-actuation/api/fma/v1alpha1/launcherpopulationpolicy_types.go
Lines 38 to 40 in 886b603
// the number of launchers that should exist is the larger of
// (a) what `PopulationPolicy` says for that pair, and
// (b) the number needed to satisfy the server-requesting Pods.
uses a max of two numbers for the count of launchers, not two disjoint pools of launchers. The dual-pods controller will connect that server-requesting Pod with one of the already-existing launcher Pods. After that association is made and the inference server is launched, suppose that some time later the LauncherCount is reduced to 0. That does not call for the deletion of the launcher Pod, because the server-requesting Pod is still using it; that's part (b) of
llm-d-fast-model-actuation/api/fma/v1alpha1/launcherpopulationpolicy_types.go
Lines 38 to 40 in 886b603
// the number of launchers that should exist is the larger of
// (a) what `PopulationPolicy` says for that pair, and
// (b) the number needed to satisfy the server-requesting Pods.
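The scenario above can be traced with a small sketch of the "larger of (a) and (b)" rule from the quoted comment. This is only an illustration, not the controller's code: `desiredLaunchers` and its parameters are hypothetical names, and the number of launchers the server-requesting Pods need is taken as an input rather than computed, since how the controller derives it is not shown here.

```go
package main

import "fmt"

// desiredLaunchers is a hypothetical helper (not part of the FMA API).
// It applies the rule from the quoted comment for one
// (LauncherConfig, Node) pair: the larger of
// (a) the count from the population policy, and
// (b) the number needed to satisfy the server-requesting Pods.
func desiredLaunchers(policyCount, neededByRequesters int) int {
	if policyCount > neededByRequesters {
		return policyCount
	}
	return neededByRequesters
}

func main() {
	// Each step mirrors one stage of the scenario described above.
	steps := []struct {
		desc   string
		policy int // (a) LauncherCount from the policy
		needed int // (b) launchers needed by server-requesting Pods
	}{
		{"policy says 2, no server-requesting Pods yet", 2, 0},
		{"one server-requesting Pod arrives", 2, 1},
		{"LauncherCount reduced to 0, Pod still in use", 0, 1},
	}
	for _, s := range steps {
		fmt.Printf("%s: %d launcher(s)\n", s.desc, desiredLaunchers(s.policy, s.needed))
	}
}
```

The second step stays at 2 because the two numbers are combined with a max rather than added, and the third step stays at 1 because part (b) keeps the in-use launcher alive even when the policy count drops to 0.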
OK, let's continue the scenario. Suppose both launcher Pods are already associated with two existing server-requesting Pods, and a new server-requesting Pod is created. What should happen? Does this call for the creation of another launcher Pod, or not?
Originally posted by @osswangxining in #195 (comment)