Commit b83a63d: Wrote docs for inference_pool_gid (#2045)

1 parent 5065a63

File tree: 1 file changed (+7, −0 lines)

docs-gb/user-guide/parallel-inference.md

Lines changed: 7 additions & 0 deletions
```diff
@@ -77,6 +77,13 @@ The expected values are:
 - `0`, will disable the parallel inference feature.
   In other words, inference will happen within the main MLServer process.
 
+### `inference_pool_gid`
+
+The `inference_pool_gid` field of the `model-settings.json` file (or alternatively, the `MLSERVER_MODEL_INFERENCE_POOL_GID` global environment variable) allows models to be loaded on a dedicated inference pool, selected by group ID (GID), to prevent starvation behavior.
+
+Complementing `inference_pool_gid`, if the `autogenerate_inference_pool_gid` field of the `model-settings.json` file (or alternatively, the `MLSERVER_MODEL_AUTOGENERATE_INFERENCE_POOL_GID` global environment variable) is set to `True`, a UUID is automatically generated and a dedicated inference pool will load the given model. This option is useful when the user wants to load a single model on a dedicated inference pool without having to manage the GID themselves.
+
+
 ## References
 
 Jiale Zhi, Rui Wang, Jeff Clune, and Kenneth O. Stanley. Fiber: A Platform for Efficient Development and Distributed Training for Reinforcement Learning and Population-Based Methods. arXiv:2003.11164 [cs, stat], March 2020. [arXiv:2003.11164](https://arxiv.org/abs/2003.11164).
```
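To illustrate the field this commit documents, a minimal `model-settings.json` sketch might look like the following. Only `inference_pool_gid` comes from the documentation above; the `name` and `implementation` values are illustrative placeholders, and the exact GID value is arbitrary:

```json
{
  "name": "my-model",
  "implementation": "mlserver_sklearn.SKLearnModel",
  "inference_pool_gid": "dedicated-pool-1"
}
```

Alternatively, per the docs above, setting `autogenerate_inference_pool_gid` to a true value in place of an explicit GID lets MLServer generate a UUID and create the dedicated pool automatically, and the `MLSERVER_MODEL_INFERENCE_POOL_GID` environment variable can be used instead of the settings file.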
