Commit 421e946

docs(ai): trim external container docs
This commit makes the external container docs more concise.
1 parent c8166a7 commit 421e946

1 file changed

+45
-34
lines changed

ai/orchestrators/models-config.mdx

@@ -97,16 +97,17 @@ currently **recommended** models and their respective prices.
9797
Optional flags to enhance performance (details below).
9898
</ParamField>
9999
<ParamField path="url" type="string" optional="true">
100-
Optional URL and port where the model container or custom container manager software is running.
101-
[See External Containers](#external-containers)
100+
Optional URL and port where the model container or custom container manager
101+
software is running. [See External Containers](#external-containers)
102102
</ParamField>
103103
<ParamField path="token" type="string">
104-
Optional token required to interact with the model container or custom container manager software.
105-
[See External Containers](#external-containers)
104+
Optional token required to interact with the model container or custom
105+
container manager software. [See External Containers](#external-containers)
106106
</ParamField>
107107
<ParamField path="capacity" type="integer">
108-
Optional capacity of the model. This is the number of inference tasks the model can handle at the same time. This defaults to 1.
109-
[See External Containers](#external-containers)
108+
Optional capacity of the model. This is the number of inference tasks the
109+
model can handle at the same time. This defaults to 1. [See External
110+
Containers](#external-containers)
110111
</ParamField>
111112

112113
### Optimization Flags
@@ -153,33 +154,43 @@ are available:
153154
### External Containers
154155

155156
<Warning>
156-
This feature is intended for advanced users. Incorrect setup can lead to a
157-
lower orchestrator score and reduced fees. If external containers are used,
158-
it is the Orchestrator's responsibility to ensure the correct container with
159-
the correct endpoints is running behind the specified `url`.
157+
This feature is intended for **advanced** users. Misconfiguration can reduce
158+
orchestrator scores and earnings. Orchestrators are responsible for ensuring
159+
the specified `url` points to a properly configured and operational container
160+
with the correct endpoints.
160161
</Warning>
161162

162-
External containers can be for one model to stack on top of managed model containers,
163-
an auto-scaling GPU cluster behind a load balancer or anything in between. Orchestrators
164-
can use external containers to extend the models served or fully replace the AI Worker managed model containers
165-
using the [Docker client Go library](https://pkg.go.dev/github.com/docker/docker/client)
166-
to start and stop containers specified at startup of the AI Worker.
167-
168-
External containers can be used by specifying the `url`, `capacity` and `token` fields in the
169-
model configuration. The only requirement is that the `url` specified responds as expected to the AI Worker same
170-
as the managed containers would respond (including http error codes). As long as the container management software
171-
acts as a pass through to the model container you can use any container management software to implement the custom
172-
management of the runner containers including [Kubernetes](https://kubernetes.io/), [Podman](https://podman.io/),
173-
[Docker Swarm](https://docs.docker.com/engine/swarm/), [Nomad](https://www.nomadproject.io/), or custom scripts to
174-
manage container lifecycles based on request volume
175-
176-
177-
- The `url` set will be used to confirm a model container is running at startup of the AI Worker using the `/health` endpoint.
178-
Inference requests will be forwarded to the `url` same as they are to the managed containers after startup.
179-
- The `capacity` should be set to the maximum amount of requests that can be processed concurrently for the pipeline/model id (default is 1).
180-
If auto scaling containers, take care that the startup time is fast if setting `warm: true` because slow response time will
181-
negatively impact your selection by Gateways for future requests.
182-
- The `token` field is used to secure the model container `url` from unauthorized access and is strongly
183-
suggested to use if the containers are exposed to external networks.
184-
185-
We welcome feedback to improve this feature, so please reach out to us if you have suggestions to enable better experience running external containers.
163+
The
164+
[AI Worker](/ai/orchestrators/start-orchestrator#orchestrator-node-architecture)
165+
typically manages model containers automatically using a
166+
[Docker client](https://pkg.go.dev/github.com/docker/docker/client) to start and
167+
stop containers at startup. However, orchestrators with unique infrastructure
168+
needs can use external containers to extend or replace managed containers. These
169+
setups can range from individual models to more complex configurations, such as
170+
an auto-scaling GPU cluster behind a load balancer.
171+
172+
To configure external containers, include the `url`, `capacity`, and optionally
173+
the `token` fields in the model configuration.
174+
175+
- The `url` is used to confirm that the model container is running during AI
176+
Worker startup via the `/health` endpoint. After validation, inference
177+
requests are forwarded to the `url` for processing, just like with managed
178+
containers.
179+
- The `capacity` determines the maximum number of concurrent requests the
180+
container can handle, with a default value of 1. For auto-scaling setups, it
181+
is essential to ensure containers start quickly when setting `warm: true`
182+
because slow startups can negatively impact Gateway selection for future
183+
requests.
184+
- The `token` is an optional field used to secure the `url`. It is strongly
185+
recommended for protecting endpoints exposed to external networks from
186+
unauthorized access.
187+
188+
As long as the custom container management logic acts as a pass-through to the
189+
model container, orchestrators can use container management software like
190+
[Kubernetes](https://kubernetes.io/), [Podman](https://podman.io/),
191+
[Docker Swarm](https://docs.docker.com/engine/swarm/),
192+
[Nomad](https://www.nomadproject.io/), or custom scripts designed to manage
193+
container lifecycles based on request volume.
194+
195+
We welcome feedback to improve this feature, so please reach out to us if you
196+
have suggestions for a better experience running external containers.
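
For readers reviewing this change, the `url`, `capacity`, and `token` semantics described in the new text can be sketched as a small pre-flight check. This is a minimal illustration, not the AI Worker's implementation: the helper names, the example host, and the `pipeline`/`model_id` values are hypothetical, the requirement that `url` uses `http` or `https` is an assumption, and sending the `token` as a bearer `Authorization` header is likewise an assumption that the docs in this diff do not confirm.

```python
from urllib.parse import urlparse

DEFAULT_CAPACITY = 1  # the docs state that capacity defaults to 1


def normalize_external_model(entry: dict) -> dict:
    """Return a copy of a model config entry with external-container defaults.

    Hypothetical helper for illustration; not part of the AI Worker.
    """
    url = entry.get("url")
    if not url:
        raise ValueError("external containers require a `url`")
    parsed = urlparse(url)
    # ASSUMPTION: the endpoint is reached over http or https.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"`url` must be an http(s) URL with a host: {url!r}")

    capacity = entry.get("capacity", DEFAULT_CAPACITY)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(capacity, bool) or not isinstance(capacity, int) or capacity < 1:
        raise ValueError("`capacity` must be a positive integer")

    normalized = dict(entry)
    normalized["url"] = url.rstrip("/")
    normalized["capacity"] = capacity
    return normalized


def health_url(entry: dict) -> str:
    """Build the `/health` URL that the docs say is checked at startup."""
    return normalize_external_model(entry)["url"] + "/health"


def request_headers(entry: dict) -> dict:
    """Headers for a manual probe of the endpoint.

    ASSUMPTION: the token is sent as a bearer token.
    """
    token = entry.get("token")
    return {"Authorization": f"Bearer {token}"} if token else {}


# Hypothetical entry: only `url`, `token`, `capacity`, and `warm` appear in the
# docs under review; the other keys and all values are placeholders.
example = {
    "pipeline": "text-to-image",
    "model_id": "example/model",
    "warm": True,
    "url": "http://gpu-cluster.internal:8000/",
    "token": "example-token",
}
```

Running `health_url(example)` before starting the AI Worker gives the address to probe by hand (for instance with `curl`), which helps catch a misconfigured `url` before it affects Gateway selection.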
