
Commit ace22c3

docs(ai): add remote AI worker instructions (#678)
This commit adds documentation about running a remote AI worker and updates the orchestrator `get-started` page accordingly.

Co-authored-by: Rick Staa <[email protected]>
1 parent 5cf8673 commit ace22c3

File tree

4 files changed: +253 −56 lines changed

ai/orchestrators/ai-worker.mdx

+140
@@ -0,0 +1,140 @@
---
title: Attach Remote AI Workers
---

## Introduction

The **AI Worker** is a crucial component of the Livepeer AI network, responsible
for performing AI inference tasks. It can be run as a separate process on
compute machines distinct from the Orchestrator or combined with the
Orchestrator on the same machine.

## Key Setup Considerations

- **Startup Configuration**: If you decide to use separate AI Workers, this
  **must** be selected at the Orchestrator's startup. A combined Orchestrator
  **cannot** simultaneously support remote AI Workers.
- **Shared Configuration File**: Both the Orchestrator and AI Workers use the
  `aiModels.json` file (see
  [Configuring AI Models](/ai/orchestrators/models-config)).
  - The **Orchestrator** uses `aiModels.json` to set model pricing.
  - The **AI Worker** uses it to manage the runner containers for each model.

## Remote AI Worker Setup

<Warning>
  When using experimental external runner containers, ensure they connect to
  the AI Worker and not directly to the Orchestrator.
</Warning>

In a split configuration, the Orchestrator manages multiple AI Workers and
allocates tasks based on the connected workers' capacity. Worker **capacity** is
determined by the following formula:

```
runner container count per pipeline/model_id = capacity per pipeline/model_id
```

The **Orchestrator's capacity** is the sum of the capacities of all connected AI
Workers. This setup enables flexible scaling of compute resources by adding or
removing AI Workers as needed.
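As a concrete illustration (hypothetical worker counts, using the model from the
configuration examples further down this page):

```
worker A: 2 runner containers for text-to-image/SG161222/RealVisXL_V4.0_Lightning → capacity 2
worker B: 2 runner containers for the same pipeline/model_id                      → capacity 2
orchestrator capacity for that pipeline/model_id = 2 + 2 = 4
```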
### Launch Commands for Remote AI Worker

Below are the launch commands for both the Orchestrator and AI Worker nodes.

<Info>
  For the full Orchestrator launch command, see [Start Your AI
  Orchestrator](/ai/orchestrators/start-orchestrator).
</Info>

<Accordion title="Show Launch Commands">

#### Orchestrator Command

```bash
docker run \
    --name livepeer_ai_orchestrator \
    ...
    -orchestrator \
    ...
    -aiModels /root/.lpData/aiModels.json
```

#### AI Worker Command

```bash
docker run \
    --name livepeer_ai_worker \
    -aiWorker \
    -orchAddr [Orchestrator service address] \
    -orchSecret [Orchestrator secret key] \
    -nvidia "all" \
    -v 6 \
    -aiModels /root/.lpData/aiModels.json \
    -aiModelsDir /root/.lpData/models \
    -testTranscoder=false \
    -aiRunnerImage livepeer/ai-runner:latest # Optional
```
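One way to confirm the worker registered with the Orchestrator is to watch the
Orchestrator's logs after the worker starts — a quick check, assuming the
container name from the command above; the exact registration message varies by
go-livepeer version:

```bash
# Follow the Orchestrator's logs; a line acknowledging the remote
# AI Worker connection should appear shortly after the worker starts.
docker logs -f livepeer_ai_orchestrator
```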
<Info>
  **Pricing**: Prices in this example may vary and should be adjusted based on
  market research and the operational costs of providing compute.
</Info>

</Accordion>

### Configuration Files (`aiModels.json`)

The `aiModels.json` file configures AI model parameters separately for the
Orchestrator and the AI Worker, with each configuration tailored to the specific
needs of that node.

<Info>
  For detailed guidance on configuring `aiModels.json` with advanced model
  settings, see [Configuring AI Models](/ai/orchestrators/models-config).
</Info>

<Accordion title="Show Configuration Examples">

#### Orchestrator Configuration

```json
[
  {
    "pipeline": "text-to-image",
    "model_id": "SG161222/RealVisXL_V4.0_Lightning",
    "price_per_unit": 4768371,
    "pixels_per_unit": 1
  }
]
```
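For a rough sense of what this configuration charges (an illustration, assuming
`price_per_unit` is denominated in wei, as with Mainnet Transcoding Network
pricing, and that `pixels_per_unit: 1` makes the price apply per output pixel):
a single 1024×1024 image works out to 1,048,576 pixels × 4,768,371 wei ≈
5 × 10^12 wei, i.e. roughly 0.000005 ETH per image.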
#### AI Worker Configuration

```json
[
  {
    "pipeline": "text-to-image",
    "model_id": "SG161222/RealVisXL_V4.0_Lightning",
    "warm": true,
    "optimization_flags": {
      "SFAST": true
    }
  }
]
```

</Accordion>
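Since both nodes read `aiModels.json` at startup, it can save a restart cycle to
confirm the file parses before launching — a minimal check, assuming `jq` is
installed and the file sits at the host path mounted into the containers (adjust
the path if yours differs):

```bash
# Parse aiModels.json: jq pretty-prints the document on success and
# reports the exact location of any syntax error on failure.
jq . ~/.lpData/aiModels.json
```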
## Verifying Remote AI Worker Operation

After starting your **remote AI Worker** node, you can verify it is operational
by following the same inference test instructions used for the Orchestrator, as
described in the
[Verify Combined AI Orchestrator Operation](/ai/orchestrators/start-orchestrator#verify-combined-ai-orchestrator-operation)
section.

<Note>
  When accessing the AI Runner from a separate machine, replace `localhost` with
  the **Worker Node's IP address** in the inference test instructions.
</Note>
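For example, a direct request to a runner on the worker machine would look like
the following sketch, where `<WORKER_NODE_IP>` is a placeholder for your
worker's address and port `8000` assumes the default `text-to-image` runner
mapping:

```bash
# Send a test inference request to the runner on the remote worker;
# substitute the worker machine's IP address for the placeholder.
curl -X POST "http://<WORKER_NODE_IP>:8000/text-to-image" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "A cool cat on the beach."}'
```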

ai/orchestrators/models-config.mdx

+13 −3
@@ -2,14 +2,24 @@
 title: Configuring AI Models
 ---
 
-Before deploying your AI Orchestrator node on the Livepeer AI network, you must choose the AI models you want to serve for AI inference tasks. This guide assists in configuring these models. The following page, [Download AI Models](/ai/orchestrators/models-download), provides instructions for their download. For details on supported pipelines and models, refer to [Pipelines](/ai/pipelines/text-to-image).
+Before deploying your AI Orchestrator node on the Livepeer AI network, you must
+choose the AI models you want to serve for AI inference tasks. This guide
+assists in configuring these models. The following page,
+[Download AI Models](/ai/orchestrators/models-download), provides instructions
+for their download. For details on supported pipelines and models, refer to
+[Pipelines](/ai/pipelines/text-to-image).
 
 ## Configuration File Format
 
 Orchestrators specify supported AI models in an `aiModels.json` file, typically
 located in the `~/.lpData` directory. Below is an example configuration showing
 currently **recommended** models and their respective prices.
 
+<Info>
+  Pricing used in this example is subject to change and should be set
+  competitively based on market research and costs to provide the compute.
+</Info>
+
 ```json
 [
   {
@@ -94,8 +104,8 @@ currently **recommended** models and their respective prices.
 
 <Note>At this time, these flags are only compatible with **warm** models.</Note>
 
-The Livepeer AI pipelines offer a suite of optimization flags. These
-are designed to enhance the performance of **warm** models by either increasing
+The Livepeer AI pipelines offer a suite of optimization flags. These are
+designed to enhance the performance of **warm** models by either increasing
 **inference speed** or reducing **VRAM** usage. Currently, the following flags
 are available:
ai/orchestrators/start-orchestrator.mdx

+99 −53
@@ -3,10 +3,9 @@ title: Start your AI Orchestrator
 ---
 
 <Warning>
-  The Livepeer AI network is currently in its **Beta** stage and is
-  undergoing active development. Running it on the same machine as your main
-  Orchestrator or Gateway node may cause stability issues. Please proceed with
-  caution.
+  The Livepeer AI network is currently in its **Beta** stage and is undergoing
+  active development. Running it on the same machine as your main Orchestrator
+  or Gateway node may cause stability issues. Please proceed with caution.
 </Warning>
 
 The Livepeer AI network is not yet integrated into the main
@@ -22,9 +21,28 @@ are two methods to run the Livepeer AI network software:
 - **Pre-built Binaries**: Pre-built binaries are available if you prefer not to
   use Docker.
 
-## Start the AI Orchestrator
+## Orchestrator Node Architecture
 
-Please follow the steps below to start your Livepeer AI Orchestrator node:
+In the Livepeer AI network, orchestrator operations rely on two primary **node
+types**:
+
+- **Orchestrator**: Manages and routes incoming jobs to available compute
+  resources.
+- **Worker**: Performs the actual computation tasks.
+
+The simplest configuration combines both roles on a single machine, utilizing
+the machine's GPUs for AI inference tasks, where the orchestrator also functions
+as a worker (known as a **combined AI orchestrator**). In this setup, capacity
+is limited by the available GPUs and is set as
+`runner container count per pipeline/model_id = capacity per pipeline/model_id`.
+For expanded scalability, operators can deploy dedicated (remote) worker nodes
+that connect to the orchestrator, increasing overall compute capacity.
+Instructions for setting up remote workers are available on the
+[next page](/ai/orchestrators/ai-worker).
+
+## Start a Combined AI Orchestrator
+
+Please follow the steps below to start your **combined AI orchestrator** node.
 
 <Tabs>
   <Tab title="Use Docker (Recommended)">
@@ -49,7 +67,6 @@ Please follow the steps below to start your Livepeer AI Orchestrator node:
     ```bash
     docker pull livepeer/ai-runner:segment-anything-2
     ```
-
   </Step>
   <Step title="Verify the AI Models are Available">
     The Livepeer AI network leverages pre-trained AI models for inference tasks. Before launching the AI Orchestrator node, verify that the weights of these models are accessible on your machine. For more information, visit the [Download AI Models](/ai/orchestrators/models-download) page.
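Before moving on, you can double-check that the pull in the step above succeeded
by listing the locally available runner images (assuming the Docker CLI is on
your PATH):

```bash
# List local ai-runner images; the segment-anything-2 tag should
# appear in the output if the pull completed.
docker images livepeer/ai-runner
```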
@@ -86,7 +103,8 @@ Please follow the steps below to start your Livepeer AI Orchestrator node:
     -nvidia "all" \
     -aiWorker \
     -aiModels /root/.lpData/aiModels.json \
-    -aiModelsDir ~/.lpData/models
+    -aiModelsDir ~/.lpData/models \
+    -aiRunnerImage livepeer/ai-runner:latest # OPTIONAL
     ```
 
     This command launches an **off-chain** AI Orchestrator node. While most of the commands are similar to those used when operating a Mainnet Transcoding Network Orchestrator node (explained in the [go-livepeer CLI reference](/references/go-livepeer/cli-reference)), there are a few **Livepeer AI** specific flags:
@@ -180,12 +198,11 @@ Please follow the steps below to start your Livepeer AI Orchestrator node:
    -nvidia "all" \
    -aiWorker \
    -aiModels ~/.lpData/aiModels.json \
-   -aiModelsDir ~/.lpData/models
+   -aiModelsDir ~/.lpData/models \
+   -aiRunnerImage livepeer/ai-runner:latest # OPTIONAL
    ```
 
-This command launches an **off-chain** AI Orchestrator node. While most of the commands are similar to those used when operating a Mainnet Transcoding Network Orchestrator node (explained in the [go-livepeer CLI reference](/references/go-livepeer/cli-reference)), there
-
-are a few **Livepeer AI** specific flags:
+This command launches an **off-chain** AI Orchestrator node. While most of the commands are similar to those used when operating a Mainnet Transcoding Network Orchestrator node (explained in the [go-livepeer CLI reference](/references/go-livepeer/cli-reference)), there are a few **Livepeer AI** specific flags:
 
 - `-aiWorker`: This flag enables the AI Worker functionality.
 - `-aiModels`: This flag sets the path to the JSON file that contains the AI models.
@@ -214,48 +231,77 @@ are a few **Livepeer AI** specific flags:
 
 </Tabs>
 
-## Confirm the AI Orchestrator is Operational
-
-Once the Livepeer AI Orchestrator node is up and running, validate its
-operation by sending an AI inference request directly to the
-[ai-runner](https://hub.docker.com/r/livepeer/ai-runner) container. The most
-straightforward way to do this is through the
-[Swagger UI](https://fastapi.tiangolo.com/features/) interface, accessible at
-`http://localhost:8000/docs` if you have loaded the `text-to-image` pipeline.
-Note that other pipelines will have different endpoints.
-
-![Swagger UI interface](/images/ai/swagger_ui.png)
-
-<Steps>
-  <Step title="Access the Swagger UI">
-    Navigate to `http://localhost:8000/docs` in your web browser to open the Swagger UI interface.
-  </Step>
-  <Step title="Initiate an Inference Request">
-    Initiate an inference request to the `POST /text-to-image` endpoint by clicking the `Try it out` button. Use the following example JSON payload:
-
-    ```json
-    {
-      "prompt": "A cool cat on the beach."
-    }
-    ```
-
-    This request will instruct the AI model to generate an image based on the text in the `prompt` field.
-  </Step>
-  <Step title="Inspect the Inference Response">
-    If the AI Orchestrator node is functioning correctly, you should receive a response similar to the following:
-
-    ```json
-    {
-      "images": [
-        {
-          "url": "data:image/png;base64,iVBORw0KGgoAA...",
-          "seed": 2724904334
-        }
-      ]
-    }
-    ```
-
-    The `url` field contains the base64 encoded image generated by the AI model. To convert this image to a png, use a base64 decoder such as [Base64.guru](https://base64.guru/converter/decode/image/png).
-  </Step>
-
-</Steps>
+## Verify Combined AI Orchestrator Operation
+
+Once your **combined Livepeer AI Orchestrator** node is running, verify that the
+worker is operational by sending an AI inference request directly to the
+[ai-runner](https://hub.docker.com/r/livepeer/ai-runner) container. You can
+either use the [Swagger UI](https://fastapi.tiangolo.com/features/) interface or
+a `curl` command for this check.
+
+<Tabs>
+  <Tab title="Use Swagger UI">
+    <Steps>
+      <Step title="Access the Swagger UI">
+        Open your web browser and navigate to `http://localhost:8000/docs` to access the Swagger UI interface.
+      </Step>
+      <Step title="Initiate an Inference Request">
+        In the Swagger UI, locate the `POST /text-to-image` endpoint and click the `Try it out` button. Use the following example JSON payload:
+
+        ```json
+        {
+          "prompt": "A cool cat on the beach."
+        }
+        ```
+
+        This request will instruct the AI model to generate an image based on the text in the `prompt` field.
+      </Step>
+      <Step title="Inspect the Inference Response">
+        If the AI Orchestrator node is functioning correctly, you should receive a response similar to the following:
+
+        ```json
+        {
+          "images": [
+            {
+              "url": "data:image/png;base64,iVBORw0KGgoAA...",
+              "seed": 2724904334
+            }
+          ]
+        }
+        ```
+
+        The `url` field contains the base64 encoded image generated by the AI model. To convert this image to PNG, use a base64 decoder such as [Base64.guru](https://base64.guru/converter/decode/image/png).
+      </Step>
+    </Steps>
+  </Tab>
+  <Tab title="Use curl Command">
+    <Steps>
+      <Step title="Send an Inference Request with curl">
+        Alternatively, you can use the `curl` command to test the AI inference capabilities directly. Run the following command, replacing `localhost` with your worker node's IP address if the runner is on a separate machine:
+
+        ```bash
+        curl -X POST "http://localhost:8000/text-to-image" -H "Content-Type: application/json" -d '{"prompt": "A cool cat on the beach."}'
+        ```
+
+        This sends a POST request to the `text-to-image` endpoint with the specified JSON payload.
+      </Step>
+      <Step title="Inspect the Response">
+        If the AI Worker node is functioning correctly, you should receive a response similar to this:
+
+        ```json
+        {
+          "images": [
+            {
+              "url": "data:image/png;base64,iVBORw0KGgoAA...",
+              "seed": 2724904334
+            }
+          ]
+        }
+        ```
+
+        As with the Swagger UI response, the `url` field contains a base64 encoded image that can be decoded into PNG format using a tool like [Base64.guru](https://base64.guru/converter/decode/image/png).
+      </Step>
+    </Steps>
+  </Tab>
+
+</Tabs>
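To verify end to end from the command line, the `curl` check above can be
extended to decode the response straight to an image file — a sketch assuming
`jq` and a `base64` utility are installed (the decode flag spelling differs
between GNU and BSD/macOS `base64`):

```bash
# Request an image, pull the first result's data URL out of the JSON,
# strip the data-URI prefix, and decode the base64 payload to out.png.
curl -s -X POST "http://localhost:8000/text-to-image" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "A cool cat on the beach."}' \
  | jq -r '.images[0].url' \
  | sed 's|^data:image/png;base64,||' \
  | base64 --decode > out.png
```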

mint.json

+1
@@ -551,6 +551,7 @@
         "ai/orchestrators/models-config",
         "ai/orchestrators/models-download",
         "ai/orchestrators/start-orchestrator",
+        "ai/orchestrators/ai-worker",
         "ai/orchestrators/benchmarking",
         "ai/orchestrators/onchain"
       ]
