
Commit ffd73b5

Merge branch 'awslabs:main' into main
2 parents: fa45d72 + f6a9f0d

File tree

227 files changed (+17138, -44 lines)


website/docs/blueprints/gateways/envoy-gateway.md

Lines changed: 1 addition & 1 deletion

````diff
@@ -1,7 +1,7 @@
 ---
 sidebar_label: Envoy Gateway implementation on EKS
 ---
-import CollapsibleContent from '../../../src/components/CollapsibleContent';
+import CollapsibleContent from '@site/src/components/CollapsibleContent';

 # Envoy gateway

````

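This same one-line change repeats across several docs below. Relative imports such as `../../../src/components/CollapsibleContent` encode the importing file's directory depth and break when a doc is moved, while Docusaurus resolves the built-in `@site` alias to the site root from any file. A minimal sketch of the resulting pattern (the frontmatter is illustrative, not from any specific file):

```mdx
---
sidebar_label: Example page
---
{/* '@site' resolves to <site-root>, so this works at any nesting depth */}
import CollapsibleContent from '@site/src/components/CollapsibleContent';
```
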
website/docs/blueprints/inference/framework-guides/Neuron/llama3-inf2.md

Lines changed: 4 additions & 4 deletions

````diff
@@ -68,7 +68,7 @@ In this section, we will delve into the architecture of our solution, which comb

 ## Deploying the Solution

-To get started with deploying `Llama-4-8b-instruct` on [Amazon EKS](https://aws.amazon.com/eks/), we will cover the necessary prerequisites and guide you through the deployment process step by step.
+To get started with deploying `Llama-3-8B-Instruct` on [Amazon EKS](https://aws.amazon.com/eks/), we will cover the necessary prerequisites and guide you through the deployment process step by step.

 This includes setting up the infrastructure, deploying the **Ray cluster**, and creating the [Gradio](https://www.gradio.app/) WebUI app.

@@ -154,7 +154,7 @@ To deploy the llama3-8B-Instruct model, it's essential to configure your Hugging


 ```bash
-# set the Hugging Face Hub Token as an environment variable. This variable will be substituted when applying the ray-service-mistral.yaml file
+# set the Hugging Face Hub Token as an environment variable. This variable will be substituted when applying the ray-service-llama3.yaml file

 export HUGGING_FACE_HUB_TOKEN=<Your-Hugging-Face-Hub-Token-Value>
````

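The hunk above only corrects a manifest name in a comment; the substitution it describes is typically done by piping the manifest through a templating step before `kubectl apply`. A sketch under the assumption that the manifest references the variable as `$HUGGING_FACE_HUB_TOKEN` (`envsubst` from gettext is the usual tool; the `sed` line is a dependency-free stand-in shown on a one-line sample):

```shell
# Placeholder value for illustration only; use your real Hugging Face token.
export HUGGING_FACE_HUB_TOKEN=hf_example_token

# Typical flow (assumes envsubst is installed):
#   envsubst < ray-service-llama3.yaml | kubectl apply -f -

# Dependency-free equivalent of the substitution step:
printf 'token: $HUGGING_FACE_HUB_TOKEN\n' \
  | sed "s|\$HUGGING_FACE_HUB_TOKEN|$HUGGING_FACE_HUB_TOKEN|"
# prints: token: hf_example_token
```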
````diff
@@ -231,13 +231,13 @@ The Gradio app interacts with the locally exposed service created solely for the
 First, execute a port forward to the Llama-3 Ray Service using kubectl:

 ```bash
-kubectl port-forward svc/llama2-service 8000:8000 -n llama3
+kubectl port-forward svc/llama3 8000:8000 -n llama3
 ```

 ## Deploying the Gradio WebUI App
 Discover how to create a user-friendly chat interface using [Gradio](https://www.gradio.app/) that integrates seamlessly with deployed models.

-Let's move forward with setting up the Gradio app as a Docker container running on localhost. This setup will enable interaction with the Stable Diffusion XL model, which is deployed using RayServe.
+Let's move forward with setting up the Gradio app as a Docker container running on localhost. This setup will enable interaction with the Llama-3-8B Instruct model, which is deployed using RayServe.

 ### Build the Gradio app docker container
````

website/docs/blueprints/training/GPUs/bionemo.md

Lines changed: 1 addition & 1 deletion

````diff
@@ -2,7 +2,7 @@
 sidebar_position: 1
 sidebar_label: BioNeMo on EKS
 ---
-import CollapsibleContent from '../../../../src/components/CollapsibleContent';
+import CollapsibleContent from '@site/src/components/CollapsibleContent';

 # BioNeMo on EKS
````

website/docs/blueprints/training/GPUs/slinky-slurm.md

Lines changed: 1 addition & 1 deletion

````diff
@@ -2,7 +2,7 @@
 sidebar_label: Slurm on EKS
 ---

-import CollapsibleContent from '../../../../src/components/CollapsibleContent';
+import CollapsibleContent from '@site/src/components/CollapsibleContent';

 # Slurm on EKS
````

website/docs/blueprints/training/Neuron/Llama-LoRA-Finetuning.md

Lines changed: 1 addition & 1 deletion

````diff
@@ -1,7 +1,7 @@
 ---
 sidebar_label: Llama 3 Fine-tuning with LoRA
 ---
-import CollapsibleContent from '../../../../src/components/CollapsibleContent';
+import CollapsibleContent from '@site/src/components/CollapsibleContent';

 :::warning
 To deploy this example for fine-tuning a LLM on EKS, you need access to AWS Trainium ec2 instance. If deployment fails, check if you have access to this instance type. If nodes aren't starting, check Karpenter or Node group logs.
````

website/docs/blueprints/training/Neuron/Llama2.md

Lines changed: 1 addition & 1 deletion

````diff
@@ -3,7 +3,7 @@ title: Llama-2 with Nemo-Megatron on Trn1
 sidebar_position: 2
 description: Training a Llama-2 Model using Trainium, Neuronx-Nemo-Megatron and MPI operator
 ---
-import CollapsibleContent from '../../../../src/components/CollapsibleContent';
+import CollapsibleContent from '@site/src/components/CollapsibleContent';

 :::warning
 Deployment of ML models on EKS requires access to GPUs or Neuron instances. If your deployment isn't working, it’s often due to missing access to these resources. Also, some deployment patterns rely on Karpenter autoscaling and static node groups; if nodes aren't initializing, check the logs for Karpenter or Node groups to resolve the issue.
````

website/docs/blueprints/training/Neuron/RayTrain-Llama2.md

Lines changed: 1 addition & 1 deletion

````diff
@@ -2,7 +2,7 @@
 sidebar_position: 1
 sidebar_label: Llama-2 with RayTrain on Trn1
 ---
-import CollapsibleContent from '../../../../src/components/CollapsibleContent';
+import CollapsibleContent from '@site/src/components/CollapsibleContent';

 :::warning
 Deployment of ML models on EKS requires access to GPUs or Neuron instances. If your deployment isn't working, it’s often due to missing access to these resources. Also, some deployment patterns rely on Karpenter autoscaling and static node groups; if nodes aren't initializing, check the logs for Karpenter or Node groups to resolve the issue.
````

website/docs/guidance/dynamic-resource-allocation.md

Lines changed: 8 additions & 8 deletions

````diff
@@ -845,14 +845,14 @@ Standard GPU allocation without sharing - each workload gets exclusive access to
 <TabItem value="template" label="ResourceClaimTemplate">

 <CodeBlock language="yaml" title="basic-gpu-claim-template.yaml" showLineNumbers>
-{require('!!raw-loader!../../../infra/jark-stack/examples/k8s-dra/basic/basic-gpu-claim-template.yaml').default}
+{require('!!raw-loader!@site/../infra/jark-stack/examples/k8s-dra/basic/basic-gpu-claim-template.yaml').default}
 </CodeBlock>

 </TabItem>
 <TabItem value="pod" label="Basic Pod">

 <CodeBlock language="yaml" title="basic-gpu-pod.yaml" showLineNumbers>
-{require('!!raw-loader!../../../infra/jark-stack/examples/k8s-dra/basic/basic-gpu-pod.yaml').default}
+{require('!!raw-loader!@site/../infra/jark-stack/examples/k8s-dra/basic/basic-gpu-pod.yaml').default}
 </CodeBlock>

 </TabItem>
@@ -896,14 +896,14 @@ Time-slicing is a GPU sharing mechanism where multiple workloads take turns usin
 <TabItem value="template" label="ResourceClaimTemplate">

 <CodeBlock language="yaml" title="timeslicing-claim-template.yaml" showLineNumbers>
-{require('!!raw-loader!../../../infra/jark-stack/examples/k8s-dra/timeslicing/timeslicing-claim-template.yaml').default}
+{require('!!raw-loader!@site/../infra/jark-stack/examples/k8s-dra/timeslicing/timeslicing-claim-template.yaml').default}
 </CodeBlock>

 </TabItem>
 <TabItem value="pod" label="Pod Configuration">

 <CodeBlock language="yaml" title="timeslicing-pod.yaml" showLineNumbers>
-{require('!!raw-loader!../../../infra/jark-stack/examples/k8s-dra/timeslicing/timeslicing-pod.yaml').default}
+{require('!!raw-loader!@site/../infra/jark-stack/examples/k8s-dra/timeslicing/timeslicing-pod.yaml').default}
 </CodeBlock>

 </TabItem>
@@ -952,14 +952,14 @@ NVIDIA Multi-Process Service (MPS) is a GPU sharing technology that allows multi
 <TabItem value="template" label="ResourceClaimTemplate">

 <CodeBlock language="yaml" title="mps-claim-template.yaml" showLineNumbers>
-{require('!!raw-loader!../../../infra/jark-stack/examples/k8s-dra/mps/mps-claim-template.yaml').default}
+{require('!!raw-loader!@site/../infra/jark-stack/examples/k8s-dra/mps/mps-claim-template.yaml').default}
 </CodeBlock>

 </TabItem>
 <TabItem value="pod" label="Multi-Container Pod">

 <CodeBlock language="yaml" title="mps-pod.yaml" showLineNumbers>
-{require('!!raw-loader!../../../infra/jark-stack/examples/k8s-dra/mps/mps-pod.yaml').default}
+{require('!!raw-loader!@site/../infra/jark-stack/examples/k8s-dra/mps/mps-pod.yaml').default}
 </CodeBlock>

 </TabItem>
@@ -1008,14 +1008,14 @@ Multi-Instance GPU (MIG) is a hardware-level GPU partitioning technology availab
 <TabItem value="template" label="ResourceClaimTemplate">

 <CodeBlock language="yaml" title="mig-claim-template.yaml" showLineNumbers>
-{require('!!raw-loader!../../../infra/jark-stack/examples/k8s-dra/mig/mig-claim-template.yaml').default}
+{require('!!raw-loader!@site/../infra/jark-stack/examples/k8s-dra/mig/mig-claim-template.yaml').default}
 </CodeBlock>

 </TabItem>
 <TabItem value="pod" label="MIG Pod">

 <CodeBlock language="yaml" title="mig-pod.yaml" showLineNumbers>
-{require('!!raw-loader!../../../infra/jark-stack/examples/k8s-dra/mig/mig-pod.yaml').default}
+{require('!!raw-loader!@site/../infra/jark-stack/examples/k8s-dra/mig/mig-pod.yaml').default}
 </CodeBlock>

 </TabItem>
````

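The four tabs above all load `ResourceClaimTemplate` manifests from the repo. For orientation only, this is the rough shape such a template takes under the Kubernetes DRA API; the name, API version, and device class here are illustrative assumptions, not the blueprint's actual files:

```yaml
# Illustrative sketch: a minimal DRA claim template requesting one device
# from an NVIDIA device class. See the repo files referenced above for the
# real manifests.
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaimTemplate
metadata:
  name: single-gpu
spec:
  spec:
    devices:
      requests:
        - name: gpu
          deviceClassName: gpu.nvidia.com
```
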
website/docs/infra/index.md

Lines changed: 1 addition & 1 deletion

````diff
@@ -5,7 +5,7 @@ sidebar_label: Introduction

 # Introduction

-The AIoEKS foundational infrastructure lives in the `infra/base` directory. This directory contains the base
+The AI on EKS foundational infrastructure lives in the `infra/base` directory. This directory contains the base
 infrastructure and all its modules that allow composing an environment that supports experimentation, AI/ML training,
 LLM inference, model tracking, and more.
````

website/docs/infra/inference/aibrix.md

Lines changed: 1 addition & 1 deletion

````diff
@@ -1,7 +1,7 @@
 ---
 sidebar_label: AIBrix on EKS
 ---
-import CollapsibleContent from '../../../src/components/CollapsibleContent';
+import CollapsibleContent from '@site/src/components/CollapsibleContent';

 # AIBrix on EKS
````

0 commit comments
