**`website/docs/blueprints/inference/framework-guides/Neuron/llama3-inf2.md`** (4 additions, 4 deletions)

```diff
@@ -68,7 +68,7 @@ In this section, we will delve into the architecture of our solution, which comb

 ## Deploying the Solution

-To get started with deploying `Llama-4-8b-instruct` on [Amazon EKS](https://aws.amazon.com/eks/), we will cover the necessary prerequisites and guide you through the deployment process step by step.
+To get started with deploying `Llama-3-8B-Instruct` on [Amazon EKS](https://aws.amazon.com/eks/), we will cover the necessary prerequisites and guide you through the deployment process step by step.

 This includes setting up the infrastructure, deploying the **Ray cluster**, and creating the [Gradio](https://www.gradio.app/) WebUI app.
```

````diff
@@ -154,7 +154,7 @@ To deploy the llama3-8B-Instruct model, it's essential to configure your Hugging

 ```bash
-# set the Hugging Face Hub Token as an environment variable. This variable will be substituted when applying the ray-service-mistral.yaml file
+# set the Hugging Face Hub Token as an environment variable. This variable will be substituted when applying the ray-service-llama3.yaml file
````

```diff
 Discover how to create a user-friendly chat interface using [Gradio](https://www.gradio.app/) that integrates seamlessly with deployed models.

-Let's move forward with setting up the Gradio app as a Docker container running on localhost. This setup will enable interaction with the Stable Diffusion XL model, which is deployed using RayServe.
+Let's move forward with setting up the Gradio app as a Docker container running on localhost. This setup will enable interaction with the Llama-3-8B Instruct model, which is deployed using RayServe.
```
**`website/docs/blueprints/training/Neuron/Llama-LoRA-Finetuning.md`** (1 addition, 1 deletion)

```diff
@@ -1,7 +1,7 @@
 ---
 sidebar_label: Llama 3 Fine-tuning with LoRA
 ---
-import CollapsibleContent from '../../../../src/components/CollapsibleContent';
+import CollapsibleContent from '@site/src/components/CollapsibleContent';

 :::warning
 To deploy this example for fine-tuning a LLM on EKS, you need access to AWS Trainium ec2 instance. If deployment fails, check if you have access to this instance type. If nodes aren't starting, check Karpenter or Node group logs.
```
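The same one-line fix recurs in all three training docs in this PR: the deep relative import is replaced with Docusaurus's built-in `@site` alias, which resolves from the site root (the `website/` directory here), so the import keeps working even if a doc file is moved or nested at a different depth. Side by side:

```js
// Relative form: only valid while the MDX file sits exactly four levels below website/
import CollapsibleContent from '../../../../src/components/CollapsibleContent';

// Alias form: '@site' always points at the Docusaurus site root, wherever the doc lives
import CollapsibleContent from '@site/src/components/CollapsibleContent';
```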
**`website/docs/blueprints/training/Neuron/Llama2.md`** (1 addition, 1 deletion)

```diff
@@ -3,7 +3,7 @@ title: Llama-2 with Nemo-Megatron on Trn1
 sidebar_position: 2
 description: Training a Llama-2 Model using Trainium, Neuronx-Nemo-Megatron and MPI operator
 ---
-import CollapsibleContent from '../../../../src/components/CollapsibleContent';
+import CollapsibleContent from '@site/src/components/CollapsibleContent';

 :::warning
 Deployment of ML models on EKS requires access to GPUs or Neuron instances. If your deployment isn't working, it's often due to missing access to these resources. Also, some deployment patterns rely on Karpenter autoscaling and static node groups; if nodes aren't initializing, check the logs for Karpenter or Node groups to resolve the issue.
```
**`website/docs/blueprints/training/Neuron/RayTrain-Llama2.md`** (1 addition, 1 deletion)

```diff
@@ -2,7 +2,7 @@
 sidebar_position: 1
 sidebar_label: Llama-2 with RayTrain on Trn1
 ---
-import CollapsibleContent from '../../../../src/components/CollapsibleContent';
+import CollapsibleContent from '@site/src/components/CollapsibleContent';

 :::warning
 Deployment of ML models on EKS requires access to GPUs or Neuron instances. If your deployment isn't working, it's often due to missing access to these resources. Also, some deployment patterns rely on Karpenter autoscaling and static node groups; if nodes aren't initializing, check the logs for Karpenter or Node groups to resolve the issue.
```