add llama.cpp to inference chart #188

omrishiv · 2025-10-02T21:06:53Z

What does this PR do?

🛑 Please open an issue first to discuss any significant work and flesh out details/direction. When we triage the issues, we will add labels to the issue like "Enhancement", "Bug" which should indicate to you that this issue can be worked on and we are looking forward to your PR. We would hate for your time to be wasted.
Consult the CONTRIBUTING guide for submitting pull-requests.

This PR adds a llama.cpp deployment template to the inference charts using Graviton instances. It also includes a Llama 3.2 1B model config for testing.
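
For readers unfamiliar with the pattern, a single llama.cpp deployment pinned to Graviton (arm64) nodes might render to something like the manifest below. This is a minimal sketch, not the chart's actual template: the resource names, the container image and tag, the `models` PersistentVolumeClaim, and the GGUF file path are all assumptions made for illustration. Only the `kubernetes.io/arch` node label and the `llama-server` flags `-m`, `--host`, and `--port` are standard.

```yaml
# Hypothetical sketch of a single llama.cpp Deployment on arm64 nodes.
# Names, image, PVC, and model path are assumptions, not taken from this PR.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llama-cpp-example
spec:
  replicas: 1                      # single deployment, no autoscaling
  selector:
    matchLabels:
      app: llama-cpp-example
  template:
    metadata:
      labels:
        app: llama-cpp-example
    spec:
      nodeSelector:
        kubernetes.io/arch: arm64  # well-known label; Graviton nodes are arm64
      containers:
        - name: llama-server
          # Assumed image; expected to use llama-server as its entrypoint.
          image: ghcr.io/ggml-org/llama.cpp:server
          args:
            - "-m"
            - "/models/llama-3.2-1b.gguf"   # assumed path to a GGUF model file
            - "--host"
            - "0.0.0.0"
            - "--port"
            - "8080"
          ports:
            - containerPort: 8080
          volumeMounts:
            - name: models
              mountPath: /models
      volumes:
        - name: models
          persistentVolumeClaim:
            claimName: models      # hypothetical PVC holding the model file
```

The `nodeSelector` is what ties the pod to Graviton capacity; how the chart in this PR actually expresses that (values keys, node pool labels) should be taken from the diff itself.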

Motivation

#88 provides a more thorough example of autoscaling multiple llama.cpp instances. This PR covers only a single deployment.

More

Yes, I have tested the PR using my local account setup (Provide any test evidence report under Additional Notes)
Mandatory for new blueprints. Yes, I have added an example to support my blueprint PR
Mandatory for new blueprints. Yes, I have updated the website/docs or website/blog section for this feature
Yes, I ran pre-commit run -a with this PR. Link for installing pre-commit locally

For Moderators

E2E Test successfully complete before merge?

Additional Notes

Signed-off-by: omrishiv <[email protected]>

omrishiv added 2 commits October 2, 2025 14:01

add llama.cpp to inference chart

49d68ab

Signed-off-by: omrishiv <[email protected]>

add graviton node pool

66ef8ae

Signed-off-by: omrishiv <[email protected]>
