This repository was archived by the owner on Mar 21, 2026. It is now read-only.

docs: add AWS (EC2/SageMaker) deployment + benchmarking guide #3352

Merged
alvarobartt merged 3 commits into huggingface:main from KOKOSde:docs/aws-deployment-benchmarks
Mar 21, 2026

Conversation

@KOKOSde KOKOSde commented Jan 31, 2026

Docs: add an AWS deployment + benchmarking guide for TGI.

  • Adds a tutorial page covering EC2 (Docker) and SageMaker real-time endpoints.
  • Includes a practical benchmarking section (what to measure + a warmup/load-test workflow).
  • Links the new page from the docs toctree.

Docs-only change.

- Add an AWS deployment tutorial for EC2 + SageMaker
- Fix SageMaker example indentation and link to the new guide
- Add the new guide to the docs toctree
@KOKOSde KOKOSde force-pushed the docs/aws-deployment-benchmarks branch from 8d9a09d to ae18647 Compare February 4, 2026 00:11

KOKOSde commented Mar 3, 2026

Quick ping on this one. Happy to update anything if you want changes.


julien-c commented Mar 7, 2026

lgtm but pinging @alvarobartt for a quick review

tengomucho (Collaborator) commented

This adds a deployment guide based on SageMaker SDK v2, which has been deprecated in favour of v3, and TGI itself has been deprecated, so I see little value in merging this PR.

@alvarobartt alvarobartt left a comment

Thanks for your contribution @KOKOSde!

As my colleague @tengomucho mentioned, the AWS SageMaker SDK is now on v3.0, which means that the former v2.0 (used in this example) is deprecated; on top of that, Text Generation Inference (TGI) is in maintenance mode at the moment, meaning we won't be actively contributing to it anymore, in favour of contributing to other Transformers-based inference engines such as vLLM or SGLang! 🤗

Regardless of that, the PR looks good to me and I'd be happy to merge as it might still have value, but I'd add a couple things here and there as per the review below!

P.S. Apologies for missing this earlier and getting back to you just now 🙏🏻

@alvarobartt alvarobartt merged commit b4adbf2 into huggingface:main Mar 21, 2026