Commit 7f1b521
docs: Update spec to reflect Helm-based deployment and Trainium2 support
- Add Trainium2 requirements (Req 12) and model compilation requirement
- Add Helm chart deployment requirements (Req 11)
- Update model memory requirements (Req 13) for Scout/Maverick
- Update glossary with new terms (Helm, Trainium2, NxD Inference, etc.)
- Update tasks to reflect Helm-based workflow
- Add notes about FP8 quantization for GPU vs BF16 for Trainium2
1 parent 0e5c50f, commit 7f1b521
4 files changed: +1230 -0 lines changed
- .kiro/specs/llama4-inference-blueprint