Use the root AGENTS.md as the single source of truth for onboarding, build, test, and architecture guidance:
+
+- [AGENTS.md](../AGENTS.md)
+
+The AGENTS.md file also links to subsystem-specific guides under [optimum/neuron/models/inference/AGENTS.md](../optimum/neuron/models/inference/AGENTS.md), [optimum/neuron/vllm/AGENTS.md](../optimum/neuron/vllm/AGENTS.md), and [tools/cache/AGENTS.md](../tools/cache/AGENTS.md).
AGENTS.md: 57 additions & 1 deletion
@@ -1,6 +1,6 @@
# Optimum Neuron Agent Guide (Root)

-This repository bridges Hugging Face libraries with AWS Trainium/Inferentia. Use this file for project-wide guidance and see the model-specific guides:
+Optimum Neuron bridges Hugging Face libraries (Transformers, Diffusers, PEFT) with AWS Trainium/Inferentia accelerators. Use this file for project-wide guidance and the model-specific guides below:
@@ -71,3 +86,44 @@ For the full porting checklist and test guidance, see [optimum/neuron/models/inf
## Cache Management
Compiled models are cached to the HF Hub. Test helpers live in [tests/conftest.py](tests/conftest.py). Relevant env vars: `NEURON_CC_FLAGS`, `NEURON_COMPILE_CACHE_URL`, `NEURON_RT_VISIBLE_CORES`.
+- Static shapes: runtime input shapes must match compiled shapes.
+- Export and load in separate processes to avoid device conflicts.
+- Neuron runtime does not release devices reliably within the same process.
+- Decoder graph changes require cache prune when using the fixtures defined under `tests/fixtures/export_models.py`: `python tools/prune_test_models.py`.
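
The "export and load in separate processes" constraint above could be followed with a pattern like the one below. This is a minimal sketch using only the Python standard library; `run_isolated`, `EXPORT_STEP`, and `LOAD_STEP` are hypothetical names, and the two snippets are placeholders that only print, standing in for the real export and load code.

```python
import subprocess
import sys


def run_isolated(code: str) -> str:
    """Run a Python snippet in a fresh interpreter and return its stdout.

    Each call starts a new process, so anything the snippet acquires
    (such as accelerator devices) is released when that process exits.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child exits non-zero
    )
    return result.stdout.strip()


# Placeholder snippets (assumptions): a real workflow would export/compile the
# model in the first step and load it in the second. Here they only print,
# so the sketch runs on any machine.
EXPORT_STEP = "import os; print('export pid', os.getpid())"
LOAD_STEP = "import os; print('load pid', os.getpid())"

if __name__ == "__main__":
    print(run_isolated(EXPORT_STEP))
    print(run_isolated(LOAD_STEP))
```

Because each step gets its own interpreter, the load step never shares a process with the export step, which is the isolation the guidance asks for.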
+
+## Environment Variables
+
+- `HF_TOKEN`: Required for hub access in tests.
+- `NEURON_CC_FLAGS="--model-type transformer"`: Required for training compilation.
+- `NEURON_RT_VISIBLE_CORES`: Control visible NeuronCores.
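
As an illustration of the variables listed above, the following sketch assembles an environment mapping for a test or training run. The helper name `neuron_test_env` and the default core range `"0-1"` are assumptions made for this example, not part of the repository; only the variable names and the `--model-type transformer` flag come from the guide.

```python
import os


def neuron_test_env(base=None, visible_cores="0-1"):
    """Return a copy of the environment with the Neuron variables filled in.

    `base` defaults to the current process environment. The default value of
    `visible_cores` is an assumption for illustration only.
    """
    env = dict(os.environ if base is None else base)

    # HF_TOKEN is required for hub access in tests, so fail early if missing.
    if not env.get("HF_TOKEN"):
        raise RuntimeError("HF_TOKEN must be set for hub access in tests")

    # Keep any flags the caller already set; otherwise use the training default.
    env.setdefault("NEURON_CC_FLAGS", "--model-type transformer")

    # Restrict the run to the requested NeuronCores.
    env["NEURON_RT_VISIBLE_CORES"] = visible_cores
    return env
```

The returned mapping can be passed as the `env=` argument of `subprocess.run`, so the settings apply to the child process without modifying the parent environment.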