chore: improve AGENTS.md and add copilot instructions

dacorvo · dacorvo · commit 8dee6088a8e1 · 2026-02-10T17:22:34.000+01:00
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
@@ -0,0 +1,7 @@
+# Optimum Neuron Copilot Instructions
+
+Use the root AGENTS.md as the single source of truth for onboarding, build, test, and architecture guidance:
+
+- [AGENTS.md](../AGENTS.md)
+
+The AGENTS.md file also links to subsystem-specific guides: [optimum/neuron/models/inference/AGENTS.md](../optimum/neuron/models/inference/AGENTS.md), [optimum/neuron/vllm/AGENTS.md](../optimum/neuron/vllm/AGENTS.md), and [tools/cache/AGENTS.md](../tools/cache/AGENTS.md).
diff --git a/AGENTS.md b/AGENTS.md
@@ -1,6 +1,6 @@
 # Optimum Neuron Agent Guide (Root)
 
-This repository bridges Hugging Face libraries with AWS Trainium/Inferentia. Use this file for project-wide guidance and see the model-specific guides:
+Optimum Neuron bridges Hugging Face libraries (Transformers, Diffusers, PEFT) with AWS Trainium/Inferentia accelerators. Use this file for project-wide guidance and the model-specific guides below:
 - Inference models guide: [optimum/neuron/models/inference/AGENTS.md](optimum/neuron/models/inference/AGENTS.md)
 - vLLM guide: [optimum/neuron/vllm/AGENTS.md](optimum/neuron/vllm/AGENTS.md)
 
@@ -18,7 +18,20 @@ This repository bridges Hugging Face libraries with AWS Trainium/Inferentia. Use
 
 ## Essential Developer Workflows
 
+### Virtual Environment (Required)
+
+Activate the venv before running any command; tools installed in it (for example `ruff`) are not found otherwise.
+
+```bash
+python3 -m venv .venv
+source .venv/bin/activate
+pip install -e ".[neuronx,tests]"
+```
+
 ### Testing (Neuron hardware required)
+
+Most tests require real Neuron hardware and will skip or fail on CPU-only machines.
+
 ```bash
 pytest tests/decoder/
 pytest tests/training/
@@ -42,6 +55,8 @@ optimum-cli export neuron \
   llama-neuron/
 ```
 
+Use separate processes for export and load to avoid Neuron device conflicts.
+
 ### Training Invocation
 ```bash
 NEURON_CC_FLAGS="--model-type transformer" torchrun \
@@ -71,3 +86,44 @@ For the full porting checklist and test guidance, see [optimum/neuron/models/inf
 
 ## Cache Management
 Compiled models are cached to the HF Hub. Test helpers live in [tests/conftest.py](tests/conftest.py). Relevant env vars: `NEURON_CC_FLAGS`, `NEURON_COMPILE_CACHE_URL`, `NEURON_RT_VISIBLE_CORES`.
+
+## CI/CD Workflows (Summary)
+
+All test workflows follow the same pattern:
+1. Checkout code
+2. Install Neuronx runtime (via `.github/actions/install_neuronx_runtime`)
+3. Prepare venv `aws_neuron_venv_pytorch` (via `.github/actions/prepare_venv`)
+4. Install `optimum-neuron[neuronx,tests]` (via `.github/actions/install_optimum_neuron`)
+5. Run pytest in the venv
+
+## Runtime Pitfalls
+
+- Static shapes: runtime input shapes must match compiled shapes.
+- Export and load in separate processes to avoid device conflicts.
+- The Neuron runtime does not reliably release devices within a single process.
+- Decoder graph changes require pruning the cache when using the fixtures defined in `tests/fixtures/export_models.py`: run `python tools/prune_test_models.py`.
+
+## Environment Variables
+
+- `HF_TOKEN`: Required for Hub access in tests.
+- `NEURON_CC_FLAGS="--model-type transformer"`: Required for training compilation.
+- `NEURON_RT_VISIBLE_CORES`: Controls which NeuronCores are visible.
+
+## Validation Checklist (Before PR)
+
+1. Activate venv: `source .venv/bin/activate`.
+2. Style check: `make style_check` (or `make style` to auto-fix).
+3. Run relevant tests:
+   - CPU export logic: `pytest tests/exporters/`
+   - INF2 decoder: `pytest tests/decoder/`
+   - TRN1 training: `pytest -m "is_trainium_test" tests/training/`
+4. Check model-specific AGENTS.md if you touched a model directory.
+
+## Troubleshooting
+
+- `ruff: command not found`: activate venv first.
+- `No module named 'neuronx_distributed'`: install extras with `pip install -e ".[neuronx]"`.
+- Tests failing on CPU: expected for most Neuron tests.
+- Compilation timeout: large models take 30-60 min to compile; use `--timeout 0`.
+
+Trust these instructions and only search for more context if something is missing or incorrect.
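
Note on the "export and load in separate processes" pitfall added above: a minimal sketch of what that isolation could look like in a shell script. The `run_isolated` helper is hypothetical (it is not part of the repository), and the `echo` commands are stand-ins for the real export and load commands, so the sketch runs without Neuron hardware.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: run one step in a fresh child process and wait for it
# to exit. Whatever the step holds is released when that process ends.
run_isolated() {
  bash -c '"$@"' run_isolated "$@"
}

# Step 1: export (stand-in for the `optimum-cli export neuron ...` command).
run_isolated echo "export step done"

# Step 2: load (stand-in; starts only after the export process has exited).
run_isolated echo "load step done"
```

Because the script uses `set -e`, a failing export step stops the script before the load step starts.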