feat(tier-map): add JETSON_ORIN_NANO tier and jetson backend #1481
Open
matedev01 wants to merge 2 commits into
Phase 2 of issue Light-Heart-Labs#195 milestone 1. Stacks on feat/jetson-detection.

Adds the JETSON_ORIN_NANO tier for both qwen and gemma4 profiles in installers/lib/tier-map.sh, with conservative model selection sized for the Orin Nano 8GB unified-memory budget:

- qwen → qwen3.5-2b (~1.5 GB, 8K context)
- gemma4 → gemma-4-e2b-it (~2.81 GB, 8K context)

Both set N_GPU_LAYERS=99 since the Tegra iGPU shares system RAM with the CPU; there is no benefit to partial offload on unified memory.

Also adds config/backends/jetson.json mirroring nvidia.json (same llama-server contract on port 8080); the runtime difference lives in docker-compose.jetson.yml, which is a follow-up PR.

Tier validation error lists and tier_to_model() switches updated for both qwen and gemma4 paths so `dream model swap` resolves correctly.

Tests: tier-map suite goes from 122 → 135 PASS (13 new Jetson assertions covering both profiles, plus the GGUF_URL coverage loop extension).

Out of scope (separate follow-ups):

- docker-compose.jetson.yml + resolver branch
- Auto-tier selection on Jetson hosts (--tier required for now)
- Orin AGX/NX, Xavier, legacy Nano
- docs/JETSON-QUICKSTART.md, SUPPORT-MATRIX entry
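To make the `tier_to_model()` change concrete, here is a minimal sketch of how a profile-plus-tier lookup could resolve the two new models. The function signature (profile first, tier second), the error wording, and the single-tier `case` are assumptions for illustration; the real function in installers/lib/tier-map.sh covers more tiers and may be shaped differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the project's implementation: the argument order
# and error text are assumed. Only the tier name and model names come from
# the commit message above.
tier_to_model() {
  local profile="$1" tier="$2"
  case "${profile}:${tier}" in
    qwen:JETSON_ORIN_NANO)   echo "qwen3.5-2b" ;;
    gemma4:JETSON_ORIN_NANO) echo "gemma-4-e2b-it" ;;
    *)
      # Unknown combinations fail loudly instead of returning an empty name.
      echo "Invalid tier: ${tier} (profile: ${profile})" >&2
      return 1
      ;;
  esac
}

# Usage: resolve the model for each profile on the new tier.
tier_to_model qwen JETSON_ORIN_NANO
tier_to_model gemma4 JETSON_ORIN_NANO
```

Keying the `case` on `profile:tier` keeps both profile branches in one table, which is one way to avoid the two switches drifting apart.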
On-hardware verification log

Same Orin Nano 8GB Super (JetPack R36.4.7) as #1479. All five Phase 2 checks pass cleanly: 135/135 tier-map tests, both profile resolutions return the expected models, the new

Reviewer notes
Nice narrow follow-up. The tier and backend contract choices look coherent for the 8 GB unified-memory Orin Nano target, and keeping compose/runtime out of this PR helps the review. Two process notes:
No functional blocker from my pass on this PR by itself.
Summary
Phase 2 of issue #195 milestone 1. Stacks on #1479 (Phase 1 detection). Adds the `JETSON_ORIN_NANO` tier and a `jetson` backend contract: pure data, no compose changes, no runtime impact.

Why
PR #1479 plumbs `GPU_BACKEND=jetson` through detection but the tier-map has no entry, so a user running `--tier JETSON_ORIN_NANO` today gets `error "Invalid tier"`. This PR closes that gap and lets the installer resolve a sane model and context size for Orin Nano hardware.

Tier choices
Sized for the 8 GB unified-memory budget on Orin Nano (Ampere sm_87). Leaves ~5 GB free for KV cache + open-webui + dashboard-api + LiteLLM after the model loads.
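The headroom figure can be checked with simple arithmetic. The sketch below uses the approximate model sizes quoted in the commit message (~1.5 GB and ~2.81 GB) and treats the board as a flat 8000 MB; that total is an assumption, since the OS and firmware reserve part of the unified memory, so the real headroom is smaller.

```shell
#!/usr/bin/env bash
# Rough budget check in decimal MB. TOTAL_MB is an assumed nominal figure;
# the usable amount on a real Orin Nano is lower.
TOTAL_MB=8000

# Print the memory left over after loading a model of the given size.
headroom_mb() {
  local model_mb="$1"
  echo $(( TOTAL_MB - model_mb ))
}

echo "qwen3.5-2b headroom:     $(headroom_mb 1500) MB"   # 6500 MB
echo "gemma-4-e2b-it headroom: $(headroom_mb 2810) MB"   # 5190 MB
```

The "~5 GB free" figure corresponds to the larger of the two models, so it is the tighter bound for both profiles.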
Both set `N_GPU_LAYERS=99`: the Tegra iGPU shares system RAM with the CPU, so partial offload has no upside on unified memory.

Larger Orin Nano variants and Orin NX/AGX are deliberately excluded; this milestone is "Orin Nano end-to-end" per the maintainer's 2026-05-10 triage on #195.
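As an illustration of what such a tier entry might look like, here is a minimal sketch of a `case` branch with a validation fallback. Only the function name, the tier name, the model name and `N_GPU_LAYERS=99` come from this PR; the variable names `LLM_MODEL` and `MAX_CONTEXT`, the exact value 8192 for "8K context", and the error wording are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a tier-config switch; the real
# set_qwen_tier_config() handles many more tiers and variables.
set_qwen_tier_config() {
  local tier="$1"
  case "$tier" in
    JETSON_ORIN_NANO)
      LLM_MODEL="qwen3.5-2b"   # assumed variable name
      MAX_CONTEXT=8192         # "8K context"; exact value assumed
      # A value above the model's layer count asks llama.cpp to offload
      # every layer, which suits unified memory.
      N_GPU_LAYERS=99
      ;;
    *)
      # The list of valid tiers is shortened here for illustration.
      echo "Invalid tier: $tier. Valid tiers: JETSON_ORIN_NANO" >&2
      return 1
      ;;
  esac
}

# Usage: select the tier, then read back the resolved settings.
set_qwen_tier_config JETSON_ORIN_NANO
echo "model=$LLM_MODEL ctx=$MAX_CONTEXT ngl=$N_GPU_LAYERS"
```

Because the valid-tier list lives in the error message, adding a tier means touching both the `case` branch and that message, which matches the two kinds of change this PR describes.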
Files changed
| File | Change |
| --- | --- |
| `installers/lib/tier-map.sh` | `JETSON_ORIN_NANO` case added to both `set_qwen_tier_config()` and `set_gemma4_tier_config()` switches. Validation error lists at lines 194 + 307 updated. `tier_to_model()` extended in both profile branches so `dream model swap` resolves correctly |
| `config/backends/jetson.json` (new) | Mirrors `nvidia.json`: same `llama-server` contract on port 8080, same provider URL. The runtime difference (JetPack-pinned image, Tegra container runtime) lives in `docker-compose.jetson.yml`, follow-up PR |
| `tests/test-tier-map.sh` | `JETSON_ORIN_NANO` |

Test plan
    bash tests/test-tier-map.sh
    # Results: 135 passed, 0 failed (was 122 before; +13 new Jetson assertions)

Plus the Phase 1 detection test still passes unchanged:

    bash tests/test-jetson-detection.sh
    # Passed: 12, Failed: 0

End-to-end resolution check on the qwen path:
And gemma4:
Explicitly out of scope (separate follow-ups)
- `docker-compose.jetson.yml` and resolver branch in `scripts/resolve-compose-stack.sh`
- Auto-tier selection on Jetson hosts (`--tier JETSON_ORIN_NANO` required for now)
- Orin AGX/NX, Xavier, legacy Nano
- `docs/JETSON-QUICKSTART.md` and `SUPPORT-MATRIX.md` entry

Stack note
Stacks on #1479 (`feat/jetson-detection`). If #1479 needs rework during review (e.g. renaming the backend value from `jetson` to `tegra`), this branch rebases cleanly: the only cross-PR dependency is the string `jetson` used as the backend identifier.