Add ROCm Strix Halo Bonsai demo path by bong-water-water-bong · Pull Request #71 · PrismML-Eng/Bonsai-demo

bong-water-water-bong · 2026-05-07T00:29:23Z

Adds ROCm/HIP build support for the Bonsai demo, fixes ternary launch paths, records Strix Halo gfx1151 Ternary-Bonsai 8B Q2_0 validation, and adds optional self-hosted ROCm CI coverage. Local validation: setup completed, ROCm build completed, llama-bench Ternary-Bonsai-8B Q2_0 on Strix Halo hit pp512 1323.29 +/- 10.55 t/s and tg128 79.04 +/- 0.57 t/s; run_llama one-shot prompt exits cleanly.

bong-water-water-bong · 2026-05-07T00:38:06Z

Added a second commit with the full Strix Halo ROCm benchmark matrix and raw JSONL artifacts. Coverage now includes Ternary-Bonsai 1.7B, 4B, and 8B Q2_0 with isolated pp/tg plus combined prompt+generation workloads. Head commit: 1a1cc92.

bong-water-water-bong · 2026-05-07T00:58:16Z

Added one more follow-up commit with flash-attention on/off comparison data across 1.7B, 4B, and 8B at pp512/tg128. Short version: keep FA on for this ROCm path; 8B improves from ~1142 to ~1303 tok/s pp512 and ~70 to ~78 tok/s tg128. Head commit: efade28.
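As a quick check on the figures quoted above, the relative gain from enabling flash attention can be computed directly. This is a minimal sketch: `speedup_pct` is a hypothetical helper name, and the inputs are the rounded 8B numbers from this comment, not the raw JSONL values.

```bash
# Relative change, in percent, between a baseline and a new throughput figure.
# speedup_pct is a hypothetical helper, not part of the PR.
speedup_pct() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new - base) / base * 100 }'
}

speedup_pct 1142 1303   # 8B pp512, FA off -> on
speedup_pct 70 78       # 8B tg128, FA off -> on
```

With those rounded inputs the gains come out at roughly 14% for pp512 and 11% for tg128.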

khosravipasha · 2026-05-07T22:39:19Z

@@ -0,0 +1,12 @@
+{"build_commit": "d104cf1b6", "build_number": 8846, "cpu_info": "AMD RYZEN AI MAX+ 395 w/ Radeon 8060S", "gpu_info": "AMD Radeon Graphics", "backends": "ROCm", "model_filename": "models/ternary-gguf/1.7B/Ternary-Bonsai-1.7B-Q2_0.gguf", "model_type": "qwen3 1.7B Q2_0", "model_size": 457345184, "model_n_params": 1720028160, "n_batch": 2048, "n_ubatch": 512, "n_threads": 16, "cpu_mask": "0x0", "cpu_strict": false, "poll": 50, "type_k": "f16", "type_v": "f16", "n_gpu_layers": 99, "n_cpu_moe": 0, "split_mode": "layer", "main_gpu": 0, "no_kv_offload": false, "flash_attn": false, "devices": "auto", "tensor_split": "0.00", "tensor_buft_overrides": "none", "use_mmap": true, "use_direct_io": false, "embeddings": false, "no_op_offload": 0, "no_host": false, "fit_target": 0, "fit_min_ctx": 0, "n_prompt": 512, "n_gen": 0, "n_depth": 0, "test_time": "2026-05-07T00:56:16Z", "avg_ns": 105712742, "stddev_ns": 797694, "avg_ts": 4843.498607, "stddev_ts": 36.703200, "samples_ns": [ 106239328, 106103931, 104794967 ],"samples_ts": [ 4819.31, 4825.46, 4885.73 ]}


Do we want the JSONL files? They might be too much info.

khosravipasha · 2026-05-07T22:39:32Z

@@ -0,0 +1,18 @@
+{"build_commit": "d104cf1b6", "build_number": 8846, "cpu_info": "AMD RYZEN AI MAX+ 395 w/ Radeon 8060S", "gpu_info": "AMD Radeon Graphics", "backends": "ROCm", "model_filename": "models/ternary-gguf/1.7B/Ternary-Bonsai-1.7B-Q2_0.gguf", "model_type": "qwen3 1.7B Q2_0", "model_size": 457345184, "model_n_params": 1720028160, "n_batch": 2048, "n_ubatch": 512, "n_threads": 16, "cpu_mask": "0x0", "cpu_strict": false, "poll": 50, "type_k": "f16", "type_v": "f16", "n_gpu_layers": 99, "n_cpu_moe": 0, "split_mode": "layer", "main_gpu": 0, "no_kv_offload": false, "flash_attn": true, "devices": "auto", "tensor_split": "0.00", "tensor_buft_overrides": "none", "use_mmap": true, "use_direct_io": false, "embeddings": false, "no_op_offload": 0, "no_host": false, "fit_target": 0, "fit_min_ctx": 0, "n_prompt": 512, "n_gen": 0, "n_depth": 0, "test_time": "2026-05-07T00:32:57Z", "avg_ns": 94925047, "stddev_ns": 339642, "avg_ts": 5393.775104, "stddev_ts": 19.266487, "samples_ns": [ 94868643, 94617421, 95289079 ],"samples_ts": [ 5396.94, 5411.27, 5373.12 ]}


Same here: do we want the JSONL files? They might be too much info.

khosravipasha · 2026-05-07T22:40:39Z

+    inputs:
+      enable_linux_amd:
+        description: "Run optional self-hosted Linux AMD/ROCm build"
+        required: false
+        default: false
+        type: boolean


We don't have that ourselves; is that something in your setup?
I guess we can push it here anyway and it just won't run? Might want to make sure it does not cause the GitHub Action to fail.

khosravipasha · 2026-05-07T22:40:59Z

@@ -0,0 +1,15 @@
+{"build_commit": "d104cf1b6", "build_number": 8846, "cpu_info": "AMD RYZEN AI MAX+ 395 w/ Radeon 8060S", "gpu_info": "AMD Radeon Graphics", "backends": "ROCm", "model_filename": "models/ternary-gguf/1.7B/Ternary-Bonsai-1.7B-Q2_0.gguf", "model_type": "qwen3 1.7B Q2_0", "model_size": 457345184, "model_n_params": 1720028160, "n_batch": 2048, "n_ubatch": 512, "n_threads": 16, "cpu_mask": "0x0", "cpu_strict": false, "poll": 50, "type_k": "f16", "type_v": "f16", "n_gpu_layers": 99, "n_cpu_moe": 0, "split_mode": "layer", "main_gpu": 0, "no_kv_offload": false, "flash_attn": true, "devices": "auto", "tensor_split": "0.00", "tensor_buft_overrides": "none", "use_mmap": true, "use_direct_io": false, "embeddings": false, "no_op_offload": 0, "no_host": false, "fit_target": 0, "fit_min_ctx": 0, "n_prompt": 128, "n_gen": 0, "n_depth": 0, "test_time": "2026-05-07T00:30:49Z", "avg_ns": 28543656, "stddev_ns": 173748, "avg_ts": 4484.491304, "stddev_ts": 27.212874, "samples_ns": [ 28520343, 28468025, 28590041, 28334580, 28805294 ],"samples_ts": [ 4488.02, 4496.27, 4477.08, 4517.45, 4443.63 ]}


Same here: do we want the JSONL files? They might be too much info.
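If the raw records are kept out of the repo, one option is to condense them into one short line each for the write-up. The sketch below assumes the key order seen in the records quoted in this thread (`model_type`, then `flash_attn`, then `n_prompt`, `n_gen`, then `avg_ts`); `summarize_bench` is a hypothetical name, and a JSON-aware tool would be more robust than this `sed` pattern.

```bash
# Print "model | fa | pp/tg | avg t/s" for each llama-bench JSONL record on stdin.
# Assumes the key order shown in the records above; lines that do not match are skipped.
summarize_bench() {
    sed -nE 's#.*"model_type": "([^"]*)".*"flash_attn": (true|false).*"n_prompt": ([0-9]+), "n_gen": ([0-9]+).*"avg_ts": ([0-9.]+).*#\1 | fa=\2 | pp=\3 tg=\4 | \5 t/s#p'
}

# Usage (path taken from the per-file summary in this PR):
#   summarize_bench < benchmarks/data/ternary-bonsai-rocm-strix-halo-20260507T003049Z.jsonl
```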

khosravipasha · 2026-05-07T22:42:14Z

-
-    $FamilyDisplay = "Ternary-Bonsai"
-} else {
-    $ModelDir = Join-Path $DemoDir "models\gguf\$BonsaiModel"
-    $FamilyDisplay = "Bonsai"
-}
-


I made some changes here recently; I think the conflict is related to this.

Copilot

Pull request overview

Adds ROCm/HIP support to the Bonsai demo toolchain (build + runtime), plus Strix Halo (gfx1151) validation artifacts and optional self-hosted CI coverage.

Changes:

- Introduces a Linux ROCm/HIP source build script and updates docs to reference it.
- Fixes/standardizes Bonsai vs Ternary-Bonsai display/model-path handling and improves one-shot prompt execution by auto-enabling --single-turn when a prompt/file is provided.
- Adds Strix Halo ROCm benchmark/validation results and an optional self-hosted ROCm smoke job.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 1 comment.

Show a summary per file

| File | Description |
| --- | --- |
| scripts/start_llama_server.ps1 | Simplifies family display handling for GGUF discovery/errors. |
| scripts/run_llama.sh | Auto-adds `--single-turn` when `-p/--prompt` or `-f/--file` is used. |
| scripts/run_llama.ps1 | Auto-adds `--single-turn` when prompt/file args are present. |
| scripts/common.sh | Prepends `/opt/rocm/bin` and `/opt/rocm/lib` to PATH/LD_LIBRARY_PATH when present. |
| scripts/build_rocm_linux.sh | New: builds PrismML llama.cpp with ROCm/HIP and installs to `bin/rocm`. |
| README.md | Documents ROCm/HIP support and points to ROCm benchmark write-up; updates build instructions. |
| community-benchmarks/ternary-bonsai/rocm-hip-strix-halo-128gb-linux.md | New: Strix Halo ROCm HIP benchmark/validation report. |
| community-benchmarks/ternary-bonsai/README.md | Adds Strix Halo ROCm result entry. |
| community-benchmarks/README.md | Adds combined-table Ternary-Bonsai ROCm result entry. |
| benchmarks/data/ternary-bonsai-rocm-strix-halo-fa-compare-20260507T005616Z.jsonl | New: raw JSONL data for FA on/off comparison. |
| benchmarks/data/ternary-bonsai-rocm-strix-halo-combined-20260507T003257Z.jsonl | New: raw JSONL data for combined prompt+gen runs. |
| benchmarks/data/ternary-bonsai-rocm-strix-halo-20260507T003049Z.jsonl | New: raw JSONL data for isolated prompt/decode runs. |
| .github/workflows/check-env-vars.yml | Expands syntax checks to additional shell scripts incl. ROCm build script. |
| .github/workflows/build-from-source-smoke.yml | Adds optional workflow_dispatch-controlled self-hosted ROCm build+smoke job. |
| .github/CI.md | Documents the optional self-hosted ROCm job. |
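The `scripts/common.sh` entry describes prepending the ROCm directories only when they exist. A minimal sketch of that pattern follows, using the same `${VAR:+:$VAR}` idiom that appears elsewhere in this PR. `add_rocm_paths` and the `ROCM_ROOT` override are hypothetical names for illustration, not taken from the script.

```bash
# Prepend ROCm bin/lib directories only when they are present on disk.
# ROCM_ROOT is a hypothetical override; the default matches the path in the summary.
add_rocm_paths() {
    rocm_root="${ROCM_ROOT:-/opt/rocm}"
    if [ -d "$rocm_root/bin" ]; then
        PATH="$rocm_root/bin${PATH:+:$PATH}"
    fi
    if [ -d "$rocm_root/lib" ]; then
        # ${VAR:+:$VAR} avoids a trailing ":" when the variable is empty or unset.
        LD_LIBRARY_PATH="$rocm_root/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    fi
    export PATH LD_LIBRARY_PATH
}
```

Guarding on directory existence keeps the helper a no-op on machines without ROCm installed.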


+while [[ $# -gt 0 ]]; do
+    case "$1" in
+        --rocm-path) ROCM_PATH="$2"; shift 2 ;;
+        --targets) AMDGPU_TARGETS="$2"; shift 2 ;;
+        --output) OUTPUT_DIR="$2"; shift 2 ;;
+        *) REPO_DIR="$1"; shift ;;
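The body of the review comment attached to this hunk is not preserved here. One common hardening for a loop of this shape, offered only as a sketch and not as what the reviewer suggested, is to check that an option taking a value actually has one: in bash, `shift 2` with a single argument left fails without shifting. `require_value` and `parse_args` are hypothetical names.

```bash
# Hypothetical helper: fail early when an option that takes a value is the last argument.
require_value() {
    if [ "$2" -lt 2 ]; then
        echo "error: $1 requires a value" >&2
        return 1
    fi
}

# Same options as the hunk above, wrapped in a function for illustration.
parse_args() {
    while [[ $# -gt 0 ]]; do
        case "$1" in
            --rocm-path) require_value "$1" "$#" || return 1; ROCM_PATH="$2"; shift 2 ;;
            --targets)   require_value "$1" "$#" || return 1; AMDGPU_TARGETS="$2"; shift 2 ;;
            --output)    require_value "$1" "$#" || return 1; OUTPUT_DIR="$2"; shift 2 ;;
            *)           REPO_DIR="$1"; shift ;;
        esac
    done
}
```

For example, `parse_args --targets gfx1151 /path/to/repo` sets both variables, while `parse_args --output` returns non-zero with an error message instead of looping.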


khosravipasha · 2026-05-07T22:42:47Z

 export LD_LIBRARY_PATH="$BIN_DIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

 NGL=$(bonsai_llama_ngl)
+SINGLE_TURN_ARGS=""


What's the purpose of this one? I thought single-turn was already set up.
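For context, the per-file summary says `scripts/run_llama.sh` auto-adds `--single-turn` when `-p/--prompt` or `-f/--file` is used. A minimal sketch of that kind of detection is below; it is an illustration under that description, not the PR's actual code, and `detect_single_turn` is a hypothetical name. It does not handle `--prompt=value` forms.

```bash
# Set SINGLE_TURN_ARGS to "--single-turn" when a prompt or prompt file flag is present.
detect_single_turn() {
    SINGLE_TURN_ARGS=""
    for arg in "$@"; do
        case "$arg" in
            -p|--prompt|-f|--file) SINGLE_TURN_ARGS="--single-turn" ;;
        esac
    done
}
```

A caller would run `detect_single_turn "$@"` and then pass `$SINGLE_TURN_ARGS` unquoted to the binary so that an empty value adds no argument.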


khosravipasha · 2026-05-31T03:12:15Z

There are some merge conflicts to be resolved, and there seem to be too many files that are not needed.
To merge this, I would need minimal code/md files.

bong-water-water-bong added 2 commits May 6, 2026 21:27

Add ROCm Strix Halo Bonsai demo path

95670eb

Expand Strix Halo ternary ROCm benchmarks

1a1cc92

Add Strix Halo flash attention comparison

efade28

khosravipasha requested a review from Copilot May 7, 2026 22:38

Copilot started reviewing on behalf of khosravipasha May 7, 2026 22:39

khosravipasha reviewed May 7, 2026

Copilot AI reviewed May 7, 2026

Comment thread scripts/build_rocm_linux.sh

Comment on lines +30 to +35


khosravipasha reviewed May 7, 2026


bong-water-water-bong added 2 commits May 24, 2026 17:42

Add docs/AGENTS.md for agent orchestration\n\nCo-authored-by: Copilot…

86a328b

… <223556219+Copilot@users.noreply.github.com>

Add OpenSpec LLM Wiki repo standard

75d29ab

bong-water-water-bong wants to merge 5 commits into PrismML-Eng:main from bong-water-water-bong:main

3 participants
