Commit 3717809
feat(local-inference): catalog opts-in for DFlash kernel + AWQ Q4 entry
Marks all three DFlash entries (qwen3.5-4b, qwen3.5-9b, qwen3.6-27b)
with runtime.optimizations.requiresKernel: ["dflash"] so the dispatcher
routes them to llama-server even when ELIZA_LOCAL_BACKEND=node-llama-cpp
is set — the in-process binding cannot satisfy the kernel requirement.
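
The routing rule described above can be sketched as follows. This is a minimal illustration, not the project's dispatcher: only `runtime.optimizations.requiresKernel`, the `"dflash"` kernel name, and the backend names `node-llama-cpp` and `llama-server` come from the commit message. The `CatalogEntry` shape, the `IN_PROCESS_KERNELS` set, and the `selectBackend` function are hypothetical names introduced for the example.

```typescript
// Hypothetical sketch of kernel-aware backend selection.
type Backend = "node-llama-cpp" | "llama-server";

// Assumed minimal entry shape; the real catalog type likely has more fields.
interface CatalogEntry {
  id: string;
  runtime?: {
    optimizations?: {
      requiresKernel?: string[];
    };
  };
}

// Assumption: the in-process binding provides no special kernels,
// so any declared requirement is unmet there.
const IN_PROCESS_KERNELS: ReadonlySet<string> = new Set<string>();

// `preferred` stands in for the value of ELIZA_LOCAL_BACKEND.
function selectBackend(entry: CatalogEntry, preferred: Backend): Backend {
  const required = entry.runtime?.optimizations?.requiresKernel ?? [];
  const unmet = required.some((kernel) => !IN_PROCESS_KERNELS.has(kernel));
  // A kernel requirement overrides the operator's backend preference.
  return unmet ? "llama-server" : preferred;
}

const dflashEntry: CatalogEntry = {
  id: "qwen3.5-4b",
  runtime: { optimizations: { requiresKernel: ["dflash"] } },
};

// Even with node-llama-cpp preferred, the DFlash entry goes to llama-server.
console.log(selectBackend(dflashEntry, "node-llama-cpp"));
```

Entries without a `requiresKernel` list fall through to the preferred backend, so in this sketch the opt-in only affects the entries that declare it.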
Adds one AWQ-derived GGUF entry — Qwen3 Coder 30B A3B (MoE, AWQ→Q4_K_M
from straino/Qwen3-Coder-30B-A3B-Instruct-AWQ-4bit-Q4_K_M-GGUF, HEAD
verified). The entry declares moeOffload: "cpu" so MoE expert tensors
default to CPU memory and the active path stays on the GPU.
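
As a rough sketch of how such an entry might be consumed, the snippet below declares an entry with `moeOffload: "cpu"` and turns it into server arguments. Only the `moeOffload: "cpu"` value, the Hugging Face repo name, and the Q4_K_M quantization come from the commit message. The entry `id`, the field names `hfRepo` and `quant`, the placement of `moeOffload` under `runtime.optimizations`, and the `moeServerArgs` helper are assumptions. `--cpu-moe` is a llama.cpp option that keeps MoE expert weights on the CPU; whether this project passes it is also an assumption.

```typescript
// Hypothetical entry shape; only `moeOffload: "cpu"` is taken from the commit message.
interface MoeCatalogEntry {
  id: string;
  hfRepo: string;
  quant: string;
  runtime: {
    optimizations: {
      moeOffload?: "cpu";
    };
  };
}

const qwen3Coder: MoeCatalogEntry = {
  id: "qwen3-coder-30b-a3b", // hypothetical id
  hfRepo: "straino/Qwen3-Coder-30B-A3B-Instruct-AWQ-4bit-Q4_K_M-GGUF",
  quant: "Q4_K_M",
  runtime: { optimizations: { moeOffload: "cpu" } },
};

// Assumed mapping: expert tensors stay in CPU memory via llama.cpp's --cpu-moe,
// while the remaining layers are left to the usual GPU offload settings.
function moeServerArgs(entry: MoeCatalogEntry): string[] {
  return entry.runtime.optimizations.moeOffload === "cpu" ? ["--cpu-moe"] : [];
}

console.log(moeServerArgs(qwen3Coder));
```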
GPTQ-derived GGUF entries are deliberately omitted: the only repos that
ship them today are low-confidence re-quants (RichardErkhov, namtran,
casualjim). bartowski and TheBloke do not publish first-party GPTQ
GGUFs. Operators can still install ad-hoc GGUFs through the HF search
path; we will revisit when a first-party publisher ships GPTQ GGUFs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 42d645a commit 3717809
1 file changed
Lines changed: 49 additions & 0 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 79-81 | 79-81 | |
| | 82-85 | + |
| 82-84 | 86-88 | |
| 139-141 | 143-145 | |
| | 146-149 | + |
| 142-144 | 150-152 | |
| 233-235 | 241-243 | |
| | 244-280 | + |
| 236-238 | 281-283 | |
| 299-301 | 344-346 | |
| | 347-350 | + |
| 302-304 | 351-353 | |