Releases · tetherto/qvac

10 Jun 14:23

Zbig9000

tts-ggml-v0.2.2

e8b61d9

QVAC TTS GGML Addon v0.2.2

Latest

@qvac/tts-ggml 0.2.2

Fixed

Android: revert the tts-cpp 2026-06-05 bump (introduced in 0.2.1), which crashed the addon at dlopen during bootstrap and failed every Android e2e run.

tts-cpp 2026-06-05 (upstream qvac-ext-lib-whisper.cpp@128dae42, the QVAC-19254 sched + cpu_backend refactor) added direct ggml_backend_is_cpu / ggml_get_type_traits_cpu calls in the statically linked tts-cpp library. On Android, the shared ggml-speech port builds the CPU backend as per-microarch MODULE .so variants that are dlopen'd at runtime (GGML_CPU_ALL_VARIANTS=ON + GGML_BACKEND_DL=ON; no static CPU archive), so those symbols are left undefined (UND) in libqvac__tts-ggml.*.so and cannot be resolved when Bare loads the addon. The result is ADDON_NOT_FOUND / dlopen failed, followed by SIGABRT about 1s into bootstrap. iOS and desktop statically link the CPU backend and were unaffected.

tts-cpp is pinned back to 2026-06-03#1 (the last-known-good revision, shipped in 0.2.0) so the Android addon loads cleanly again.

Reverted

Reverts the 0.2.1 Supertonic GPU enablement (QVAC-19255, #2473) in full: the tts-cpp pin, the SupertonicModel.cpp / index.js useGPU / nGpuLayers gate removals, the flipped C++/integration tests, and the docs. With tts-cpp back at 2026-06-03#1, Supertonic is CPU-only again (it is heavily CPU-optimised). The Supertonic GPU work will re-land once the Android CPU-backend linkage is fixed upstream (QVAC-19254 follow-up).


09 Jun 13:49

github-actions

vla-v0.3.2

293a871

QVAC Vla Addon v0.3.2

Pinned to the Fabric revision used by the M-RoPE/iM-RoPE sliding-context work.

Pull Requests

#2438 - feat[notask]: add M-RoPE sliding context support


09 Jun 13:50

github-actions

llamacpp-llm-v0.24.0

293a871

QVAC LLM Addon v0.24.0

This release adds sliding-context support for M-RoPE/iM-RoPE models such as Qwen3.5 and Qwen-VL style decoders. Long-running multimodal sessions can now slide under context pressure while preserving image recall, cache save/load behavior, and quantized KV-cache operation.

Features

M-RoPE/iM-RoPE sliding context

llm-llamacpp now tracks multimodal context usage as both logical decoder positions and physical KV-cache cells. This lets Qwen3.5-style prompts slide at the right time even when image chunks occupy a different number of cache cells than position slots.

Context sliding now supports bounded full-wipe and tail-preserving fallback behavior while respecting the configured discard budget. Native KV memory-operation failures surface as ContextSlideFailed, making them distinguishable from ordinary context overflow.

Shifted multimodal cache metadata now persists both logical positions and KV-cache usage, so sessions that slide after image turns can be saved and loaded without losing track of protected prefixes or current cache occupancy.
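The dual accounting described above can be illustrated with a minimal sketch. It is not the llm-llamacpp implementation: the class, its fields, the single shared capacity, and the token counts are all hypothetical, chosen only to show why tracking positions alone can miss the moment a slide is needed.

```javascript
// Hypothetical sketch: names and numbers are illustrative, not llm-llamacpp internals.
class ContextUsage {
  constructor(nCtx) {
    this.nCtx = nCtx;   // capacity, applied to both counters in this sketch (an assumption)
    this.positions = 0; // logical decoder positions consumed
    this.cells = 0;     // physical KV-cache cells occupied
  }

  // Record a chunk; text usually costs the same in both counters, images may not.
  append({ positions, cells }) {
    this.positions += positions;
    this.cells += cells;
  }

  // A slide is needed when either counter would exceed capacity.
  needsSlide(next) {
    return (
      this.positions + next.positions > this.nCtx ||
      this.cells + next.cells > this.nCtx
    );
  }
}

const usage = new ContextUsage(4096);
usage.append({ positions: 1000, cells: 1000 }); // text prompt
usage.append({ positions: 16, cells: 3000 });   // image chunk: few positions, many cells

const nextTurn = { positions: 200, cells: 200 };
// Looking at positions alone suggests there is still room...
const byPositionsOnly = usage.positions + nextTurn.positions > usage.nCtx;
// ...while the cell counter shows the cache would overflow.
const slideNeeded = usage.needsSlide(nextTurn);
```

In this sketch `byPositionsOnly` is `false` while `slideNeeded` is `true`, which is the mismatch the release notes describe for image chunks.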

Quantized KV-cache sliding coverage

The local qvac-fabric overlay now points at the Fabric branch with M-RoPE/iM-RoPE K-shift support and quantized KV-cache shift handling. Integration coverage exercises Qwen3.5 text sliding, tool-compaction pressure, multimodal image recall after sliding save/load, quantized K-cache sliding, and Llama RoPE baseline sliding.

New APIs

`ContextSlideFailed`

ContextSlideFailed is a new addon error code used when Fabric/native KV memory operations reject a sliding range. Callers can now tell this apart from context overflow, where there is simply not enough room to append the requested tokens.
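A caller-side sketch of telling the two failures apart might look like the following. Only the name ContextSlideFailed comes from these notes; the `code` property on the error object and the `"ContextOverflow"` string are assumptions made for illustration, so check the addon's actual error shape before relying on them.

```javascript
// Hypothetical: assumes addon errors carry a string `code` property and that
// overflow is reported as "ContextOverflow". Only "ContextSlideFailed" is
// taken from the release notes.
function classifyContextError(err) {
  const code = err && typeof err === "object" ? err.code : undefined;
  if (code === "ContextSlideFailed") return "slide-failed"; // native KV op rejected the range
  if (code === "ContextOverflow") return "overflow";        // not enough room to append
  return "other";
}
```

A caller could, for example, shorten the prompt on `"overflow"` but reset or reload the session on `"slide-failed"`.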

Pull Requests

#2438 - feat[notask]: add M-RoPE sliding context support


09 Jun 13:51

github-actions

llamacpp-embed-v0.19.1

293a871

QVAC Embed Addon v0.19.1

Changed

Pinned the Fabric revision used by the M-RoPE/iM-RoPE sliding-context work.

Pull Requests

#2438 - feat[notask]: add M-RoPE sliding context support


09 Jun 13:50

github-actions

classification-ggml-v0.3.1

293a871

QVAC GGML Image Classification Lib v0.3.1

Changed

Pinned to the Fabric revision used by the M-RoPE/iM-RoPE sliding-context work.

Pull Requests

#2438 - feat[notask]: add M-RoPE sliding context support


05 Jun 14:38

github-actions

llamacpp-embed-v0.19.0

63e993b

QVAC Embed Addon v0.19.0

Changed

feat[bc]: RuntimeStats.context_size now reports the active runtime llama context size. Use the new RuntimeStats.trained_context_size field for the model's trained context size.
The embed runtime now defaults ctx_size to the model's trained context size and caps oversized ctx_size requests to that value before creating the llama context. The cap also applies on streamed loads (single-GGUF and sharded) by parsing GGUF metadata from the first streamed chunk before the weights engine consumes it, mirroring the ModelMetaData pattern used by llm-llamacpp.
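The default-and-cap rule can be sketched in a few lines. The function and parameter names are hypothetical, and the sketch assumes the trained context size was read successfully from GGUF metadata.

```javascript
// Sketch of the described rule; resolveCtxSize and its parameters are hypothetical names.
function resolveCtxSize(requested, trainedCtx) {
  // No explicit ctx_size: default to the model's trained context size.
  if (requested === undefined || requested === null) return trainedCtx;
  // Oversized requests are capped; smaller requests are kept as-is.
  return Math.min(requested, trainedCtx);
}
```

With a trained context of 512, `resolveCtxSize(undefined, 512)` and `resolveCtxSize(8192, 512)` both give 512, while `resolveCtxSize(256, 512)` gives 256.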

Fixed

Context overflow validation now compares tokenized inputs against the active runtime context size (llama_n_ctx), which is itself capped to the trained context.
BertModel::setWeightsForFile now tracks fulfilled GGUF shards in a per-instance std::atomic<int> instead of a function-local static int, so multiple concurrent BertInterface instances no longer share (and miscount) shard-fulfillment state.
BertModel now resolves sharded model basenames to absolute paths relative to the model directory before metadata inspection and disk-shards loading, so the trained-context cap and llama_model_load_from_splits work correctly when the working directory differs from the model directory.
readTrainedContextSize now logs an ERROR-level diagnostic when GGUF metadata cannot be read on either streamed or non-streamed loads. Previously the read failed silently and fell back to llama.cpp's default ctx_size.
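The shard-counter fix above is a C++ change, but the underlying bug is easy to show as a JavaScript analogy. The class and method names below are hypothetical, and since JavaScript is single-threaded the sketch models only the shared-versus-per-instance state, not the atomicity.

```javascript
// JavaScript analogy of the C++ fix; all names here are hypothetical.
let sharedFulfilled = 0; // plays the role of the old function-local static int

class SharedCounterLoader {
  constructor(totalShards) {
    this.totalShards = totalShards;
  }
  // Returns true when this loader believes all of its shards have arrived.
  fulfillShard() {
    sharedFulfilled += 1;
    return sharedFulfilled === this.totalShards;
  }
}

class PerInstanceLoader {
  constructor(totalShards) {
    this.totalShards = totalShards;
    this.fulfilled = 0; // plays the role of the per-instance std::atomic<int>
  }
  fulfillShard() {
    this.fulfilled += 1;
    return this.fulfilled === this.totalShards;
  }
}

// Two loaders with two shards each, receiving shards in interleaved order.
const a = new SharedCounterLoader(2);
const b = new SharedCounterLoader(2);
a.fulfillShard();
const bDoneEarly = b.fulfillShard(); // b has one shard but sees the shared count of 2

const c = new PerInstanceLoader(2);
const d = new PerInstanceLoader(2);
c.fulfillShard();
const dDoneEarly = d.fulfillShard(); // d counts only its own shard
```

Here `bDoneEarly` is `true` (a miscount) and `dDoneEarly` is `false`, mirroring how concurrent instances stop sharing shard-fulfillment state once the counter is per-instance.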


05 Jun 19:08

github-actions

diffusion-cpp-v0.11.2

751bade

QVAC Stable Diffusion Addon v0.11.2

This release restores caller control over where the diffusion text-conditioning path runs on macOS. It removes an Apple-specific override that forced the CLIP/text encoder path onto CPU.

Bug Fixes

Honor `clip_on_cpu` on macOS

macOS builds no longer force keep_clip_on_cpu to true during SdModel::load(). The addon now forwards config_.keepClipOnCpu on all platforms, so callers can keep the text-conditioning path on the configured backend unless they explicitly opt into CPU placement with clip_on_cpu.
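The change in behavior can be summarized with a small sketch. The real logic lives in C++ inside SdModel::load(); the function names and the `isApple` flag below are hypothetical stand-ins.

```javascript
// Sketch of the behavior change; function names and `isApple` are hypothetical.

// Before 0.11.2: Apple builds ignored the caller's setting.
function keepClipOnCpuBefore(config, isApple) {
  return isApple ? true : Boolean(config.keepClipOnCpu);
}

// From 0.11.2: the configured value is forwarded on every platform.
function keepClipOnCpuAfter(config) {
  return Boolean(config.keepClipOnCpu);
}
```

For a caller on macOS passing `{ keepClipOnCpu: false }`, the first function returns `true` and the second returns `false`.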


04 Jun 12:49

github-actions

sdk-v0.12.2

2a45919

QVAC SDK v0.12.2

📦 NPM: https://www.npmjs.com/package/@qvac/sdk/v/0.12.2

This patch release unblocks React Native and BareKit apps that bundle @qvac/sdk or @qvac/bare-sdk. Metro and Bare static analysis no longer reject the config loader, and clients can import the model registry through a dedicated subpath without pulling the full SDK graph into the bundle.

New APIs

`@qvac/sdk/models` and `@qvac/bare-sdk/models` subpaths

React Native apps that only need model constant names previously had to import from the package root, which dragged server-side modules into Metro. v0.12.2 adds a ./models export on both @qvac/sdk and @qvac/bare-sdk so you can depend on the registry alone.

import { LLAMA_3_2_1B_INST_Q4_0 } from "@qvac/sdk/models";
// or on Bare-only clients:
import { LLAMA_3_2_1B_INST_Q4_0 } from "@qvac/bare-sdk/models";

Bug Fixes

Bare config loader works under Metro static analysis

BareKit and Expo consumers could fail at bundle time with errors such as Invalid call: import(filePath) when the SDK resolved qvac.config.js. The Bare config loader used dynamic import() with a runtime path, which Metro and Bare reject because the target is not a string literal.

v0.12.2 loads .js and .json config files with require(filePath) instead, which satisfies static analysis while keeping the same resolution order (QVAC_CONFIG_PATH, then project-root qvac.config.js / qvac.config.json, then defaults). Supported extensions are centralized in SUPPORTED_CONFIG_FILE_EXTS so discovery and validation stay aligned. TypeScript config files (.ts) are explicitly rejected on the Bare path with a clear error — use .js or .json in RN/Bare projects.
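The resolution order can be sketched as a pure function. This is an assumption-laden illustration, not the SDK's loader: the function name, the injected `exists` callback, the contents of the extension list, and the choice to validate an explicit path by extension only are all hypothetical. Only QVAC_CONFIG_PATH, the qvac.config.js / qvac.config.json file names, and the SUPPORTED_CONFIG_FILE_EXTS identifier come from the notes above.

```javascript
// Hypothetical sketch of the described resolution order.
// The list contents are assumed from "loads .js and .json config files".
const SUPPORTED_CONFIG_FILE_EXTS = [".js", ".json"];

function resolveConfigPath({ env, projectRoot, exists }) {
  // 1. An explicit QVAC_CONFIG_PATH wins, but must use a supported extension.
  const explicit = env.QVAC_CONFIG_PATH;
  if (explicit) {
    const supported = SUPPORTED_CONFIG_FILE_EXTS.some((ext) => explicit.endsWith(ext));
    if (!supported) {
      throw new Error(`Unsupported config file: ${explicit} (use .js or .json)`);
    }
    return explicit;
  }
  // 2. Otherwise look for qvac.config.js, then qvac.config.json, in the project root.
  for (const ext of SUPPORTED_CONFIG_FILE_EXTS) {
    const candidate = `${projectRoot}/qvac.config${ext}`;
    if (exists(candidate)) return candidate;
  }
  // 3. Nothing found: the caller falls back to defaults.
  return null;
}
```

The returned path would then be passed to `require(filePath)` rather than a dynamic `import()`, as described above.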


05 Jun 07:32

github-actions

diagnostics-v0.1.2

7084527

QVAC diagnostics Lib v0.1.2

This patch aligns @qvac/diagnostics with the monorepo’s simplified package layout and streamlines runtime and OS detection when building diagnostic reports.

Features

Monorepo layout alignment

The package now lives under the standard packages/diagnostics tree from the monorepo path simplification. Published entry points are unchanged; release and CI follow the same patterns as other QVAC add-on libraries.

Other

Simpler runtime and environment detection

Environment collection uses which-runtime for platform, architecture, and runtime version, and resolves os through package imports so Bare and Node get the right implementation without probing bare-process or multiple fallback require paths at load time.

const w = require('which-runtime')
const os = (w.isNode || w.isBare) ? require('os') : null

Hardware probing (CPU model, core count, memory) still uses os when available.

Pull Requests

#1860 - QVAC-16441 feat: simplify package folders, files and paths in the monorepo
#2157 - simplify


03 Jun 11:10

github-actions

sdk-v0.12.1

c131d38

QVAC SDK v0.12.1

📦 NPM: https://www.npmjs.com/package/@qvac/sdk/v/0.12.1

This is a patch release on top of v0.12.0. It surfaces two new error classes so callers can distinguish a crashed bare worker from an in-flight call cancelled by SDK shutdown, and it fixes a Qwen 3.5/3.6 tool-call regression where capitalised booleans were silently dropping the entire tool call.

New APIs

Distinguish bare worker crashes from shutdown cancellations

Calls made through a bare worker (e.g. sdk.embed, sdk.complete) previously rejected with a generic RPC error if the worker process died mid-request or if sdk.close() was called while the request was in flight. Both cases looked identical to callers, so retry/UX logic had to guess.

v0.12.1 introduces two structured RPC errors that propagate from the worker bridge:

WorkerCrashedError — the bare worker died unexpectedly. Exposes exitCode and exitSignal so you can tell a SIGKILL from a clean non-zero exit and decide whether to respawn.
WorkerShutdownError — the SDK is shutting down (sdk.close() was called) while this request was still in flight. Safe to swallow on intentional teardown; surfaces an actionable label for callers who want to log it.

import { WorkerCrashedError, WorkerShutdownError } from "@qvac/sdk";

try {
  await sdk.embed({ modelId, text: "hi" });
} catch (err) {
  if (err instanceof WorkerCrashedError) {
    // err.exitCode, err.exitSignal — worker died, decide whether to respawn.
  } else if (err instanceof WorkerShutdownError) {
    // SDK is shutting down; this call was cancelled by close().
  }
}

Existing catch (err) blocks that don't narrow by class continue to work unchanged — the new classes both extend the same RPC error base.

Bug Fixes

Qwen 3.5/3.6 tool calls with capitalised booleans no longer drop silently

Qwen 3.5/3.6 (the default tool-calling family) intermittently emits Python-style True / False for boolean parameters instead of the JSON-strict true / false. The qwen35 parser only accepted the exact lowercase literals, so coercion threw, the parser returned an empty toolCalls array, and the raw <tool_call>…</tool_call> markup leaked into the assistant's final text answer — there was no PARSE_ERROR, the tool call just vanished.

v0.12.1 lowercases the value before comparing in the boolean coercion path, so True, False, TRUE, and FALSE all coerce correctly. Genuinely invalid values (maybe, 0, null) still throw PARSE_ERROR — the relaxation is intentionally scoped to casing. Other tool-call dialects are unaffected.
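A minimal sketch of case-insensitive boolean coercion along these lines is shown below. The function name and the way the error carries a `code` of PARSE_ERROR are hypothetical; the actual qwen35 parser may structure this differently.

```javascript
// Sketch only; coerceBoolean and the error shape are hypothetical.
function coerceBoolean(raw) {
  // Lowercase before comparing, so True / FALSE are accepted.
  const normalized = String(raw).toLowerCase();
  if (normalized === "true") return true;
  if (normalized === "false") return false;
  // Anything else (maybe, 0, null) is still rejected.
  const err = new Error(`Invalid boolean value: ${raw}`);
  err.code = "PARSE_ERROR";
  throw err;
}
```

The relaxation covers casing only: `coerceBoolean("True")` and `coerceBoolean("FALSE")` succeed, while `coerceBoolean("maybe")` and `coerceBoolean("0")` still throw.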

