Merged (104 commits)
854c6b4
Restore Qwen3.5 / Gemma4 / PaddleOCR-VL tests + Mali coopmat fix
zoq May 3, 2026
1586787
API port + Gemma4 tool-call fix.
zoq May 4, 2026
d87331b
Wire addon/src/patches ahead of the vcpkg include path to pick up the…
zoq May 4, 2026
3f1213d
API port + Gemma4 tool-call fix.
zoq May 4, 2026
47b8371
Split iOS heavy4 into three single-test specs (heavy4 = OcrLighton, n…
zoq May 4, 2026
15b7c50
Drop LlamacppUtils.hpp patch override; bump addon-cpp to 1.1.7
gianni-cor May 4, 2026
7d4c97b
Merge branch 'main' into cpp-sanity-fixes-rebased
gianni-cor May 4, 2026
6165062
Merge branch 'main' into cpp-sanity-fixes-rebased
gianni-cor May 4, 2026
fd81176
Cap ocr-lighton predict to 1800 (desktop) / 768 (mobile) so the Light…
zoq May 4, 2026
d628d48
Rewrite sliding-context test to use the post-GGML_PAD effective n_ctx…
zoq May 4, 2026
1408896
Allow embed batching test to override ctx_size and pin gte-large to b…
zoq May 4, 2026
952e1f5
Fix reverse-prompt scenario by removing comma, space, listing both 'p…
zoq May 4, 2026
5d2f2fa
Sanitize media Uint8Array prompts before logging to avoid V8 Zone OOM.
zoq May 4, 2026
d3cf289
Merge branch 'main' into cpp-sanity-fixes-rebased
gianni-cor May 4, 2026
0767fc5
Use Qwen3 family chat-template to fix Qwen3.5-0.8B gibberish output o…
zoq May 4, 2026
0cb7925
Update portfiles to point to the latest fabric.
zoq May 4, 2026
ceb9bb2
Revert "Allow embed batching test to override ctx_size and pin gte-la…
zoq May 5, 2026
b59d9a0
Raise AfriqueGemma cancel maxWait to 60s, and apply the use_jinja gat…
zoq May 5, 2026
160e8fe
Merge branch 'main' into cpp-sanity-fixes-rebased
gianni-cor May 5, 2026
e257a19
Drop the retired AfriqueGemma integration tests.
zoq May 5, 2026
3938f04
Update portfiles to point to the latest head.
zoq May 5, 2026
60ab420
Merge branch 'main' into cpp-sanity-fixes-rebased
gianni-cor May 5, 2026
3bbfbdd
Update portfiles to point to the latest head.
zoq May 5, 2026
7847ee2
Drop qwen35 from the Qwen3-template detection and the supported-finet…
zoq May 5, 2026
e7d4e22
Update portfiles to point to the latest head.
zoq May 5, 2026
b0cc8c9
Enable coopmat.
zoq May 5, 2026
880551d
Drop the Qwen3 use_jinja override pairing now that qwen35 is no longe…
zoq May 6, 2026
4534f9c
Use only general.architecture for Qwen3 detection so Qwen3.5 stops ge…
gianni-cor May 6, 2026
b7baccc
Accept HuggingFace function-call XML in extractToolCalls so the Qwen3…
gianni-cor May 6, 2026
5917a77
Bump n_predict in the Qwen3.5 basic and multi-turn integration tests …
gianni-cor May 6, 2026
8bfab83
Enable coopmat and point to the latest fabric.
zoq May 6, 2026
b40fb05
Merge branch 'main' into cpp-sanity-fixes-rebased
gianni-cor May 6, 2026
cc7748f
Route Qwen3.5 inference and all finetuning on Mali to CPU, disable Vu…
zoq May 6, 2026
9f99a3d
Merge branch 'main' into cpp-sanity-fixes-rebased
gianni-cor May 6, 2026
544cc7d
Point to the latest fabric version.
zoq May 6, 2026
afd756d
Merge branch 'main' into cpp-sanity-fixes-rebased
gianni-cor May 7, 2026
7e3e870
Force Bert to the CPU on Mali.
zoq May 7, 2026
9d18855
Run finetuning on Mali GPU.
zoq May 7, 2026
4c90722
Run Qwen 3.5 on Mali GPU.
zoq May 7, 2026
992dba3
Point to the latest fabric version and enable coopmat path.
zoq May 7, 2026
59fb054
Merge branch 'main' into cpp-sanity-fixes-rebased
gianni-cor May 7, 2026
ad9c01e
Merge branch 'main' into cpp-sanity-fixes-rebased
gianni-cor May 7, 2026
10200ff
vcpkg: drop per-package qvac-fabric overlays
gianni-cor May 7, 2026
2e3fcd4
vcpkg: bump qvac-fabric version constraint to 8189.0.0
gianni-cor May 7, 2026
7ca8653
llm/embed/nmtcpp: bump versions for qvac-fabric 8189.0.0
gianni-cor May 7, 2026
3163389
nmtcpp: opt into qvac-fabric gpu-backends feature; downgrade bump to …
gianni-cor May 7, 2026
3e9da07
nmtcpp: re-bump to 3.0.0 (major)
gianni-cor May 7, 2026
d38b211
vcpkg: pin qvac-fabric to >=8189.0.0#1
gianni-cor May 7, 2026
cd0d009
docs: drop overlay-removal note from changelogs
gianni-cor May 7, 2026
a7b76ae
test/llm: restore AfriqueGemma integration tests (desktop-only)
gianni-cor May 7, 2026
930fe3c
docs(llm): drop AfriqueGemma test restoration changelog note
gianni-cor May 7, 2026
9b35c55
test/llm: switch AfriqueGemma desktop-only skip to in-test pattern
gianni-cor May 7, 2026
938b5c4
test/llm: skip ocr-lighton on mobile
gianni-cor May 7, 2026
e9bb053
ci: revert workflow timeout change for llm mobile integration
gianni-cor May 7, 2026
24de4d6
addons: disable flash-attn by default on the OpenCL backend
gianni-cor May 7, 2026
4824f10
Merge branch 'main' into cpp-sanity-fixes-rebased
gianni-cor May 7, 2026
d948ce1
fixup! tuneConfigMap: keep ABI for existing 4-arg test callers
gianni-cor May 7, 2026
5ba1b51
Add QWen 3.5 vision test.
zoq May 7, 2026
929ce85
Route vision models with mmproj to CPU on Apple M1.
zoq May 7, 2026
18e122d
Route only the projector to CPU on Apple M1.
zoq May 7, 2026
83db7a4
run qwen3-5.test.js on IOS GPU
gianni-cor May 7, 2026
bccd1b8
js lint
gianni-cor May 7, 2026
98b8aa1
Recognize Gemma 4 channel reasoning markers in Qwen3ReasoningUtils, a…
zoq May 8, 2026
2cc1a76
Wire reasoning-budget config to inputs.enable_thinking so passing rea…
zoq May 8, 2026
9de21ed
vcpkg: bump qvac-fabric to >=8189.0.1
gianni-cor May 8, 2026
8fe52e4
Merge branch 'main' into cpp-sanity-fixes-rebased
gianni-cor May 8, 2026
313a18b
Disable the embed addon's BERT-on-Mali CPU override.
zoq May 8, 2026
f881604
Prepend <think> opener to the visible stream when the chat template f…
zoq May 8, 2026
2ac5de0
Remove the Mali detection plumbing from the embed addon now that BERT…
zoq May 8, 2026
d2e4885
Bump n_predict and ctx_size in the Qwen3.5 reasoning-budget baseline …
zoq May 8, 2026
2f5c079
Restore the mobile finetune dataset to 8 samples.
zoq May 8, 2026
e29836d
Merge remote-tracking branch 'upstream/main' into pr-1874-fabric-8189
gianni-cor May 9, 2026
36de6ec
test: drop AfriqueGemma + MedGemma + Dolphin-MoE tests
gianni-cor May 9, 2026
a12f325
Revert "test: drop AfriqueGemma references from packages/sdk/tests-qvac"
gianni-cor May 9, 2026
a5e4b0b
Restore packages/llm-llamacpp/docs/afriquegemma-translation.md
gianni-cor May 9, 2026
3d7245a
chore: pin qvac-fabric to 8189.0.2 via overlay-ports for testing
gianni-cor May 9, 2026
bf78dc2
chore: pin overlay qvac-fabric to temp-8189 tip f686a1324
gianni-cor May 9, 2026
ec44a59
ci: extend Android LLM mobile test timeouts
gianni-cor May 9, 2026
dde1483
vcpkg: drop qvac-fabric overlay-ports, bump version>= to 8189.0.2
gianni-cor May 9, 2026
243ab81
refactor(llm): drop dead sawMali plumbing from BackendSelection
gianni-cor May 9, 2026
bd1368a
docs(llm): explain why MtmdLlmContext skips inside_reasoning flip
gianni-cor May 9, 2026
f4f0415
fix(llm): narrow tool-call args quoter to leading bare key only
gianni-cor May 9, 2026
f58fe0e
revert(llm): drop synthetic <tool_call>{json}</tool_call> post-proces…
gianni-cor May 10, 2026
87e6c35
test(llm): parse Gemma 4 native tool-call dialect in gemma4.test.js
gianni-cor May 10, 2026
34a9596
revert(llm): drop Apple M1 detection + projector-CPU routing
gianni-cor May 10, 2026
fffd499
revert(llm): drop dead Gemma 4 markers from updateQwen3ReasoningBuffer
gianni-cor May 10, 2026
798c332
test(llm): switch gemma4 fixtures from unsloth to bartowski
gianni-cor May 10, 2026
2843297
test(llm): unblock gemma4 image test on mobile + fix ctx overflow
gianni-cor May 10, 2026
395434a
refactor(llm): drop dead selectToolsCompactMarker(string) overload
gianni-cor May 10, 2026
46cde0c
test(llm): drop redundant useCpuForVision alias; vision runs on GPU o…
gianni-cor May 10, 2026
08f3462
docs(llm): correct thinkingForcedOpen_ comment re: gemma4
gianni-cor May 10, 2026
018c064
docs(changelog): refresh PR-1874 entries to reflect actual shipped scope
gianni-cor May 10, 2026
3b13325
docs(changelog): trim items that round-trip to net-zero in the PR
gianni-cor May 10, 2026
3c65ce1
docs(notice): regenerate NOTICE for embed-llamacpp, llm-llamacpp, tra…
gianni-cor May 10, 2026
a746023
test(llm): make gemma4 reasoning-budget test tolerate model-emitted r…
gianni-cor May 10, 2026
c2a0aa1
types(llm): declare reasoning_budget in LlamaConfig
gianni-cor May 10, 2026
708a453
feat(llm): allow per-request reasoning_budget override in run()
gianni-cor May 10, 2026
b478a8b
test(llm): cover per-request reasoning_budget override on Qwen3.5
gianni-cor May 10, 2026
90e26fa
feat(llm): case-insensitive antiprompt substring matching
gianni-cor May 11, 2026
dd92918
test(llm): stress case-insensitive antiprompt with PiZzA mixed-case e…
gianni-cor May 11, 2026
f42b139
fix(llm): validate reasoning_budget before truncating to int
gianni-cor May 11, 2026
53a79f9
Merge branch 'main' into cpp-sanity-fixes-rebased
gianni-cor May 11, 2026
9415a8b
fix(llm): use std::from_chars for reasoning_budget load-time parse
gianni-cor May 11, 2026
8bd6582
Merge branch 'main' into cpp-sanity-fixes-rebased
gianni-cor May 11, 2026
@@ -1145,13 +1145,14 @@ jobs:
local pool_arn="$1"
local name="$2"
local spec_arn="$3"
local job_timeout="${4:-60}"
aws devicefarm schedule-run \
--project-arn "$PROJECT_ARN" \
--device-pool-arn "$pool_arn" \
--app-arn "$APP_ARN" \
--name "$name" \
--test "type=APPIUM_NODE,testPackageArn=$TEST_PACKAGE_ARN,testSpecArn=$spec_arn" \
--execution-configuration jobTimeoutMinutes=60 \
--execution-configuration jobTimeoutMinutes=$job_timeout \
--query 'run.arn' --output text
}

@@ -1166,7 +1167,7 @@ jobs:
RUN_ARN_1=$(schedule_run_with_pool "$POOL_ARN" "$RUN_NAME-Android-GroupA" "$TEST_SPEC_ARN_A")
echo "βœ… Android Group A scheduled: $RUN_ARN_1"

RUN_ARN_2=$(schedule_run_with_pool "$POOL_ARN" "$RUN_NAME-Android-GroupB" "$TEST_SPEC_ARN_B")
RUN_ARN_2=$(schedule_run_with_pool "$POOL_ARN" "$RUN_NAME-Android-GroupB" "$TEST_SPEC_ARN_B" 90)
echo "βœ… Android Group B scheduled: $RUN_ARN_2"

echo "run_arn_1=$RUN_ARN_1" >> $GITHUB_OUTPUT
4 changes: 4 additions & 0 deletions packages/qvac-lib-infer-llamacpp-embed/CMakeLists.txt
@@ -40,6 +40,10 @@ set(CMAKE_POSITION_INDEPENDENT_CODE ON)
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)

find_path(QVAC_LIB_INFERENCE_ADDON_CPP_INCLUDE_DIRS "qvac-lib-inference-addon-cpp/JsInterface.hpp")
# llama-targets.cmake transitively requires OpenSSL::SSL via cpp-httplib's
# IMPORTED interface. Make OpenSSL discoverable before find_package(llama)
# so the target chain resolves on local builds.
find_package(OpenSSL)
find_package(llama CONFIG REQUIRED)

if(WIN32)
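For context on the comment above: it describes an ordering requirement rather than a new link dependency. llama's exported config pulls in a cpp-httplib IMPORTED target whose interface names OpenSSL::SSL, so the OpenSSL targets must already be defined when find_package(llama) runs. A minimal sketch of the pattern, with illustrative project and target names not taken from this repo:

cmake_minimum_required(VERSION 3.21)
project(openssl_ordering_sketch CXX)

# Define OpenSSL::SSL / OpenSSL::Crypto before llama's config is loaded;
# otherwise the llama -> cpp-httplib -> OpenSSL::SSL interface chain may
# fail to resolve at configure/generate time.
find_package(OpenSSL)                  # deliberately not REQUIRED, as in the patch
find_package(llama CONFIG REQUIRED)

add_library(example_addon SHARED example.cpp)
target_link_libraries(example_addon PRIVATE llama)

The same guard is added to the llm addon's CMakeLists near the end of this diff.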
@@ -481,7 +481,7 @@ void BertModel::init(common_params& params) {
llama_numa_init(params.numa);

const std::string errorWhenFailed = toString(UnableToLoadModel);
common_init_result llamaInit = initFromConfig(
common_init_result_ptr llamaInit = initFromConfig(
params,
params.model.path,
singleGgufStreamedFiles_,
@@ -493,8 +493,8 @@ void BertModel::init(common_params& params) {

init_.params = params;
init_.result = std::move(llamaInit);
model_ = init_.result.model.get();
ctx_ = init_.result.context.get();
model_ = init_.result->model();
ctx_ = init_.result->context();
vocab_ = llama_model_get_vocab(model_);
batch_ = llama_batch_init(init_.params.n_batch, 0, 1);
pooling_type = llama_pooling_type(ctx_);
@@ -54,7 +54,7 @@ class BertEmbeddings {

struct BertCommonInitResult {
common_params params;
common_init_result result;
common_init_result_ptr result;
};

/// @brief Instantiates a BERT language model. An open source architecture
@@ -1,4 +1,7 @@
{
"overlay-ports": [
"./vcpkg/ports"
],
"default-registry": {
"kind": "git",
"baseline": "803c0d119ea002694963e89237c207ff6ecf47f6",
2 changes: 1 addition & 1 deletion packages/qvac-lib-infer-llamacpp-embed/vcpkg.json
@@ -10,7 +10,7 @@
},
{
"name": "qvac-lib-inference-addon-cpp",
"version>=": "1.1.5#1"
"version>=": "1.1.7"
},
{
"name": "qvac-lint-cpp",
@@ -0,0 +1,36 @@
# Function to detect Vulkan version from NDK vulkan_core.h
function(detect_ndk_vulkan_version)
string(TOLOWER "${CMAKE_HOST_SYSTEM_NAME}" host_system_name_lower)

# CMAKE_HOST_SYSTEM_PROCESSOR is unavailable here. Use a glob pattern to resolve the host prebuilt folder name instead.
file(GLOB host_dirs LIST_DIRECTORIES true "$ENV{ANDROID_NDK_HOME}/toolchains/llvm/prebuilt/${host_system_name_lower}-*")
if(host_dirs)
list(GET host_dirs 0 host_dir)
get_filename_component(host_arch "${host_dir}" NAME)
set(vulkan_core_h "$ENV{ANDROID_NDK_HOME}/toolchains/llvm/prebuilt/${host_arch}/sysroot/usr/include/vulkan/vulkan_core.h")
else()
message(FATAL "Could not find NDK host directory for ${host_system_name_lower}")
endif()

if(NOT vulkan_core_h)
message(FATAL "vulkan_core.h not found, using default version")
endif()

file(READ "${vulkan_core_h}" header_content)
string(REGEX MATCH "VK_HEADER_VERSION ([0-9]+)" version_match "${header_content}")
if(version_match)
set(header_version_3 "${CMAKE_MATCH_1}")
else()
message(FATAL "Could not extract VK_HEADER_VERSION from vulkan_core.h, using default: ${vulkan_version}")
endif()

# Extract major.minor version from VK_HEADER_VERSION_COMPLETE for download URL
string(REGEX MATCH "VK_HEADER_VERSION_COMPLETE VK_MAKE_API_VERSION\\(([0-9]+), ([0-9]+), ([0-9]+)" version_match "${header_content}")
if(version_match)
set(major "${CMAKE_MATCH_2}")
set(minor "${CMAKE_MATCH_3}")
set(vulkan_version "${major}.${minor}.${header_version_3}" PARENT_SCOPE)
else()
message(FATAL "Could not extract major.minor version from vulkan_core.h, using default: ${vulkan_version}")
endif()
endfunction()
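To make the two regular expressions above concrete, here is a standalone illustration against a made-up fragment of vulkan_core.h (the header contents and the resulting 1.3.275 are examples only, not values from this PR):

# Typical shape of the relevant lines in an NDK vulkan_core.h:
#   #define VK_HEADER_VERSION 275
#   #define VK_HEADER_VERSION_COMPLETE VK_MAKE_API_VERSION(0, 1, 3, VK_HEADER_VERSION)
set(header_content "#define VK_HEADER_VERSION 275\n#define VK_HEADER_VERSION_COMPLETE VK_MAKE_API_VERSION(0, 1, 3, VK_HEADER_VERSION)")

string(REGEX MATCH "VK_HEADER_VERSION ([0-9]+)" _ "${header_content}")
set(patch "${CMAKE_MATCH_1}")                        # "275"

string(REGEX MATCH "VK_HEADER_VERSION_COMPLETE VK_MAKE_API_VERSION\\(([0-9]+), ([0-9]+), ([0-9]+)" _ "${header_content}")
# CMAKE_MATCH_1 is the variant ("0"); major and minor are matches 2 and 3.
message(STATUS "vulkan_version would be ${CMAKE_MATCH_2}.${CMAKE_MATCH_3}.${patch}")   # 1.3.275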
@@ -0,0 +1,114 @@
vcpkg_from_github(
OUT_SOURCE_PATH SOURCE_PATH
REPO tetherto/qvac-fabric-llm.cpp
REF 02807f4010f8e08f50216796374b65c339e2c9ab
SHA512 1818cc4dd008208480d4fedc8d3d7c6510065e912ae742eeba438dcb21f79eb7b40f6b37c0f69789a450e54dba75a3e32b8a1809b41edd99a9b5b840ccd4d4f5
)

vcpkg_check_features(
OUT_FEATURE_OPTIONS FEATURE_OPTIONS
FEATURES
force-profiler FORCE_GGML_VK_PERF_LOGGER
)

if (VCPKG_TARGET_IS_ANDROID)
# NDK only comes with C headers.
# Make sure C++ header exists, it will be used by ggml tensor library.
# Need to determine installed vulkan version and download correct headers
include(${CMAKE_CURRENT_LIST_DIR}/android-vulkan-version.cmake)
detect_ndk_vulkan_version()
message(STATUS "Using Vulkan C++ wrappers from version: ${vulkan_version}")
file(DOWNLOAD
"https://github.com/KhronosGroup/Vulkan-Headers/archive/refs/tags/v${vulkan_version}.tar.gz"
"${SOURCE_PATH}/vulkan-sdk-${vulkan_version}.tar.gz"
TLS_VERIFY ON
)

file(ARCHIVE_EXTRACT
INPUT "${SOURCE_PATH}/vulkan-sdk-${vulkan_version}.tar.gz"
DESTINATION "${SOURCE_PATH}"
PATTERNS "*.hpp"
)

file(RENAME
"${SOURCE_PATH}/Vulkan-Headers-${vulkan_version}"
"${SOURCE_PATH}/ggml/src/ggml-vulkan/vulkan_cpp_wrapper"
)
endif()

set(PLATFORM_OPTIONS)

if (VCPKG_TARGET_IS_OSX OR VCPKG_TARGET_IS_IOS)
list(APPEND PLATFORM_OPTIONS -DGGML_METAL=ON)
if (VCPKG_TARGET_IS_IOS)
list(APPEND PLATFORM_OPTIONS -DGGML_BLAS=OFF -DGGML_ACCELERATE=OFF)
endif()
else()
list(APPEND PLATFORM_OPTIONS -DGGML_VULKAN=ON)
endif()

if(VCPKG_TARGET_IS_ANDROID)
set(DL_BACKENDS ON)
list(APPEND PLATFORM_OPTIONS
-DGGML_BACKEND_DL=ON
-DGGML_CPU_ALL_VARIANTS=ON
-DGGML_CPU_REPACK=ON)
else()
set(DL_BACKENDS OFF)
endif()

if (VCPKG_TARGET_IS_ANDROID)
# Keep VK_KHR_cooperative_matrix and VK_NV_cooperative_matrix2 enabled so the
# Mali NaN workaround (qvac-fabric c79a8851: dequant-to-F16 + F32 accumulation
# for TQ1/TQ2 on ARM) can take effect. With coopmat disabled, ctx->device->
# coopmat_support is false and the fix's branches are skipped.
# OpenCL stays enabled for Adreno (which doesn't depend on these toggles).
list(APPEND PLATFORM_OPTIONS -DGGML_OPENCL=ON)
endif()

vcpkg_cmake_configure(
SOURCE_PATH "${SOURCE_PATH}"
DISABLE_PARALLEL_CONFIGURE
OPTIONS
-DGGML_NATIVE=OFF
-DGGML_CCACHE=OFF
-DGGML_OPENMP=OFF
-DGGML_LLAMAFILE=OFF
-DLLAMA_MTMD=ON
-DLLAMA_CURL=OFF
-DLLAMA_BUILD_TESTS=OFF
-DLLAMA_BUILD_TOOLS=OFF
-DLLAMA_BUILD_EXAMPLES=OFF
-DLLAMA_BUILD_SERVER=OFF
-DLLAMA_ALL_WARNINGS=OFF
${PLATFORM_OPTIONS}
${FEATURE_OPTIONS}
)

vcpkg_cmake_install()
vcpkg_cmake_config_fixup(
PACKAGE_NAME llama)
vcpkg_cmake_config_fixup(
PACKAGE_NAME ggml)

vcpkg_copy_pdbs()
vcpkg_fixup_pkgconfig()

file(MAKE_DIRECTORY "${CURRENT_PACKAGES_DIR}/tools/${PORT}")
file(RENAME "${CURRENT_PACKAGES_DIR}/bin/convert_hf_to_gguf.py" "${CURRENT_PACKAGES_DIR}/tools/${PORT}/convert-hf-to-gguf.py")
file(INSTALL "${SOURCE_PATH}/gguf-py" DESTINATION "${CURRENT_PACKAGES_DIR}/tools/${PORT}")
file(RENAME "${CURRENT_PACKAGES_DIR}/bin/vulkan_profiling_analyzer.py" "${CURRENT_PACKAGES_DIR}/tools/${PORT}/vulkan_profiling_analyzer.py")

if (NOT VCPKG_BUILD_TYPE)
file(REMOVE "${CURRENT_PACKAGES_DIR}/debug/bin/convert_hf_to_gguf.py")
endif()

file(REMOVE_RECURSE "${CURRENT_PACKAGES_DIR}/debug/include")
file(REMOVE_RECURSE "${CURRENT_PACKAGES_DIR}/debug/share")

if (NOT DL_BACKENDS AND VCPKG_LIBRARY_LINKAGE MATCHES "static")
file(REMOVE_RECURSE "${CURRENT_PACKAGES_DIR}/bin")
file(REMOVE_RECURSE "${CURRENT_PACKAGES_DIR}/debug/bin")
endif()

vcpkg_install_copyright(FILE_LIST "${SOURCE_PATH}/LICENSE")
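A side note on the force-profiler feature declared near the top of this portfile: vcpkg_check_features maps each requested feature to a -D<option>=ON/OFF entry in FEATURE_OPTIONS, which vcpkg_cmake_configure then forwards to the ggml build. A rough sketch of that expansion, assuming standard vcpkg semantics:

# Consumers opt in with e.g. `vcpkg install qvac-fabric[force-profiler]` or a
# "features": ["force-profiler"] entry on the dependency in their manifest.
vcpkg_check_features(
    OUT_FEATURE_OPTIONS FEATURE_OPTIONS
    FEATURES
        force-profiler FORCE_GGML_VK_PERF_LOGGER
)
# With the feature enabled, FEATURE_OPTIONS contains -DFORCE_GGML_VK_PERF_LOGGER=ON;
# without it, -DFORCE_GGML_VK_PERF_LOGGER=OFF. Either way it rides along here:
vcpkg_cmake_configure(
    SOURCE_PATH "${SOURCE_PATH}"
    OPTIONS
        ${FEATURE_OPTIONS}
)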
@@ -0,0 +1,27 @@
{
"name": "qvac-fabric",
"version": "7248.2.3",
"port-version": 1,
"description": "LLM inference in C/C++",
"homepage": "https://github.com/tetherto/qvac-fabric-llm.cpp",
"license": "MIT",
"dependencies": [
{
"name": "opencl",
"platform": "android"
},
{
"name": "vcpkg-cmake",
"host": true
},
{
"name": "vcpkg-cmake-config",
"host": true
}
],
"features": {
"force-profiler": {
"description": "Force vk performance logging in ggml"
}
}
}
4 changes: 4 additions & 0 deletions packages/qvac-lib-infer-llamacpp-llm/CMakeLists.txt
@@ -32,6 +32,10 @@ configure_file(${VCPKG_INSTALLED_PATH}/share/qvac-lint-cpp/.clang-tidy

find_path(PICOJSON_INCLUDE_DIRS "picojson/picojson.h")
find_path(QVAC_LIB_INFERENCE_ADDON_CPP_INCLUDE_DIRS "qvac-lib-inference-addon-cpp/JsInterface.hpp")
# llama-targets.cmake transitively requires OpenSSL::SSL via cpp-httplib's
# IMPORTED interface. Make OpenSSL discoverable before find_package(llama)
# so the target chain resolves on local builds.
find_package(OpenSSL)
find_package(llama CONFIG REQUIRED)
# Required to call llama.cpp's `json_schema_to_grammar()` for per-request
# JSON-Schema → GBNF conversion. The function signature lives in libcommon