Description
llama.cpp compile + Vulkan
- cmake -B build -DGGML_VULKAN=ON
- cmake -B build -DGGML_VULKAN=ON -DGGML_NATIVE=OFF -DGGML_CPU_ARM_ARCH=native
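(For context: GGML_VULKAN=ON enables the Vulkan backend and GGML_NATIVE=OFF disables native CPU tuning; GGML_CPU_ARM_ARCH should only matter on ARM builds, and the log below reports "x86 detected", so I assume it is a no-op here.)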
- When configuring, I see the following:
(llama-env) PS C:\llama-dev\llama.cpp> cmake -B build -DGGML_VULKAN=ON
-- Selecting Windows SDK version 10.0.22621.0 to target Windows 10.0.26200.
-- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
-- CMAKE_SYSTEM_PROCESSOR: AMD64
-- CMAKE_GENERATOR_PLATFORM:
-- Including CPU backend
-- x86 detected
-- Adding CPU backend variant ggml-cpu:
-- Vulkan found
-- GL_KHR_cooperative_matrix supported by glslc
-- GL_NV_cooperative_matrix2 supported by glslc
-- GL_EXT_integer_dot_product supported by glslc
-- Including Vulkan backend
- cmake --build . --config Release (run from inside the build directory)
- I then see ggml-vulkan.dll under build\bin\Release\
- (llama-env) PS C:\llama-dev\llama.cpp\build\bin\Release> .\llama-server.exe -m "C:\AI LLMS\gemma-3-12b-it-Q4_K_M.gguf"
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon RX 580 Series (AMD proprietary driver) | uma: 0 | fp16: 0 | warp size: 64 | shared memory: 32768 | int dot: 0 | matrix cores: none
error: invalid argument: LLMS\gemma-3-12b-it-Q4_K_M.gguf
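It looks like the model path was split at the space in "AI LLMS", so only LLMS\gemma-3-12b-it-Q4_K_M.gguf reached the argument parser, even though I quoted the path. I assume a path without spaces (e.g. C:\AI-LLMS\gemma-3-12b-it-Q4_K_M.gguf) or PowerShell single quotes (-m 'C:\AI LLMS\gemma-3-12b-it-Q4_K_M.gguf') would avoid the split, but I have not confirmed that.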
Separately, is it only possible to use this Vulkan build through llama-server.exe, and not from Python as usual, like:
import llama_cpp
from llama_cpp import Llama
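
For completeness, here is a minimal sketch of what I would like to run instead of the server. It assumes llama-cpp-python is installed with Vulkan enabled, e.g. by setting CMAKE_ARGS="-DGGML_VULKAN=on" before pip install llama-cpp-python; note that this builds its own copy of llama.cpp rather than picking up the ggml-vulkan.dll from the build above:

from llama_cpp import Llama

# Assumes llama-cpp-python itself was built with Vulkan, e.g.:
#   $env:CMAKE_ARGS = "-DGGML_VULKAN=on"
#   pip install llama-cpp-python --no-cache-dir
llm = Llama(
    model_path=r"C:\AI LLMS\gemma-3-12b-it-Q4_K_M.gguf",  # raw string keeps backslashes literal
    n_gpu_layers=-1,  # offload all layers to the Vulkan device
)
out = llm("Hello, how are you?", max_tokens=32)
print(out["choices"][0]["text"])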