LLAMA Compiling with Vulkan #2006

Open
@LaKanDoR

Description

LLAMA Compile + Vulkan

  1. cmake -B build -DGGML_VULKAN=ON

  2. cmake -B build -DGGML_VULKAN=ON -DGGML_NATIVE=OFF -DGGML_CPU_ARM_ARCH=native

  3. I see the following:

(llama-env) PS C:\llama-dev\llama.cpp> cmake -B build -DGGML_VULKAN=ON
-- Selecting Windows SDK version 10.0.22621.0 to target Windows 10.0.26200.
-- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
-- CMAKE_SYSTEM_PROCESSOR: AMD64
-- CMAKE_GENERATOR_PLATFORM:
-- Including CPU backend
-- x86 detected
-- Adding CPU backend variant ggml-cpu:
-- Vulkan found
-- GL_KHR_cooperative_matrix supported by glslc
-- GL_NV_cooperative_matrix2 supported by glslc
-- GL_EXT_integer_dot_product supported by glslc
-- Including Vulkan backend

  4. cmake --build . --config Release
  5. I see ggml-vulkan.dll under \bin\Release\
  6. (llama-env) PS C:\llama-dev\llama.cpp\build\bin\Release> .\llama-server.exe -m "C:\AI LLMS\gemma-3-12b-it-Q4_K_M.gguf"
    ggml_vulkan: Found 1 Vulkan devices:
    ggml_vulkan: 0 = Radeon RX 580 Series (AMD proprietary driver) | uma: 0 | fp16: 0 | warp size: 64 | shared memory: 32768 | int dot: 0 | matrix cores: none
    error: invalid argument: LLMS\gemma-3-12b-it-Q4_K_M.gguf

Is it only possible to use llama-server.exe?

And not use it the normal way, like:

import llama_cpp
from llama_cpp import Llama
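
For clarity, what I mean by using it "as normal" is something like the sketch below. This assumes llama-cpp-python was installed with the Vulkan backend enabled (typically by setting CMAKE_ARGS to -DGGML_VULKAN=on before pip install); the n_gpu_layers and n_ctx values are just placeholders, and the model path is the same one as in the command above.

# Sketch: assumes a Vulkan-enabled llama-cpp-python build, e.g. in PowerShell:
#   $env:CMAKE_ARGS = "-DGGML_VULKAN=on"
#   pip install llama-cpp-python --force-reinstall --no-cache-dir
from llama_cpp import Llama

llm = Llama(
    model_path=r"C:\AI LLMS\gemma-3-12b-it-Q4_K_M.gguf",  # raw string so the backslashes survive
    n_gpu_layers=-1,  # placeholder: offload all layers to the Vulkan device
    n_ctx=4096,       # placeholder context size
)

out = llm("Hello, who are you?", max_tokens=64)
print(out["choices"][0]["text"])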
