GPU version build not using GPU #114
Description
Hi Everyone,
I am trying to build llama-node for GPU. I followed the guide in the readme (https://llama-node.vercel.app/docs/cuda), but the version of llama.cpp I get from a manual build uses the CPU, not the GPU. When I build llama.cpp directly in the llama-sys folder with the following command:
```bash
make clean && LLAMA_CUBLAS=1 make -j
```
It gives me a perfectly fine GPU executable that works with no problem.
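For reference, this is roughly how I can tell the direct build really uses the GPU (a rough sketch; the exact log wording varies between llama.cpp revisions, so the grep pattern is just an assumption):

```bash
# A cuBLAS build reports BLAS = 1 (and the CUDA device it found) in its
# startup/system info; exact wording varies between llama.cpp revisions.
./main -m ~/CODE/models/vicuna-7b-v1.3.ggmlv3.q4_0.bin -ngl 40 -p "Hello" 2>&1 | grep -iE "blas|cuda"
```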
Am I missing something?
Here are my full build commands:
```bash
git clone https://github.com/Atome-FE/llama-node.git
cd llama-node/
rustup target add x86_64-unknown-linux-musl
git submodule update --init --recursive
pnpm install --ignore-scripts
cd packages/llama-cpp/
pnpm build:cuda
```
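As a quick sanity check on the result (assuming the library is dynamically linked), I would expect a library built with LLAMA_CUBLAS=1 to be linked against the CUDA/cuBLAS runtime:

```bash
# An empty result here suggests a CPU-only build of the library.
ldd ~/.llama-node/libllama.so | grep -iE "cuda|cublas"
```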
Then I get a libllama.so file in ~/.llama-node which, when used, does not use the GPU. Here is my script to run it:
```js
import { LLM } from "llama-node";
import { LLamaCpp } from "llama-node/dist/llm/llama-cpp.js";
import path from "path";
const model = path.resolve(process.cwd(), "~/CODE/models/vicuna-7b-v1.3.ggmlv3.q4_0.bin");
const llama = new LLM(LLamaCpp);
const config = {
modelPath: model,
enableLogging: true,
nCtx: 1024,
seed: 0,
f16Kv: false,
logitsAll: false,
vocabOnly: false,
useMlock: false,
embedding: false,
useMmap: true,
nGpuLayers: 40
};
const template = "How do I train you to read my documents?";
const prompt = `A chat between a user and an assistant. USER: ${template} ASSISTANT:`;
const params = {
nThreads: 4,
nTokPredict: 2048,
topK: 40,
topP: 0.1,
temp: 0.2,
repeatPenalty: 1,
prompt,
};
const run = async () => {
await llama.load(config);
await llama.createCompletion(params, (response) => {
process.stdout.write(response.token);
});
};
run();
```
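While the script runs, GPU activity can be watched from a second terminal; with nGpuLayers: 40, several GB of the 7B model should appear in GPU memory if the offload is working:

```bash
# Refresh every second while the node script runs; memory.used should
# jump by several GB if layers are actually offloaded.
watch -n 1 nvidia-smi --query-gpu=memory.used,utilization.gpu --format=csv
```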
Any help is appreciated.