new CLI experience #904

@galo

In ggml-org/llama.cpp#17824, llama.cpp gained a new CLI experience that reuses the llama-server infrastructure, and the previous implementation was deprecated.

I created a Rust implementation here: https://github.com/galo/llama-cpp-rs/tree/main/examples/cli. This is interesting because it lets some types of applications reuse llama-server features directly (e.g. speculative decoding) while keeping parity with the rest of the llama.cpp features. To do this, I simply exported new bindings and a safe Rust implementation of the server components. The CLI is an example of how to use this infrastructure.
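To give a rough idea of what "new bindings plus a safe Rust implementation" means in general, here is a minimal sketch of the wrapper pattern. It is not the code from the repository: `raw`, `ServerCtx`, `Server`, and every function name below are hypothetical, and the `raw` module is a pure-Rust stand-in for bindgen-style declarations so that the snippet compiles without llama.cpp.

```rust
use std::ffi::{CString, NulError};
use std::ptr::NonNull;

// Hypothetical stand-in for a raw bindings layer. A real crate would have
// `extern "C"` declarations here; these are plain Rust so the sketch is
// self-contained.
mod raw {
    use std::ffi::{c_char, CStr};

    pub struct ServerCtx {
        pub speculative: bool,
    }

    pub unsafe fn server_ctx_new(speculative: bool) -> *mut ServerCtx {
        Box::into_raw(Box::new(ServerCtx { speculative }))
    }

    pub unsafe fn server_ctx_speculative(ctx: *const ServerCtx) -> bool {
        unsafe { (*ctx).speculative }
    }

    // Placeholder for real work: returns the prompt length in bytes.
    pub unsafe fn server_ctx_prompt_len(_ctx: *const ServerCtx, prompt: *const c_char) -> usize {
        unsafe { CStr::from_ptr(prompt) }.to_bytes().len()
    }

    pub unsafe fn server_ctx_free(ctx: *mut ServerCtx) {
        drop(unsafe { Box::from_raw(ctx) });
    }
}

/// Safe owner of the raw context (hypothetical type). The pointer is
/// non-null for the whole lifetime of the value and is freed once, in `Drop`.
pub struct Server {
    ctx: NonNull<raw::ServerCtx>,
}

impl Server {
    /// Returns `None` if the raw layer hands back a null pointer.
    pub fn new(speculative: bool) -> Option<Self> {
        let ptr = unsafe { raw::server_ctx_new(speculative) };
        NonNull::new(ptr).map(|ctx| Self { ctx })
    }

    pub fn speculative(&self) -> bool {
        unsafe { raw::server_ctx_speculative(self.ctx.as_ptr()) }
    }

    /// Converts the prompt to a C string first, so interior NUL bytes are
    /// reported as an error instead of reaching the raw layer.
    pub fn prompt_len(&self, prompt: &str) -> Result<usize, NulError> {
        let c_prompt = CString::new(prompt)?;
        Ok(unsafe { raw::server_ctx_prompt_len(self.ctx.as_ptr(), c_prompt.as_ptr()) })
    }
}

impl Drop for Server {
    fn drop(&mut self) {
        unsafe { raw::server_ctx_free(self.ctx.as_ptr()) };
    }
}
```

With a wrapper like this, an application only touches `Server::new`, `speculative`, and `prompt_len`; all `unsafe` calls and the cleanup stay inside the wrapper type.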

Take a look and let me know if this is interesting. I have not done extensive testing, i.e. I did not test the Vulkan/CUDA/etc. backends.
