gguf-model-support

Here are 11 public repositories matching this topic...

brontoguana / krasis

Krasis is a hybrid LLM runtime focused on running larger models efficiently on consumer-grade, VRAM-limited hardware.

transformer inference-engine inference-optimization mixture-of-experts cpu-inference large-language-models gpu-inference llm-inference high-performance-inference hybrid-inference gguf-model-support llama-cpp-alternative

Updated May 23, 2026
C++

nareshis21 / Truelarge-RT

Android inference engine running 20B+ parameter LLMs on 4GB-8GB RAM devices. Features proprietary Layer-by-Layer (LBL) streaming, zero-copy mmap loading, and native C++/Kotlin architecture.

android cpp kotlin-android jni layered-architecture inference-engine on-device-ai edgeai llm low-ram-usage llamacpp llm-inference gguf-model-support

Updated Feb 21, 2026
Kotlin

Mainframework / Quanta

Convert and quantize LLMs

Updated Dec 30, 2025
Python

splinterhq / libsplinter

Splinter is a successful advanced AI research project to cohabitate inference and semantic governance in L3 cache and memory lanes, while simultaneously providing an attempt at standardized local POSIX-friendly tooling as building blocks on top of the provided library.

lua cache ipc bloom-filter inference pubsub atomic-design epoll lock-free kv vectors gdelt-data atomics seqlock vector-search vector-database wasmedge llama-cpp gguf-model-support

Updated May 13, 2026
C

Chintanpatel24 / flint

Make your digital brain inside your computer

markdown canvas memory workspace note-taking hacktoberfest openaiapi apis-support llama-cpp-python geminiapi gguf-model-support claudeapi local-vault agent-support electron-graph ollama-support

Updated May 25, 2026
TypeScript

mamei16 / MADLAD-400-WebUI

A simple Gradio app for local translation using the GGUF versions of MADLAD-400

nlp translation machine-translation gradio llamacpp gguf gguf-model-support

Updated Dec 8, 2025
Python

Ozgur-al / local-rag-server

Privacy-first Local RAG Server: Chat with PDF & DOCX using GGUF models via llama.cpp and Qdrant. A lightweight, standalone FastAPI server with a clean HTML UI. High-performance, fully offline document intelligence. No Ollama, no cloud, no API keys.

python document-search rag fastapi qdrant llm llama-cpp local-llm gguf offline-ai rag-pipeline rag-chatbot gguf-model-support

Updated Feb 24, 2026
Python

headlessripper / Nectar-X-Studio

Nectar-X-Studio is a local AI inference application that lets users download and run large language models, and create and run agents, on their own machine. With no internet connection required, Nectar provides privacy-first, high-performance inference using open-source models from Hugging Face, Ollama, and beyond.

ai ml ai-agents infrence stable-diffusion gguf-model-support

Updated Mar 20, 2026
Python

frinknet / gelli

Containerized LLM for any use case, big or small

llama llm llmops llamacpp ggml llm-training gguf-model-support

Updated Mar 23, 2026
Shell

aTh1ef / tavily-gemma-researcher

AI tool to help users research using local LLMs and automated web search.

python google gemma streamlit lm-studio langgraph private-llm tavily-api privacy-first-ai tavily-search next-gen-ai google-gemma-3-1b instruction-tuned-llm gguf-model-support

Updated Jun 9, 2025
Python

CreoOne / GGOOF

GGUF file format for dotnet

csharp dotnet dotnet-core csharp-library gguf gguf-model-support

Updated Mar 27, 2026
C#
