mfrederico/kblam-ollama

KBLaM using Ollama
--
Wanting to make this accessible to a wider audience of Ollama users, I've attempted a Claude+Ollama version of KBLaM, based on Anthropic's Claude reading and interpreting the whitepaper (plus some futzing around on my part).

It seems to do the trick for training, yet it feels like Ollama falls back to its standard fine-tuning and checkpoint-file creation instead of the paper's rectangular attention; the result behaves more like a key-value-pair (KVP) lookup than anything else.

This leads me to believe that to implement this properly for Ollama / llama.cpp, I'd have to change how models are run in llama.cpp itself, so the actual "rectangular attention" concept outlined in the original whitepaper can be used.
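
For reference, here is a minimal single-head sketch of what I understand "rectangular attention" to mean (my own reading of the whitepaper, not this repo's code): the N prompt tokens attend over themselves plus M pre-encoded knowledge tokens, but the knowledge tokens never act as queries, so the extra attention block is N × M rather than square:

```python
import torch
import torch.nn.functional as F

def rectangular_attention(prompt_q, prompt_k, prompt_v, kb_k, kb_v):
    """Sketch only. prompt_q/k/v: (N, d) projections of the prompt tokens;
    kb_k/kb_v: (M, d) precomputed key/value encodings of KB triples."""
    d = prompt_q.shape[-1]
    k = torch.cat([prompt_k, kb_k], dim=0)   # (N+M, d): prompt + KB keys
    v = torch.cat([prompt_v, kb_v], dim=0)   # (N+M, d): prompt + KB values
    # Queries come only from the prompt, so the score matrix is N x (N+M):
    # "rectangular" rather than the usual square self-attention.
    # (Causal masking over the prompt-to-prompt block is omitted for brevity.)
    scores = prompt_q @ k.T / d ** 0.5
    return F.softmax(scores, dim=-1) @ v     # (N, d)
```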

Using llama3.2 (3B) on an RTX 3060 for the whole shebang.
-----------------------------------------
*Based on Microsoft's KBLaM, updated to use Ollama:*
--
https://www.microsoft.com/en-us/research/blog/introducing-kblam-bringing-plug-and-play-external-knowledge-to-llms/?utm_source=kblam.ai&utm_medium=website
--
Prerequisites
You'll need to install the following packages:
torch
sentence-transformers
requests
numpy
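
e.g. pip install torch sentence-transformers requests numpy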
--
Run the training mode first: python kblam-ollama.py --mode train
Then you can query it: python kblam-ollama.py --mode query
--
Example Query:
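
Something along these lines (a hypothetical sketch of the query flow, since the prerequisites include requests and sentence-transformers: embed the question, retrieve the nearest stored triple, and call Ollama's standard /api/generate endpoint; the embedding model, KB triples, and prompt format below are my assumptions, not this repo's actual code):

```python
import requests
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

kb = [
    "Ada Lovelace | profession | mathematician",        # toy KB triples,
    "Ada Lovelace | known for | first computer program",  # illustrative only
]
kb_emb = encoder.encode(kb, convert_to_tensor=True)

def query(question: str) -> str:
    # Retrieve the closest KB triple by cosine similarity, then stuff it
    # into the prompt sent to the locally running Ollama server.
    q_emb = encoder.encode(question, convert_to_tensor=True)
    best = util.cos_sim(q_emb, kb_emb).argmax().item()
    prompt = f"Fact: {kb[best]}\nQuestion: {question}\nAnswer:"
    r = requests.post(
        "http://localhost:11434/api/generate",  # standard Ollama endpoint
        json={"model": "llama3.2", "prompt": prompt, "stream": False},
        timeout=120,
    )
    return r.json()["response"]

print(query("What was Ada Lovelace's profession?"))
```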
