Da big fork #186
edwardcapriolo started this conversation in General
Replies: 1 comment
So I am growing a rather large fork of Jlama here: edwardcapriolo/deliverance@b311cc5
Some things since the fork:
Classes like Config were really being used both to read the config from disk and as a quasi context object.
The static singletons are almost all gone
AbstractTensor and TensorCache have an odd coupling; really, TensorCache should be something like a TensorAllocator, I think. It's a bit odd that the cache makes the tensors via get(), then when you call AbstractTensor.close() it calls TensorCache.release(), which calls Tensor.clear(). It's a lot of back and forth.
E.g. AbstractTensor.copyShape() needed a reference to the static singleton TensorCache so that it could create a subclass of itself, which may or may not come from the cache depending on whether it is full... hard to understand.
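To make the TensorAllocator idea concrete, here is a minimal sketch of one possible shape: the allocator owns the pool, hands out buffers, and reclaims them on close(), so the tensor never reaches back into a static singleton. All names here (TensorAllocator, PooledTensor) are hypothetical stand-ins, not the fork's actual API.

```java
import java.util.ArrayDeque;
import java.util.Arrays;

// Hypothetical pooled tensor: close() returns it to the allocator it came from.
final class PooledTensor implements AutoCloseable {
    final float[] data;
    private final TensorAllocator owner;

    PooledTensor(float[] data, TensorAllocator owner) {
        this.data = data;
        this.owner = owner;
    }

    @Override
    public void close() {
        // Single direction of control: tensor -> its allocator, no static singleton.
        owner.release(this);
    }
}

// Hypothetical allocator that replaces the TensorCache get/release/clear dance.
final class TensorAllocator {
    private final ArrayDeque<float[]> pool = new ArrayDeque<>();
    private final int size;

    TensorAllocator(int size) { this.size = size; }

    /** Reuse a pooled buffer if one is free, otherwise allocate a fresh one. */
    PooledTensor allocate() {
        float[] buf = pool.poll();
        if (buf == null) buf = new float[size];
        return new PooledTensor(buf, this);
    }

    void release(PooledTensor t) {
        Arrays.fill(t.data, 0f); // cleared here on release, in one place
        pool.push(t.data);
    }

    int pooled() { return pool.size(); }
}
```

With try-with-resources the buffer lifecycle is explicit at the call site, instead of being split across AbstractTensor, TensorCache, and a clear() call.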
I think there is a clever way to avoid clearing the tensors each time, i.e. if I ask for a shape and know I am going to overwrite it, I don't need to zero it out each time.
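That skip-the-zeroing idea could be as simple as a flag on the acquire path. A hedged sketch (ShapePool and acquire() are made-up names for illustration):

```java
import java.util.ArrayDeque;
import java.util.Arrays;

// Hypothetical pool that only zeroes a reused buffer when the caller asks.
final class ShapePool {
    private final ArrayDeque<float[]> free = new ArrayDeque<>();
    private final int size;

    ShapePool(int size) { this.size = size; }

    /**
     * @param zeroed pass false when every element will be written before it is
     *               read (e.g. a matmul destination), skipping a redundant fill
     */
    float[] acquire(boolean zeroed) {
        float[] buf = free.poll();
        if (buf == null) return new float[size]; // fresh Java arrays are already zero
        if (zeroed) Arrays.fill(buf, 0f);
        return buf;
    }

    void release(float[] buf) { free.push(buf); }
}
```

The cost being avoided is the Arrays.fill pass over every reused buffer on a hot path; the caller takes on the obligation to fully overwrite the dirty buffer.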
Anyway, maybe look it over.
I added seed support for more predictable output.
I still think the generate() method doesn't belong where it is; it is oddly specific to a use case. I lean toward AbstractModel being subclassed, but I don't know.
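One way the subclassing idea could look: the base model exposes only the forward pass, and the use-case-specific generate() loop lives in a subclass. BaseModel/ChatModel and the toy forward() below are hypothetical, not the real Jlama signatures:

```java
// Hypothetical base: only the forward pass, no generation policy.
abstract class BaseModel {
    /** One forward step: given a token, produce the next token. */
    abstract int forward(int token);
}

// Hypothetical subclass that owns the use-case-specific generation loop.
final class ChatModel extends BaseModel {
    @Override
    int forward(int token) { return token + 1; } // toy stand-in for a real model

    int[] generate(int start, int steps) {
        int[] out = new int[steps];
        int tok = start;
        for (int i = 0; i < steps; i++) {
            tok = forward(tok);
            out[i] = tok;
        }
        return out;
    }
}
```

The trade-off is that each use case (chat, completion, embedding) gets its own subclass instead of one generate() trying to serve them all.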
I cleaned up lots of small things, some of which I can contribute back; there's some dead code, etc.
I think I want to get deliverance-core/jlama-core to the point where it actually is 4 separate submodules:
- tensor core
- tokenizers core
- generation core