Replies: 1 comment 1 reply
-
Technically, yes, but there isn't any need for it. The distilled model you listed has a dense architecture, so running it through KTransformers won't give you any acceleration. You could try the Qwen3 family of small MoE models instead, or simply switch to a different inference framework such as llama.cpp or vLLM.
No.
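If you do go the vLLM route, serving a distilled checkpoint is straightforward; a minimal sketch (the model ID below is just an example, swap in whichever distilled checkpoint you actually mean):

```python
# Minimal vLLM sketch for a dense distilled model (example model ID;
# substitute the checkpoint you actually want to run).
from vllm import LLM, SamplingParams

llm = LLM(model="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B")
params = SamplingParams(temperature=0.6, max_tokens=256)

outputs = llm.generate(["Explain what a MoE model is in one paragraph."], params)
print(outputs[0].outputs[0].text)
```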
-
I have downloaded a full set of DeepSeek-R1-Q4, and it works fine on my machine, nicknamed "Fishbowl". There's a problem, however: the model is rather too big, not elegant enough for daily tasks.
So I wonder whether I could load a distilled model from DeepSeek like this one, which is light enough even for my 6-year-old laptop to cope with. And there's another question:
Does KTransformers support regular .safetensors models?
To my knowledge, the answer is no: the command-line arguments don't include one for pointing at a safetensors model path, only GGUF files. So, any ideas?
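If GGUF really is the only accepted input format, I guess the workaround would be converting the safetensors checkpoint first with llama.cpp's converter. A rough sketch of what I have in mind (paths and quant type are placeholders for my setup, and I'm not sure the result plays nicely with all of KTransformers' optimizations):

```python
# Rough sketch: convert an HF safetensors checkpoint to GGUF so that
# KTransformers (or plain llama.cpp) can load it. Paths are placeholders.
import subprocess

model_dir = "models/DeepSeek-R1-Distill-Qwen-7B"  # folder with *.safetensors + config.json
out_file = "models/deepseek-r1-distill-qwen-7b-q8_0.gguf"

# convert_hf_to_gguf.py ships with the llama.cpp repo; q8_0 keeps the file small-ish.
subprocess.run(
    [
        "python", "llama.cpp/convert_hf_to_gguf.py", model_dir,
        "--outfile", out_file,
        "--outtype", "q8_0",
    ],
    check=True,
)
```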