Has anyone had success with pruned weights such as https://huggingface.co/AesSedai/GLM-4.6-REAP-266B-A32B?
The evals look pretty decent, and it would give a much-needed quant size/speed buff to weaker systems. Some models might now even fit in VRAM.
Apparently there are conversion issues with mainline, so it could be worth looking into. A 50% prune is probably my only chance to run Kimi at better than Q1. Certainly looking forward to a minified GLM already.
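As a rough sanity check on whether a pruned model fits in VRAM, a back-of-envelope sketch (my own illustration, not from the repo; the bits-per-weight figures are approximate and real GGUF files add some metadata overhead on top):

```python
def est_size_gib(total_params_b: float, bits_per_weight: float) -> float:
    """Approximate weight-file size in GiB for a model with
    `total_params_b` billion parameters quantized at
    `bits_per_weight` bits per weight (ignores file overhead)."""
    return total_params_b * 1e9 * bits_per_weight / 8 / 2**30

# 266B total params, per the GLM-4.6-REAP-266B-A32B repo name.
# The bpw values below are rough llama.cpp quant averages (assumed).
for name, bpw in [("Q4_K_M (~4.8 bpw)", 4.8), ("Q2_K (~2.6 bpw)", 2.6)]:
    print(f"{name}: ~{est_size_gib(266, bpw):.0f} GiB")
```

By the same arithmetic, a further 50% expert prune would roughly halve those numbers, which is why a pruned Kimi at a mid-range quant starts to look feasible where Q1 was the only option before.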