Qwen3.5 models perplexity with LM_scoring (with wiki2) #349
vince62s
started this conversation in
Show and tell
The 27B and 35B-A3B models are very similar in terms of perplexity.
With the new CUDA kernels, 35B-A3B is faster, but some benchmarks put it behind 27B in terms of quality.
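For readers comparing the numbers, here is a minimal sketch of how perplexity is usually derived from per-token log-probabilities: the exponential of the negative mean log-likelihood. The function name `perplexity` and the sample values are assumptions made for illustration. This is not the LM_scoring implementation, and the values are not real model output.

```python
import math


def perplexity(token_logprobs):
    """Return exp(-mean log-probability), using natural-log probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_logprob = sum(token_logprobs) / len(token_logprobs)
    return math.exp(-mean_logprob)


# Hypothetical input: a model that assigns probability 0.25 to every token.
# Such a model is as uncertain as a uniform choice among 4 tokens,
# so its perplexity should come out close to 4.
sample_logprobs = [math.log(0.25)] * 8
print(perplexity(sample_logprobs))
```

Lower is better, and two models with nearly equal perplexity on the same test set (tokenized the same way) predict that text about equally well, which is a different question from how they rank on downstream benchmarks.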