huggingface transformers compat #191
edwardcapriolo
started this conversation in
Ideas
The problem(s) we can run into
Our tokenizer does NOT strip leading spaces, but it DOES remove " ##".
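To make the mismatch concrete, here is a minimal sketch of the two behaviors as I read them: WordPiece-style continuation markers (" ##") get merged away, while a leading space is left in place unless it is stripped explicitly. The function name `join_wordpieces` and the `strip_leading` flag are hypothetical and only for illustration; this is not our tokenizer's code, nor the Hugging Face implementation.

```python
def join_wordpieces(tokens, strip_leading=False):
    """Join tokens with spaces, then merge WordPiece continuations.

    Hypothetical helper for illustration only.
    """
    text = " ".join(tokens)
    # Merge continuation pieces: "world ##s" becomes "worlds".
    text = text.replace(" ##", "")
    if strip_leading:
        # Optional step: drop leading spaces from the decoded text.
        text = text.lstrip(" ")
    return text


tokens = [" hello", "world", "##s"]

# Assumed current behavior: " ##" removed, leading space kept.
kept = join_wordpieces(tokens)

# Behavior with leading spaces stripped as well.
stripped = join_wordpieces(tokens, strip_leading=True)

print(repr(kept))      # ' hello worlds'
print(repr(stripped))  # 'hello worlds'
```

The two outputs differ only by the leading space, which is exactly the kind of difference that makes a like-for-like comparison fail.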
I have started working on a complete port of the tokenizers. Since we aren't dealing with permutations of kwargs, it is very hard to cross-verify anything outside of the "it works" use cases we already have. It seems like we need this, given long-running issues like the Chinese characters and Qwen. To me, these are not things we can solve with hacks: small fixes here and there will only become more complicated as we bring in more models. The more we can do to make the interfaces like-for-like, the more velocity we will get.
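As a rough idea of what cross-verification over kwargs permutations could look like, here is a small sketch. Everything in it is an assumption for illustration: `cross_verify`, `kwargs_permutations`, and the two toy tokenizers are hypothetical stand-ins, and the `lowercase` / `strip` options are not real transformers kwargs. In practice the reference side would call the upstream tokenizer and the candidate side would call the port.

```python
from itertools import product


def kwargs_permutations(grid):
    """Yield every combination of option values as a kwargs dict."""
    keys = sorted(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


def cross_verify(reference, candidate, texts, grid):
    """Run both callables over all texts and kwargs; collect mismatches."""
    mismatches = []
    for kwargs in kwargs_permutations(grid):
        for text in texts:
            expected = reference(text, **kwargs)
            actual = candidate(text, **kwargs)
            if expected != actual:
                mismatches.append((text, kwargs, expected, actual))
    return mismatches


# Toy stand-in for the reference tokenizer (hypothetical).
def reference_tokenize(text, lowercase=False, strip=False):
    if strip:
        text = text.strip()
    if lowercase:
        text = text.lower()
    return text.split(" ")


# Toy stand-in for the port; it deliberately ignores `strip`.
def ported_tokenize(text, lowercase=False, strip=False):
    if lowercase:
        text = text.lower()
    return text.split(" ")


GRID = {"lowercase": [False, True], "strip": [False, True]}
mismatches = cross_verify(reference_tokenize, ported_tokenize, [" Hello world"], GRID)

for text, kwargs, expected, actual in mismatches:
    print(kwargs, expected, actual)
```

With matching interfaces on both sides, a harness like this can be pointed at any new model's tokenizer without per-model glue code.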