Wannaphong Phatthiyaphaibun edited this page Dec 14, 2020 · 3 revisions
newmm is the code name for the next maximal matching engine in PyThaiNLP. (It is not the real name of the word tokenizer engine.) It is the default engine of pythainlp.word_tokenize. Currently, newmm is the onecut engine.
newmm versions
multi_cut (PyThaiNLP 1.4 - 1.5): Thai word segmentation with maximum matching. The original source code is by Korakot Chaovavanich. It is now the mm engine in PyThaiNLP.
onecut (PyThaiNLP 1.6 - now): Dictionary-based maximal matching word segmentation, constrained by Thai Character Cluster (TCC) boundaries. Created by Korakot Chaovavanich.
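To illustrate the idea behind dictionary-based maximal matching with boundary constraints, here is a minimal sketch. It is not PyThaiNLP's actual onecut code: the function name `maximal_matching`, the toy Latin-letter dictionary, and the `boundaries` parameter (a hypothetical stand-in for positions where a TCC boundary would allow a word to end) are all assumptions made for illustration. The sketch picks the segmentation with the fewest dictionary words, and optionally rejects any word that ends at a position outside the allowed boundary set.

```python
from typing import List, Optional, Set


def maximal_matching(
    text: str,
    dictionary: Set[str],
    boundaries: Optional[Set[int]] = None,
) -> Optional[List[str]]:
    """Segment text into the fewest dictionary words (illustrative sketch).

    boundaries is a hypothetical set of character offsets at which a word
    is allowed to end (a stand-in for TCC boundaries). If None, every
    offset is allowed. Returns None when no segmentation exists.
    """
    n = len(text)
    if boundaries is None:
        boundaries = set(range(n + 1))
    max_len = max((len(w) for w in dictionary), default=0)

    inf = float("inf")
    best = [inf] * (n + 1)   # best[i]: fewest words covering text[:i]
    back = [-1] * (n + 1)    # back[i]: start offset of the last word
    best[0] = 0

    for end in range(1, n + 1):
        if end not in boundaries:
            continue  # a word may not end inside a cluster
        for start in range(max(0, end - max_len), end):
            if best[start] + 1 < best[end] and text[start:end] in dictionary:
                best[end] = best[start] + 1
                back[end] = start

    if best[n] == inf:
        return None

    # Walk the back-pointers from the end to recover the words.
    words = []
    pos = n
    while pos > 0:
        words.append(text[back[pos]:pos])
        pos = back[pos]
    return words[::-1]


toy_dict = {"ab", "abc", "c", "cd", "d", "e", "de"}
unconstrained = maximal_matching("abcde", toy_dict)
constrained = maximal_matching("abcde", toy_dict, boundaries={2, 4, 5})
print(unconstrained)  # ['abc', 'de']
print(constrained)    # ['ab', 'cd', 'e']
```

In the second call, offset 3 is not an allowed boundary, so the otherwise shortest segmentation ("abc" + "de") is ruled out and the result falls back to a longer one that respects the boundaries. In PyThaiNLP itself, the engine is selected through pythainlp.word_tokenize, where newmm is the default.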