Please take this report with a big pinch of salt: I am not even a kuromoji user, and I did not profile the code thoroughly.
In ViterbiBuilder, kuromoji uses an FST to find every prefix of a given string that is present in the dictionary (which is encoded as the FST).
The successive calls to lookup, however, each restart from the root node of the FST. It would be better to collect all of the prefixes in a single traversal of the FST.
The headroom is worthwhile but not massive: around 15% of the time is spent in Fst.lookup, and one can hope to cut that portion in half, which would be roughly 7% of the total run time.
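
To illustrate the idea, here is a minimal sketch in Python. It is not kuromoji's code: a plain trie stands in for the FST, and every name in it (`TrieNode`, `build_trie`, `lookup`, `prefixes_restarting`, `prefixes_single_pass`) is made up for this example. The first function mimics the pattern of one lookup per candidate prefix, each starting again at the root; the second keeps its position in the structure and reports a match every time it passes through an accepting node.

```python
class TrieNode:
    """Hypothetical stand-in for an FST state."""
    __slots__ = ("children", "value")

    def __init__(self):
        self.children = {}
        self.value = None  # a non-None value marks the end of a dictionary entry


def build_trie(entries):
    """Build a trie from a {word: value} mapping."""
    root = TrieNode()
    for word, value in entries.items():
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.value = value
    return root


def lookup(root, key):
    """Exact-match lookup that always starts at the root."""
    node = root
    for ch in key:
        node = node.children.get(ch)
        if node is None:
            return None
    return node.value


def prefixes_restarting(root, text):
    """One lookup per candidate prefix; each call re-walks from the root."""
    found = []
    for end in range(1, len(text) + 1):
        value = lookup(root, text[:end])
        if value is not None:
            found.append((end, value))
    return found


def prefixes_single_pass(root, text):
    """Walk the trie once, recording every accepting node on the way."""
    found = []
    node = root
    for end, ch in enumerate(text, start=1):
        node = node.children.get(ch)
        if node is None:
            break  # no dictionary entry continues past this character
        if node.value is not None:
            found.append((end, node.value))
    return found


# Usage: both functions return (prefix_length, value) pairs.
dictionary = build_trie({"東": 1, "東京": 2, "東京都": 3, "京都": 4})
print(prefixes_single_pass(dictionary, "東京都庁"))  # [(1, 1), (2, 2), (3, 3)]
```

Both functions return the same matches. The restarting version re-reads the characters it has already consumed on every call, while the single-pass version reads each character at most once and stops as soon as no entry can continue.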