Hi,
I can remove punctuation in the Kuromoji-ES plugin by setting "discard_punctuation": "true".
How can I get the same result with Kuromoji-Java?
For example, in Kuromoji-ES, 「浅草」駅 is tokenized as:
{
  "tokens" : [
    {
      "token" : "浅草",
      "start_offset" : 1,
      "end_offset" : 3,
      "type" : "word",
      "position" : 0
    },
    {
      "token" : "駅",
      "start_offset" : 4,
      "end_offset" : 5,
      "type" : "word",
      "position" : 1
    }
  ]
}
Is there an equivalent option in Kuromoji-Java?
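To make the desired behavior concrete, here is a minimal sketch of the kind of post-filter I have in mind if there is no built-in option. It assumes the token surfaces have already been collected from Kuromoji-Java (for example via getSurface() on each token returned by tokenize()), and it treats a token as punctuation when every character falls into a Unicode punctuation, symbol, separator, or control category. The class name PunctuationFilter and both method names are hypothetical, and this category list is my own assumption, not necessarily what "discard_punctuation" does internally.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Kuromoji-Java: drops tokens made only of punctuation.
final class PunctuationFilter {

    // Assumption: these Unicode general categories count as "punctuation" here.
    static boolean isPunctuation(char ch) {
        switch (Character.getType(ch)) {
            case Character.SPACE_SEPARATOR:
            case Character.LINE_SEPARATOR:
            case Character.PARAGRAPH_SEPARATOR:
            case Character.CONTROL:
            case Character.FORMAT:
            case Character.DASH_PUNCTUATION:
            case Character.START_PUNCTUATION:
            case Character.END_PUNCTUATION:
            case Character.CONNECTOR_PUNCTUATION:
            case Character.OTHER_PUNCTUATION:
            case Character.INITIAL_QUOTE_PUNCTUATION:
            case Character.FINAL_QUOTE_PUNCTUATION:
            case Character.MATH_SYMBOL:
            case Character.CURRENCY_SYMBOL:
            case Character.MODIFIER_SYMBOL:
            case Character.OTHER_SYMBOL:
                return true;
            default:
                return false;
        }
    }

    // Returns the surfaces in their original order, skipping empty strings
    // and any surface whose characters are all punctuation.
    static List<String> discardPunctuation(List<String> surfaces) {
        List<String> kept = new ArrayList<>();
        for (String surface : surfaces) {
            boolean allPunctuation = !surface.isEmpty();
            for (int i = 0; i < surface.length() && allPunctuation; i++) {
                allPunctuation = isPunctuation(surface.charAt(i));
            }
            if (!surface.isEmpty() && !allPunctuation) {
                kept.add(surface);
            }
        }
        return kept;
    }
}
```

With the surfaces 「, 浅草, 」, 駅 as input, this keeps only 浅草 and 駅. Note that it filters surfaces only; the "position" values in the Kuromoji-ES output above would still need to be recomputed separately.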