Open
Thanks for this library. It looks fantastic.
However, the current implementation does not appear to support characters outside the ASCII range (0-127). Specifically, this check in EdgeBag.get(char c) throws whenever a character with a code above 127 appears in the input text:
    public Edge get(char c) {
        if (c != (char) (byte) c) {
            throw new IllegalArgumentException("Illegal input character " + c + ".");
        }
        ...
I am happy to dig in and try to implement support for at least the full Java char range, but before I do, is there any inherent reason for the current limitation?
The application I am considering this library for is part of a search function over a large text index, and I need to support multiple languages, most of which use characters outside the currently supported range.
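To illustrate what I believe the check does, here is a small standalone sketch. It is not the library's code: the class name ByteRoundTripDemo and the method survivesByteRoundTrip are hypothetical names of my own, and only the cast expression is taken from the snippet quoted above. Narrowing a char to a byte keeps the low 8 bits as a signed value, and widening back sign-extends, so as far as I can tell only codes 0 through 127 survive the round trip.

```java
public class ByteRoundTripDemo {

    // Hypothetical helper mirroring the quoted condition:
    // true when the char is unchanged by a char -> byte -> char round trip.
    static boolean survivesByteRoundTrip(char c) {
        return c == (char) (byte) c;
    }

    public static void main(String[] args) {
        // 'a', DEL (127), first code above ASCII (128), e-acute, a CJK ideograph
        char[] samples = {'a', 0x7F, 0x80, 0xE9, 0x4E2D};
        for (char c : samples) {
            System.out.printf("U+%04X accepted=%b%n", (int) c, survivesByteRoundTrip(c));
        }
    }
}
```

If this reading is right, any accented Latin, Cyrillic, or CJK character in the input would hit the IllegalArgumentException.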