Description
@mihnita did some performance investigation in unicode-org/icu4x#7258, and copying buffers across FFI seems to be a cost.
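JNI has GetStringChars and GetStringUTFChars, which produce a UTF-16 string and a null-terminated modified UTF-8 string respectively from a JVM String. Two caveats: the UTF-16 buffer is not guaranteed to be null-terminated (its length comes from GetStringLength), and modified UTF-8 differs from standard UTF-8 for U+0000 and for supplementary characters.
There are similar functions for primitive arrays (Get<PrimitiveType>ArrayElements and GetPrimitiveArrayCritical).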
Both of these functions may or may not perform a copy (they tell you if they did, via the isCopy out-parameter). It's unclear how often popular JVM implementations actually copy.
These are likely a faster way to produce a Diplomat slice than allocating with diplomat_alloc and copying manually.
Furthermore, there is GetStringCritical, which seems more likely to provide a direct pointer, but comes with the restriction that no other JNI calls may be made until the matching ReleaseStringCritical (which means we cannot call callbacks, or hold on to the returned string past the duration of the current JNI call). This is not so useful for optimizing ICU4X large-string APIs like WordSegmenter (since they hold a reference to the data across multiple FFI calls), but it may still be worth investigating.
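
One thing to check before handing GetStringUTFChars output to Rust as a UTF-8 slice: JNI returns modified UTF-8, which matches standard UTF-8 for most text but not for U+0000 or supplementary characters. The sketch below shows the difference from plain Java, without any native code. It relies on DataOutputStream.writeUTF, which writes the same modified UTF-8 encoding behind a 2-byte length prefix. The class and method names (ModifiedUtf8Demo, modifiedUtf8, sameAsStandardUtf8) are hypothetical and are not part of Diplomat or ICU4X.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ModifiedUtf8Demo {
    /**
     * Hypothetical helper: encodes s as modified UTF-8 by way of
     * DataOutputStream.writeUTF, then strips the 2-byte length prefix.
     * writeUTF only accepts strings whose encoded form is at most 65535 bytes.
     */
    static byte[] modifiedUtf8(String s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeUTF(s);
            byte[] all = buf.toByteArray();
            return Arrays.copyOfRange(all, 2, all.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True if the modified UTF-8 bytes are identical to the standard UTF-8 bytes. */
    static boolean sameAsStandardUtf8(String s) {
        return Arrays.equals(modifiedUtf8(s), s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String[] samples = {
            "h\u00e9llo",     // BMP-only text: both encodings agree
            "a\0b",           // U+0000: modified UTF-8 writes 0xC0 0x80 instead of 0x00
            "\uD83D\uDE00"    // U+1F600: modified UTF-8 writes two 3-byte surrogates
        };
        for (String s : samples) {
            System.out.println(
                "modified=" + modifiedUtf8(s).length
                + " standard=" + s.getBytes(StandardCharsets.UTF_8).length
                + " same=" + sameAsStandardUtf8(s));
        }
    }
}
```

If this matters for the strings being passed, GetStringChars (UTF-16) or a validation/transcoding step on the Rust side may be the safer route; which one is cheaper would need measuring.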