Kotlin: Consider using GetStringChars and GetStringUTFChars to avoid string copies over JNI #1005

@Manishearth

@mihnita did some performance investigation in unicode-org/icu4x#7258, and copying buffers across the FFI boundary appears to be one of the costs.

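JNI has GetStringChars and GetStringUTFChars, which expose the contents of a JVM String to native code as UTF-16 and modified UTF-8 respectively. Modified UTF-8 is not the same as standard UTF-8: U+0000 is encoded as two bytes, and supplementary characters are encoded as a surrogate pair of three bytes each. The GetStringUTFChars buffer is null-terminated; for GetStringChars it is safer to take the length from GetStringLength than to rely on a terminator.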

There are similar functions for primitive arrays (the Get<PrimitiveType>ArrayElements family).

Both of these functions may or may not perform a copy (they tell you whether they did, via the isCopy out-parameter). It's unclear how often a copy happens in popular JVM implementations.

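These are likely a faster way to produce a Diplomat slice than allocating with diplomat_alloc and copying manually, as long as the buffer is released with the matching ReleaseStringChars or ReleaseStringUTFChars call once we are done with it.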
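One caveat for the UTF-8 path: the JNI spec defines GetStringUTFChars as returning modified UTF-8, so the bytes are not always valid standard UTF-8 and may not be safe to hand to Rust as a str without checking. The sketch below illustrates the difference from the JVM side. It uses DataOutputStream.writeUTF as a stand-in, since it is documented to write the same modified UTF-8 encoding; the class and method names are made up for illustration and are not part of Diplomat.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of Diplomat or ICU4X.
class ModifiedUtf8Demo {
    // Standard UTF-8, which is what a Rust str requires.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Modified UTF-8 as written by DataOutputStream.writeUTF. This is used
    // here as a stand-in for the bytes GetStringUTFChars would expose.
    // Note: writeUTF rejects strings whose encoding exceeds 65535 bytes.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        // writeUTF prefixes the payload with a 2-byte length; drop it.
        return Arrays.copyOfRange(all, 2, all.length);
    }

    public static void main(String[] args) {
        // ASCII, a BMP character, an embedded NUL, and a supplementary
        // character (U+1F600, written as a surrogate pair).
        String[] samples = {"hello", "h\u00e9llo", "a\0b", "\uD83D\uDE00"};
        for (String s : samples) {
            byte[] std = standardUtf8(s);
            byte[] mod = modifiedUtf8(s);
            System.out.println(std.length + " vs " + mod.length
                    + " bytes, identical=" + Arrays.equals(std, mod));
        }
    }
}
```

For strings without NUL or supplementary characters the two encodings agree, so a zero-copy path would presumably still need either a validation pass or a fallback to the UTF-16 route for inputs such as emoji.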

Furthermore, there is GetStringCritical, which seems more likely to provide a direct pointer, but comes with the restriction that no other JNI calls may be made until the matching ReleaseStringCritical (which means we cannot call callbacks, or hold on to the returned string past the duration of the current JNI call). This is not so useful for optimizing ICU4X large-string APIs like WordSegmenter (since they hold a reference to the data across multiple FFI calls), but it may still be worth investigating.

cc @jcrist1 @sffc
