feat(java): row encoder supports custom types and collections #2243
What does this PR do?
Extend Java Row Format to allow registering custom datatypes (e.g. `UUID` as Int128) and collection factories (e.g. `SortedSet<UUID>` as `new TreeSet<UUID>(customComparator)`). Additionally supports arrays of custom types, e.g. `UUID[]`.
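To make the two extension points concrete, here is a minimal, self-contained sketch of what a UUID-as-128-bit codec and a sorted-set factory might look like. The class and method names (`UuidCodecSketch`, `encode`, `decode`, `newSortedSet`) and the big-endian byte layout are assumptions for illustration only; they are not the interfaces or the encoding added by this PR.

```java
import java.nio.ByteBuffer;
import java.util.Comparator;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.UUID;

// Hypothetical illustration only: these names are NOT the fury-format API.
final class UuidCodecSketch {
    private UuidCodecSketch() {}

    // Encode a UUID as 16 bytes, most significant long first
    // (ByteBuffer defaults to big-endian order).
    static byte[] encode(UUID value) {
        return ByteBuffer.allocate(16)
                .putLong(value.getMostSignificantBits())
                .putLong(value.getLeastSignificantBits())
                .array();
    }

    // Decode the 16-byte representation back into a UUID.
    static UUID decode(byte[] bytes) {
        if (bytes.length != 16) {
            throw new IllegalArgumentException("expected 16 bytes, got " + bytes.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        return new UUID(buf.getLong(), buf.getLong());
    }

    // A custom comparator: orders UUIDs as unsigned 128-bit integers,
    // unlike UUID.compareTo, which compares the two longs as signed values.
    static final Comparator<UUID> UNSIGNED_ORDER = (a, b) -> {
        int hi = Long.compareUnsigned(a.getMostSignificantBits(), b.getMostSignificantBits());
        return hi != 0
                ? hi
                : Long.compareUnsigned(a.getLeastSignificantBits(), b.getLeastSignificantBits());
    };

    // Collection factory: a decoder would call this instead of picking
    // a default SortedSet implementation with natural ordering.
    static SortedSet<UUID> newSortedSet() {
        return new TreeSet<>(UNSIGNED_ORDER);
    }
}
```

A collection factory matters here because a comparator cannot be recovered from the serialized elements alone; the decoder has to be told how to construct the target collection.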
Since the type inference is in `fury-core` but I wanted to keep new features scoped to `fury-format`, I had to add a small plugin interface to core so that format can add types dynamically without affecting existing core behavior.

Related issues
#2208
Does this PR introduce any user-facing change?
The `Encoders` class has new `registerCustomCodec` and `registerCustomCollectionFactory` methods.

All custom types are written with the existing protocol as embedded memory buffers just like any other field, but with a custom byte representation, so there should be no wire compatibility concerns.
Benchmark
There should be no change to performance in existing use cases. The code is carefully written to have no runtime impact if the feature is not used. Custom types are invoked via static methods or instance methods on static final fields, which should be easily inlined by the JIT for minimal overhead.
Here is an example of the generated code to illustrate this:
https://gist.github.com/stevenschlansker/ed7dae863e78d3c87e30bdea39fa8dea
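For readers who do not want to open the gist, the dispatch pattern described above can be sketched roughly as follows. This is a hand-written approximation, not the actual generated code: `Codec`, `UuidCodec`, and `GeneratedEncoder` are hypothetical names. The point is only that the codec lives in a `static final` field, which HotSpot typically treats as a constant once the class is initialized, so the call target is known and can usually be inlined.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical sketch of the dispatch pattern; not the real generated code.
final class InliningSketch {
    private InliningSketch() {}

    // Hypothetical codec interface (assumed shape, not the fury-format one).
    interface Codec<T> {
        byte[] encode(T value);

        T decode(byte[] bytes);
    }

    // A final implementation class, so the receiver type is exact.
    static final class UuidCodec implements Codec<UUID> {
        @Override
        public byte[] encode(UUID value) {
            return ByteBuffer.allocate(16)
                    .putLong(value.getMostSignificantBits())
                    .putLong(value.getLeastSignificantBits())
                    .array();
        }

        @Override
        public UUID decode(byte[] bytes) {
            ByteBuffer buf = ByteBuffer.wrap(bytes);
            return new UUID(buf.getLong(), buf.getLong());
        }
    }

    // Stand-in for a generated row encoder. The codec is held in a
    // static final field, so there is no per-call registry lookup.
    static final class GeneratedEncoder {
        private static final Codec<UUID> ID_CODEC = new UuidCodec();

        private GeneratedEncoder() {}

        static byte[] encodeId(UUID id) {
            return ID_CODEC.encode(id);
        }

        static UUID decodeId(byte[] bytes) {
            return ID_CODEC.decode(bytes);
        }
    }
}
```

Encoders for types without custom codecs would simply not contain such a field or call, which is consistent with the claim that existing use cases are unaffected.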