[Client] Add Pojo Support #1992

polyzos · 2025-11-17T14:41:35Z

This addresses #1731

polyzos · 2025-11-18T07:45:12Z

FYI @wuchong: this PR adds support for writing/scanning Pojos directly with the client, while keeping the API as is.
The only difference is that it is now typed.

When I find some time, I want to compare the performance of writing/scanning with Pojos versus InternalRows.

fluss-client/src/main/java/org/apache/fluss/client/converter/PojoType.java

fluss-client/src/main/java/org/apache/fluss/client/table/scanner/log/TypedLogScanner.java

polyzos · 2025-11-25T14:05:30Z

@leekeiabstraction Indeed, we are looking at a roughly 2x performance penalty.
I tested writing/scanning on both PK and Log tables with 10 million records.
I ran 5 iterations for each and calculated the average.

This is basically a trade-off for the users. Using InternalRow/GenericRow directly is far more efficient; however, it might come with some extra complexity and boilerplate code.

For this reason, I want to offer flexibility: probably leave the docs as is, with GenericRow being the go-to approach, but also add a section explaining that Pojos can be used directly, and maybe highlight this trade-off.

Moving forward, I think it may make sense to add some helper classes that also derive the table schema from a Pojo.
[Benchmark results for the Log Table and the Primary Key Table were attached here; not preserved.]
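
The helper idea mentioned above (deriving a table schema from a Pojo) could be approached with reflection. The following is only a rough sketch under stated assumptions: `OrderPojo`, `PojoSchemaSketch`, `deriveColumns` and the type labels are hypothetical names chosen for illustration and are not part of the Fluss client API.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical POJO used only for this illustration.
final class OrderPojo {
    int id;
    String name;
    double amount;
}

// Hypothetical helper; not part of the Fluss client API.
final class PojoSchemaSketch {
    private PojoSchemaSketch() {}

    // Maps each instance field of the POJO to a column name and a type label.
    static Map<String, String> deriveColumns(Class<?> pojoClass) {
        Map<String, String> columns = new LinkedHashMap<>();
        // Note: the reflection API does not guarantee any particular field order.
        for (Field field : pojoClass.getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue; // skip constants and compiler-generated fields
            }
            columns.put(field.getName(), typeLabel(field.getType()));
        }
        return columns;
    }

    // Assumed mapping from Java types to type labels; purely illustrative.
    private static String typeLabel(Class<?> type) {
        if (type == int.class || type == Integer.class) return "INT";
        if (type == long.class || type == Long.class) return "BIGINT";
        if (type == double.class || type == Double.class) return "DOUBLE";
        if (type == boolean.class || type == Boolean.class) return "BOOLEAN";
        if (type == String.class) return "STRING";
        throw new IllegalArgumentException("Unsupported field type: " + type.getName());
    }
}
```

A real helper would also need to decide how primary keys, nullability and nested types are declared, which this sketch leaves out.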

leekeiabstraction · 2025-11-25T14:46:23Z

Thank you @polyzos for addressing the comments and also providing the data from your tests! That gives a clear picture of the performance. Further responses and questions below:

> This is basically a trade-off for the users. Using InternalRow/GenericRow directly is far more efficient; however, it might come with some extra complexity and boilerplate code.
> For this reason, I want to offer flexibility: probably leave the docs as is, with GenericRow being the go-to approach, but also add a section explaining that Pojos can be used directly, and maybe highlight this trade-off.

IMO, this does not need to be a trade-off. I am curious if you have explored implementation complexity or cons around pushing the Pojo conversion down further so that conversion to/from InternalRow can be skipped altogether? From a quick look, it does seem like you have most of the interfaces updated to support it.

> Moving forward, I think it may make sense to add some helper classes that also derive the table schema from a Pojo.

Trying to understand: does this relate to performance, or is it a separate thread on further changes that you're planning?

polyzos · 2025-11-25T14:54:18Z

@leekeiabstraction yes, it's a separate thread.
Can you provide some more context on what you mean by "push the conversion further down"?
On the scan side, I see you mentioned an optimization in CompletedFetch.toScanRecord(LogRecord); that is indeed something I didn't think of, and it might result in some further optimization.
Is there also something on the write side you are thinking of?

leekeiabstraction · 2025-11-25T15:06:41Z

> Can you provide some more context on what you mean by "push the conversion further down"?
> On the scan side, I see you mentioned an optimization in CompletedFetch.toScanRecord(LogRecord); that is indeed something I didn't think of, and it might result in some further optimization.

My aim is to eliminate the performance penalty by avoiding performing the conversion twice. By "pushing conversion down", I mean you can do something like ~~moving the RowToPojoConverter method calls into LogScannerImpl/CompletedFetch.~~ refactoring LogScannerImpl so that conversion is done directly to the Pojo. The strategy pattern could be useful here.
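
One way to read the suggestion above: the scanner core takes a mapping strategy and applies it once per fetched record, so the typed path does not build an intermediate row first. This is a minimal sketch under stated assumptions; `RawRecord`, `RecordMapper`, `ScannerCore` and `User` are hypothetical names for illustration, not Fluss classes, and the real LogScannerImpl/CompletedFetch internals are not shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a decoded log record; not a Fluss class.
final class RawRecord {
    final long offset;
    final Object[] fields;

    RawRecord(long offset, Object[] fields) {
        this.offset = offset;
        this.fields = fields;
    }
}

// Strategy: how a raw record becomes the value handed to the caller.
interface RecordMapper<T> {
    T map(RawRecord record);
}

// Hypothetical scanner core, parameterized by the mapping strategy.
final class ScannerCore<T> {
    private final RecordMapper<T> mapper;

    ScannerCore(RecordMapper<T> mapper) {
        this.mapper = mapper;
    }

    // Applies the strategy exactly once per fetched record.
    List<T> drain(List<RawRecord> fetched) {
        List<T> out = new ArrayList<>(fetched.size());
        for (RawRecord record : fetched) {
            out.add(mapper.map(record));
        }
        return out;
    }
}

// Hypothetical POJO target for the typed path.
final class User {
    final int id;
    final String name;

    User(int id, String name) {
        this.id = id;
        this.name = name;
    }
}
```

With this shape, an untyped scanner would pass a mapper that returns the record's row view, while a typed scanner would pass a mapper such as `r -> new User((Integer) r.fields[0], (String) r.fields[1])`.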

> Is there also something on the write side you are thinking of?

Currently not, I'm very new to this code base but it's certainly worth exploring especially if we're also converting twice on the write side. Happy to have a look at the write side as well.

fluss-client/src/main/java/org/apache/fluss/client/table/writer/TypedAppendWriter.java

fluss-client/src/main/java/org/apache/fluss/client/table/writer/TypedUpsertWriter.java

wuchong

Thanks, @polyzos!

Overall, the pull request looks good. My only concern is the interface change. Could you please take a look at the new proposal?

Also, when you update the PR, please rebase your branch to keep the history clean.

fluss-client/src/main/java/org/apache/fluss/client/lookup/TypedLookuper.java

fluss-client/src/main/java/org/apache/fluss/client/lookup/Lookup.java

fluss-client/src/main/java/org/apache/fluss/client/table/scanner/Scan.java

fluss-client/src/main/java/org/apache/fluss/client/table/writer/Append.java

fluss-client/src/main/java/org/apache/fluss/client/table/writer/Upsert.java

polyzos · 2025-12-21T15:41:32Z

@wuchong thank you for all your comments.
I have already introduced typed classes, like

https://github.com/apache/fluss/pull/1992/changes#diff-463b3a6796c0042a681c569845f0abba145f454e7e40df447326bc212b544304

If you check the tests, I have also ensured backward compatibility. Do you mean something different, or am I missing something?
I have also tested things locally to make sure nothing breaks.

wuchong · 2025-12-22T03:31:44Z

@polyzos Yeah, I noticed those typed classes. However, the typed classes are internal; only interfaces are visible to users.

My suggestion is to introduce dedicated typed interfaces (e.g., interface TypedLookuper<T>) that are separate from the existing Lookuper interface, rather than making Lookuper itself generic.

While turning Lookuper into a generic interface would be binary-compatible, it would introduce type erasure warnings in IDEs and clutter the public API. More importantly, for use cases involving InternalRow, the generic type T has no semantic meaning, so forcing a type parameter there adds unnecessary noise without benefit.

By keeping Lookuper and TypedLookuper<T> as distinct interfaces, we maintain clean separation of concerns:

- Lookuper for low-level, type-agnostic access (e.g., InternalRow)
- TypedLookuper<T> for high-level, type-safe lookups

This approach preserves backward compatibility, avoids IDE warnings, and aligns with how users actually interact with the API.
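
The separation described above can be sketched as two independent interfaces plus an adapter that converts the typed key and delegates. This is an illustrative sketch only: `Row`, the method signatures and `TypedLookuperAdapter` are simplified assumptions and do not reproduce the actual Fluss interfaces.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical row stand-in for illustration; not Fluss's InternalRow.
final class Row {
    final Object[] fields;

    Row(Object[] fields) {
        this.fields = fields;
    }
}

// Low-level, type-agnostic lookup: stays non-generic, so existing callers are untouched.
interface Lookuper {
    CompletableFuture<Row> lookup(Row key);
}

// High-level lookup keyed by a user type; a separate interface, not a generic Lookuper.
interface TypedLookuper<K> {
    CompletableFuture<Row> lookup(K key);
}

// Hypothetical adapter: converts the typed key to a row and delegates.
final class TypedLookuperAdapter<K> implements TypedLookuper<K> {
    private final Lookuper delegate;
    private final Function<K, Row> keyToRow;

    TypedLookuperAdapter(Lookuper delegate, Function<K, Row> keyToRow) {
        this.delegate = delegate;
        this.keyToRow = keyToRow;
    }

    @Override
    public CompletableFuture<Row> lookup(K key) {
        return delegate.lookup(keyToRow.apply(key));
    }
}
```

Because `Lookuper` carries no type parameter, code that works with rows never has to spell out a meaningless type argument, while typed callers opt in through the second interface.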

# Conflicts: # fluss-client/src/main/java/org/apache/fluss/client/lookup/Lookuper.java # fluss-client/src/main/java/org/apache/fluss/client/lookup/PrefixKeyLookuper.java # fluss-client/src/main/java/org/apache/fluss/client/lookup/PrimaryKeyLookuper.java

# Conflicts: # fluss-flink/fluss-flink-common/src/test/java/org/apache/fluss/flink/utils/FlussRowToFlinkRowConverterTest.java

polyzos · 2025-12-22T10:38:03Z

@wuchong I made the required changes.
Let me know if this approach resonates and works better.

wuchong · 2025-12-23T06:08:33Z

Thank you @polyzos , I will take another look.

wuchong

Thanks @polyzos, I think this PR is already in good shape. I left some minor comments.

fluss-client/src/main/java/org/apache/fluss/client/table/writer/TableAppend.java

wuchong · 2025-12-23T11:27:28Z

fluss-client/src/main/java/org/apache/fluss/client/table/writer/TableUpsert.java

+    @Override
+    public <T> TypedUpsertWriter<T> createTypedWriter(Class<T> pojoClass) {
+        UpsertWriterImpl delegate =
+                new UpsertWriterImpl(tablePath, tableInfo, targetColumns, writerClient);


This can be simplified to just call createWriter().
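
A minimal sketch of that simplification, assuming the typed writer only needs an untyped writer to delegate to. All names here (`TableUpsertSketch`, the one-method writer interfaces, the `toRow` converter parameter) are hypothetical; the real method takes a `Class<T>` and builds its converter internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical, simplified writer interfaces for illustration only.
interface UpsertWriter {
    void upsert(Object[] row);
}

interface TypedUpsertWriter<T> {
    void upsert(T value);
}

final class TableUpsertSketch {
    // Stand-in for the write path; collects rows so the effect is observable.
    final List<Object[]> sink = new ArrayList<>();

    // Single place where the untyped writer is constructed.
    UpsertWriter createWriter() {
        return sink::add;
    }

    // Reuses createWriter() instead of repeating the constructor call.
    <T> TypedUpsertWriter<T> createTypedWriter(Function<T, Object[]> toRow) {
        UpsertWriter delegate = createWriter();
        return value -> delegate.upsert(toRow.apply(value));
    }
}
```

Keeping construction in one method means later changes to the writer's constructor arguments only have to be made once.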

wuchong · 2025-12-23T11:29:46Z

fluss-client/src/main/java/org/apache/fluss/client/table/writer/TypedUpsertWriterImpl.java

+    private final Class<T> pojoClass;
+    private final TableInfo tableInfo;
+    private final RowType tableSchema;
+    private final int[] targetColumns; // may be null


We can add a @Nullable annotation to indicate that it is nullable.

wuchong · 2025-12-23T11:30:00Z

fluss-client/src/main/java/org/apache/fluss/client/table/writer/TypedUpsertWriterImpl.java

+        delegate.close();
+    }
+
+    private final Class<T> pojoClass;


Not used, can be removed.

fluss-client/src/test/java/org/apache/fluss/client/table/FlussTypedClientITCase.java

wuchong · 2025-12-23T11:48:54Z

fluss-client/src/test/java/org/apache/fluss/client/table/FlussTypedClientITCase.java

+
+            LookupResult lr = lookuper.lookup(new PLookupKey(1)).get();
+            AllTypesPojo one = rowConv.fromRow(lr.getSingletonRow());
+            assertThat(one.str).isEqualTo("s1");


assertThat(one).isEqualTo(newAllTypesPojo(1));

After adding equals and hashCode methods to the AllTypesPojo class, we can simply assert on the full record; this checks the deserialization of the entire POJO.
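
For illustration, value-based equality on a test POJO could look like the sketch below. `SamplePojo` and its three fields are hypothetical; the actual AllTypesPojo has more fields and is not reproduced here.

```java
import java.math.BigDecimal;
import java.util.Objects;

// Hypothetical test POJO; stands in for a class like AllTypesPojo.
final class SamplePojo {
    final int id;
    final String str;
    final BigDecimal dec;

    SamplePojo(int id, String str, BigDecimal dec) {
        this.id = id;
        this.str = str;
        this.dec = dec;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof SamplePojo)) {
            return false;
        }
        SamplePojo other = (SamplePojo) o;
        // BigDecimal.equals also compares scale, so 1.1 and 1.10 are not equal here.
        return id == other.id
                && Objects.equals(str, other.str)
                && Objects.equals(dec, other.dec);
    }

    @Override
    public int hashCode() {
        // Uses the same fields as equals to keep the two methods consistent.
        return Objects.hash(id, str, dec);
    }
}
```

With these methods in place, a single `isEqualTo` assertion compares every field, so a field that is deserialized incorrectly fails the test even if nobody wrote a dedicated assertion for it.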

wuchong · 2025-12-23T11:52:42Z

fluss-client/src/test/java/org/apache/fluss/client/table/FlussTypedClientITCase.java

+            AllTypesPojo lookedUp =
+                    rowConv.fromRow(lookuper.lookup(new PLookupKey(1)).get().getSingletonRow());
+            assertThat(lookedUp.str).isEqualTo("second");
+            assertThat(lookedUp.dec).isEqualByComparingTo("99.99");


In order to test the partial update feature, we should also assert here that the other fields are kept unchanged.

wuchong · 2025-12-23T11:52:48Z

fluss-client/src/test/java/org/apache/fluss/client/table/FlussTypedClientITCase.java

+                TypedScanRecords<AllTypesPojo> recs = scanner.poll(Duration.ofSeconds(2));
+                for (TypedScanRecord<AllTypesPojo> r : recs) {
+                    if (r.getChangeType() == ChangeType.UPDATE_AFTER) {
+                        assertThat(r.getValue().str).isEqualTo("second");


wuchong · 2025-12-23T11:55:24Z

fluss-client/src/main/java/org/apache/fluss/client/table/scanner/log/ScanRecords.java

+    public static final ScanRecords empty() {
+        return new ScanRecords(Collections.emptyMap());
+    }


Not necessary? I still prefer the previous implementation because it can avoid some small object overhead (GC).
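
The allocation concern can be illustrated by contrasting a shared immutable instance with a factory that allocates on every call. This sketch assumes the previous implementation exposed a shared constant, which the thread does not show; `RecordsSketch`, `EMPTY` and `emptyPerCall` are hypothetical names.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical container standing in for a class like ScanRecords.
final class RecordsSketch {
    // Shared instance: allocated once, safe to reuse because the map is immutable.
    static final RecordsSketch EMPTY =
            new RecordsSketch(Collections.<String, List<String>>emptyMap());

    private final Map<String, List<String>> recordsByBucket;

    RecordsSketch(Map<String, List<String>> recordsByBucket) {
        this.recordsByBucket = recordsByBucket;
    }

    // Factory style: creates a new wrapper object on every call.
    static RecordsSketch emptyPerCall() {
        return new RecordsSketch(Collections.<String, List<String>>emptyMap());
    }

    boolean isEmpty() {
        return recordsByBucket.isEmpty();
    }
}
```

In a polling loop that frequently returns no data, the shared constant avoids creating a short-lived wrapper per poll; whether that matters in practice would need measuring.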

wuchong · 2025-12-23T11:58:16Z

fluss-client/src/main/java/org/apache/fluss/client/table/writer/TableWriter.java

+    default void close() throws Exception {
+        // by default do nothing
+    }


It seems the newly introduced close() method doesn’t currently have any meaningful work to perform, as the writers don’t hold any resources at this stage. I suggest holding off on introducing it for now.

Typically, a close() method should either flush pending records or ensure all previously submitted requests have completed; otherwise, it may mislead users into expecting cleanup or finalization behavior that isn’t actually implemented. This also adds some complexity to this PR.
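
As an illustration of the contract described above, here is a sketch of a writer whose close() does real work by flushing buffered records. `BufferingWriter` and its in-memory `sent` list are hypothetical and unrelated to how Fluss writers are implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical writer that buffers records and delivers them on flush.
final class BufferingWriter implements AutoCloseable {
    private final List<String> pending = new ArrayList<>();
    private final List<String> sent; // stand-in for the remote side
    private boolean closed;

    BufferingWriter(List<String> sent) {
        this.sent = sent;
    }

    void append(String record) {
        if (closed) {
            throw new IllegalStateException("writer is closed");
        }
        pending.add(record);
    }

    // Delivers everything buffered so far.
    void flush() {
        sent.addAll(pending);
        pending.clear();
    }

    // Meaningful close: nothing appended before close() is lost.
    @Override
    public void close() {
        if (!closed) {
            flush();
            closed = true;
        }
    }
}
```

A no-op default close() offers none of these guarantees, which is the reason given for deferring it until the writers hold resources or pending state.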

polyzos marked this pull request as ready for review November 18, 2025 07:43

leekeiabstraction reviewed Nov 25, 2025

fluss-client/src/main/java/org/apache/fluss/client/converter/PojoType.java (outdated, resolved)

fluss-client/src/main/java/org/apache/fluss/client/table/scanner/log/TypedLogScanner.java (outdated, resolved)

polyzos force-pushed the java-client-pojo-support branch from d6722e7 to 24b65c2 Compare November 25, 2025 14:14

leekeiabstraction reviewed Nov 25, 2025

fluss-client/src/main/java/org/apache/fluss/client/table/writer/TypedAppendWriter.java (outdated, resolved)

fluss-client/src/main/java/org/apache/fluss/client/table/writer/TypedUpsertWriter.java (outdated, resolved)

polyzos force-pushed the java-client-pojo-support branch from b955ff5 to a274f5a Compare November 28, 2025 08:52

wuchong linked an issue Dec 21, 2025 that may be closed by this pull request

[Client] Java Client Add Pojo Support #1731

Closed

2 tasks

wuchong reviewed Dec 21, 2025

polyzos added 16 commits December 22, 2025 08:50

fix checkstyle violation

dceb71e

fix checkstyle violation

47e50f8

update tests

a24c32a

add end2end tests

7dc648d

add required parameterized types

4dc25d6

fix checkstyle violation

f291a0f

add missing types to flink module

8297380

patch tests

2a8a2b5

# Conflicts: # fluss-flink/fluss-flink-common/src/test/java/org/apache/fluss/flink/utils/FlussRowToFlinkRowConverterTest.java

improve test coverage

eabb51d

fix checkstyle violation

367461d

instantiate converters once

e06df33

Introduce Generics and Typed Classes

040fa90

fix checkstyle violation

f07a84b

fix checkstyle violation

a7cad6d

update tests

b22defd

polyzos added 10 commits December 22, 2025 08:52

add end2end tests

950ed3b

add required parameterized types

1fdbdfe

fix checkstyle violation

5393618

patch tests

b579655

improve test coverage

3d0e607

fix checkstyle violation

5f9a2b1

instantiate converters once

421a70e

revert message

5a891c0

fix test

e063083

fix checkstyle

3ea717c

polyzos force-pushed the java-client-pojo-support branch from f79319a to 3ea717c Compare December 22, 2025 07:06

polyzos added 6 commits December 22, 2025 10:11

refactor to typed apis

7dc4879

add TypedScanRecords

8b8a33d

revert unneeded changes

1f2e546

address comment for the lookuper

c1975b3

delete unneeded test

084cc4c

make the api consistent

604e6f6

wuchong reviewed Dec 23, 2025

address comments

99b0d06

wuchong approved these changes Dec 23, 2025

wuchong merged commit f731fc6 into apache:main Dec 23, 2025
5 checks passed

polyzos mentioned this pull request Dec 23, 2025

[Docs] Document the TypedAPI #2230

Open

2 tasks

Prajwal-banakar mentioned this pull request Dec 25, 2025

[Docs] Document the TypedAPI #2230 #2261

Open

3 participants
