[FLINK-38930][checkpoint] Filtering record before processing without spilling strategy by 1996fanrui · Pull Request #27783 · apache/flink

1996fanrui · 2026-03-18T16:44:59Z

This PR depends on #27782

What is the purpose of the change

[FLINK-38930][checkpoint] Filtering record before processing without spilling strategy

Brief change log

Core filtering mechanism for recovered channel state buffers:

- ChannelStateFilteringHandler with per-gate GateFilterHandler
- RecordFilterContext with VirtualChannelRecordFilterFactory
- Partial data check in SequentialChannelStateReaderImpl
- Fix RecordFilterContext for Union downscale scenario

Verifying this change

Tons of unit tests

Does this pull request potentially affect one of the following parts:

Dependencies (does it add or upgrade a dependency): no
The public API, i.e., is any changed class annotated with @Public(Evolving): no
The serializers: no
The runtime per-record code paths (performance sensitive): no
Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
The S3 file system connector: no

Documentation

Does this pull request introduce a new feature? no

flinkbot · 2026-03-18T16:54:46Z

CI report:

997e3a3 Azure: SUCCESS
db2565f UNKNOWN

Bot commands

The @flinkbot bot supports the following commands:

@flinkbot run azure re-run the last Azure build

pnowojski

Thanks! I've left a couple of comments from the first review pass

pnowojski · 2026-03-26T16:53:07Z

.../src/main/java/org/apache/flink/runtime/checkpoint/channel/ChannelStateFilteringHandler.java

+         * Deserializes records from {@code sourceBuffer}, applies the virtual channel's record
+         * filter, and re-serializes the surviving records into new buffers.
+         */
+        List<Buffer> filterAndRewrite(


Could you reorder the methods in this class? Public methods first, then private ones either below all the public methods or below their first usage.

pnowojski · 2026-03-26T16:54:30Z

.../src/main/java/org/apache/flink/runtime/checkpoint/channel/ChannelStateFilteringHandler.java

+    /**
+     * Filters a recovered buffer from the specified virtual channel, returning new buffers
+     * containing only the records that belong to the current subtask.
+     *
+     * @return filtered buffers, possibly empty if all records were filtered out.
+     */
+    public List<Buffer> filterAndRewrite(
+            int gateIndex,
+            int oldSubtaskIndex,
+            int oldChannelIndex,
+            Buffer sourceBuffer,
+            BufferSupplier bufferSupplier)


Why does it return a List from a single sourceBuffer? Could you explain this in the Javadoc? And how many Buffers can that be? If a lot, shouldn't this be an Iterator?

pnowojski · 2026-03-26T16:58:17Z

.../src/main/java/org/apache/flink/runtime/checkpoint/channel/RecoveredChannelStateHandler.java

+        // Extra retain: filterAndRewrite consumes one ref, caller's finally releases another.
+        buffer.retainBuffer();


nit: I think it would be slightly cleaner to call buffer.retainBuffer from the outside; the contract would then be that this method always takes over ownership of the buffer.

pnowojski · 2026-03-26T17:01:37Z

.../src/main/java/org/apache/flink/runtime/checkpoint/channel/RecoveredChannelStateHandler.java

+        } catch (Throwable t) {
+            // filterAndRewrite didn't consume the buffer, release the extra ref.
+            buffer.recycleBuffer();
+            throw t;
+        }


Hmm, that's a bit strange? It sounds like it's not clear who the owner of this buffer is. There should be a clear owner that's always responsible for cleaning up, no matter what.
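To illustrate the single-owner contract suggested in the two comments above, here is a minimal sketch. `RefCountedBuffer` and `OwnershipSketch` are hypothetical stand-ins written for this example, not Flink's `Buffer` or the code in this PR. The assumption is that the callee takes over exactly one reference and releases it on every path, so the caller needs no catch block to undo an extra retain.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a reference-counted network buffer. */
final class RefCountedBuffer {
    private final AtomicInteger refCount = new AtomicInteger(1);

    RefCountedBuffer retain() {
        refCount.incrementAndGet();
        return this;
    }

    void recycle() {
        if (refCount.decrementAndGet() < 0) {
            throw new IllegalStateException("buffer recycled too many times");
        }
    }

    int refCount() {
        return refCount.get();
    }
}

final class OwnershipSketch {
    /**
     * Contract: takes over one reference to {@code buffer} and releases it
     * on every path, whether filtering succeeds or throws.
     */
    static int filterTakingOwnership(RefCountedBuffer buffer, boolean fail) {
        try {
            if (fail) {
                throw new IllegalStateException("simulated filtering failure");
            }
            return 1; // pretend one record survived the filter
        } finally {
            buffer.recycle(); // the callee is the single owner of this reference
        }
    }
}
```

A caller that still needs the buffer afterwards would pass `buffer.retain()` and keep its own reference, which it releases in its own `finally` as before.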

pnowojski · 2026-03-26T17:40:01Z

...untime/src/main/java/org/apache/flink/streaming/runtime/io/recovery/RecordFilterContext.java

+    /**
+     * Checks whether unaligned checkpoint during recovery is enabled.
+     *
+     * @return {@code true} if enabled, {@code false} otherwise.
+     */
+    public boolean isUnalignedDuringRecoveryEnabled() {
+        return unalignedDuringRecoveryEnabled;
+    }


Rename to isCheckpointingDuringRecoveryEnabled? And adjust the Javadoc:

Checks whether unaligned checkpointING during recovery is enabled.

pnowojski · 2026-03-26T17:41:12Z

.../main/java/org/apache/flink/runtime/checkpoint/channel/SequentialChannelStateReaderImpl.java

+            // Clean up filtering handler resources (e.g., temp files from
+            // SpillingAdaptiveSpanningRecordDeserializer) on both success and error paths
+            if (filteringHandler != null) {
+                filteringHandler.clear();


nit: make it closeable?
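A minimal sketch of what the suggestion could look like, assuming the handler implements `AutoCloseable` and the reader uses try-with-resources. `FilteringHandlerSketch` and `ReaderSketch` are hypothetical names for this example, not the classes in this PR.

```java
/** Hypothetical handler whose cleanup lives in close() instead of clear(). */
final class FilteringHandlerSketch implements AutoCloseable {
    private boolean closed;

    void process(boolean fail) {
        if (fail) {
            throw new IllegalStateException("simulated read failure");
        }
    }

    @Override
    public void close() {
        closed = true; // e.g. delete temp files held by deserializers
    }

    boolean isClosed() {
        return closed;
    }
}

final class ReaderSketch {
    /** Returns the handler so the example can inspect whether it was closed. */
    static FilteringHandlerSketch readWith(boolean fail) {
        FilteringHandlerSketch handler = new FilteringHandlerSketch();
        try (FilteringHandlerSketch h = handler) {
            h.process(fail);
        } catch (IllegalStateException e) {
            // Swallowed only so the sketch can return the handler for inspection.
        }
        return handler;
    }
}
```

try-with-resources skips `close()` for a null resource, so the explicit null check would no longer be needed in that form.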

pnowojski · 2026-03-26T17:51:51Z

.../src/main/java/org/apache/flink/runtime/checkpoint/channel/ChannelStateFilteringHandler.java

+            List<StreamElement> filteredElements = new ArrayList<>();
+
+            while (true) {
+                DeserializationResult result = vc.getNextRecord(deserializationDelegate);
+                if (result.isFullRecord()) {
+                    filteredElements.add(deserializationDelegate.getInstance());
+                }
+                if (result.isBufferConsumed()) {
+                    break;
+                }
+            }
+
+            return serializeToBuffers(filteredElements, bufferSupplier);


Ditto about the List in List<StreamElement> filteredElements. It would be safer to be iterative. The current implementation risks OOMs if the deserialised records use more memory than the serialised records. This is not very common, but it could happen.
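As a rough illustration of the iterative approach, the sketch below filters lazily so that at most one surviving record is held at a time. `FilteringIterator` is a hypothetical generic helper for this example. It does not use Flink's deserializer API, and the real code would still need to handle records spanning buffers.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

/** Lazily yields only the elements accepted by the filter. */
final class FilteringIterator<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final Predicate<T> keep;
    private T next;
    private boolean hasNext;

    FilteringIterator(Iterator<T> source, Predicate<T> keep) {
        this.source = source;
        this.keep = keep;
    }

    @Override
    public boolean hasNext() {
        // Advance the source until one element passes the filter.
        while (!hasNext && source.hasNext()) {
            T candidate = source.next();
            if (keep.test(candidate)) {
                next = candidate;
                hasNext = true;
            }
        }
        return hasNext;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        hasNext = false;
        T result = next;
        next = null; // drop the reference so the element can be collected
        return result;
    }
}
```

A consumer can then serialize each element as it is pulled, instead of first materializing every deserialized record in a list.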

pnowojski · 2026-03-26T17:53:28Z

.../src/main/java/org/apache/flink/runtime/checkpoint/channel/ChannelStateFilteringHandler.java

+                        resultBuffers.add(currentBuffer.retainBuffer());
+                    }
+                    currentBuffer.recycleBuffer();
+                    currentBuffer = bufferSupplier.requestBufferBlocking();


Is it safe to block here? 🤔 Can this lead to deadlocks? I think we were discussing this, but AFAIR this code works differently to what we were discussing offline (either using unpooled buffer or create two different pools, or filter records in-place without requesting new buffer)?

1996fanrui · 2026-03-26

Good catch!

This is addressed in a follow-up commit in https://github.com/apache/flink/pull/27639/commits (FLINK-38544, f031ddf), which falls back to a heap buffer when the buffer pool is insufficient.
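The general shape of such a fallback can be sketched as follows. This is an illustration of the idea only, not the code from the follow-up commit: `TinyPool` and `HeapFallbackSketch` are hypothetical, and the sketch assumes a non-blocking request that returns null when the pool is exhausted.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical fixed-size pool of direct buffers; not Flink's BufferPool. */
final class TinyPool {
    private final Deque<ByteBuffer> free = new ArrayDeque<>();

    TinyPool(int buffers, int size) {
        for (int i = 0; i < buffers; i++) {
            free.add(ByteBuffer.allocateDirect(size));
        }
    }

    /** Non-blocking: returns null when the pool is exhausted. */
    ByteBuffer tryRequest() {
        return free.poll();
    }
}

final class HeapFallbackSketch {
    /** Never blocks: uses a pooled buffer if available, else a heap buffer. */
    static ByteBuffer requestOrFallback(TinyPool pool, int size) {
        ByteBuffer pooled = pool.tryRequest();
        return pooled != null ? pooled : ByteBuffer.allocate(size);
    }
}
```

The trade-off is that heap allocations are not bounded by the pool, so this exchanges the deadlock risk for extra memory pressure while the pool is exhausted.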


1996fanrui force-pushed the 38930/filtering-record branch from 2b06750 to 997e3a3 on March 20, 2026 08:48

pnowojski reviewed Mar 26, 2026


1996fanrui force-pushed the 38930/filtering-record branch from 997e3a3 to db2565f on March 26, 2026 21:38

1996fanrui wants to merge 1 commit into apache:master from 1996fanrui:38930/filtering-record
