[FLINK-36235][Stream] Ignore emitting null rowData when deserialized message fails #124
arvindKandpal-ksolves wants to merge 1 commit into
…deserialized message failed
Thanks for opening this pull request! Please check out our contributing guidelines. (https://flink.apache.org/contributing/how-to-contribute.html)
Hi @featzhang, can you review this patch?
The change itself is correct and matches the default
    }

    /** Collector that records every {@link #collect} invocation, including nulls. */
    private static class CountingCollector<T> implements Collector<T> {
Nit: org.apache.flink.api.common.functions.util.ListCollector would do the same job (collect into a List<T>) without introducing a new test helper. Not blocking.
Purpose of the change
This PR fixes FLINK-36235.
Currently, when a message fails to deserialize and the inner schema returns `null`, the `PulsarDeserializationSchemaWrapper` emits this `null` value downstream. This violates Flink's `DeserializationSchema` contract (where a `null` return value means "drop this record") and can cause `NullPointerException`s in downstream operators. This PR adds a simple null-check guard to safely drop these corrupted or null records.
Brief change log
- Added an `if (instance != null)` check before emitting the record in `PulsarDeserializationSchemaWrapper#deserialize`.
- Added the `wrapperDropsNullDeserializedRecord` test and a `CountingCollector` helper class in `PulsarDeserializationSchemaTest` to verify that null records are correctly dropped without errors.
Verifying this change
This change added tests and can be verified as follows:
- Added a test to `PulsarDeserializationSchemaTest.java` to explicitly test and verify the filtering behavior when the inner deserializer returns `null`.
- Run `mvn -pl flink-connector-pulsar test`.
Significant changes
@Public(Evolving))
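For readers who want to see the shape of the guard described in the change log, here is a minimal, self-contained sketch. It is not the actual patch. The `Collector` interface below is a reduced stand-in for `org.apache.flink.util.Collector`, and `InnerDeserializer`, `NullGuardSketch`, and `run` are hypothetical names introduced only for illustration. The sketch also assumes that an empty payload is what makes the inner deserializer return `null`.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class NullGuardSketch {

    /** Reduced stand-in for org.apache.flink.util.Collector (assumption: only collect is needed). */
    interface Collector<T> {
        void collect(T record);
    }

    /** Hypothetical inner deserializer; a null return value means "drop this record". */
    interface InnerDeserializer<T> {
        T deserialize(byte[] message);
    }

    /** Emits the deserialized value only when it is non-null, mirroring the guard in the PR. */
    static <T> void deserialize(InnerDeserializer<T> inner, byte[] message, Collector<T> out) {
        T instance = inner.deserialize(message);
        if (instance != null) {
            out.collect(instance);
        }
    }

    /** Runs the given messages through the guard and returns everything that was emitted. */
    static List<String> run(byte[]... messages) {
        List<String> emitted = new ArrayList<>();
        // Assumption for this sketch: an empty payload cannot be deserialized.
        InnerDeserializer<String> inner =
                bytes -> bytes.length == 0 ? null : new String(bytes, StandardCharsets.UTF_8);
        for (byte[] message : messages) {
            // A list-backed collector, in the spirit of the ListCollector suggestion.
            deserialize(inner, message, emitted::add);
        }
        return emitted;
    }

    public static void main(String[] args) {
        List<String> emitted = run(
                "a".getBytes(StandardCharsets.UTF_8),
                new byte[0],
                "b".getBytes(StandardCharsets.UTF_8));
        System.out.println(emitted); // prints [a, b]
    }
}
```

Collecting into a plain list keeps the assertion simple: the test only needs to check that the list never contains a `null` entry and that valid records still arrive in order.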