Commit c6bbf2e

Document Compact Serde caveat with pipelines [CTT-676]
There are known caveats to Compact Serialization used in Jet Pipelines; this PR updates documentation to address these. Fixes https://hazelcast.atlassian.net/browse/CTT-676
1 parent 6a541c1 commit c6bbf2e

2 files changed: 81 additions, 0 deletions

docs/modules/pipelines/pages/serialization.adoc

Lines changed: 75 additions & 0 deletions
@@ -134,6 +134,81 @@ src.mapUsingService(serviceFactory,
    (formatter, tstamp) -> formatter.format(Instant.ofEpochMilli(tstamp)));
```

== Using Compact Serialization with Pipelines

xref:serialization:compact-serialization.adoc[Compact serialization] provides an efficient,
schema-based serialization mechanism for your data objects. While Compact serialization
has the highest priority in Hazelcast's serialization service, there are important caveats
to understand when using it with Jet pipelines.

=== Java Serializable Takes Precedence for Lambdas and Functions

Pipeline definitions, including lambda expressions and function objects, are serialized
using standard Java serialization (`java.io.Serializable`). This is because the pipeline
definition itself must be sent to cluster members before the Hazelcast serialization
service is involved.

When an object implements `Serializable`, Java serialization is used directly and does
not delegate to Hazelcast's serialization service. This means:

* All non-transient fields of a `Serializable` object must also be `Serializable`
* Compact serializers registered for field types are not used during Java serialization
* This applies even if the field's class has a registered `CompactSerializer`

=== Entry Processors and Captured Variables

This behavior is particularly relevant when using `Sinks.mapWithEntryProcessor()`.
The `EntryProcessor` interface extends `Serializable`, so any custom entry processor
and all its fields must be Java serializable.

Consider this example where `OrderStatus` has a registered Compact serializer:

```java
import com.hazelcast.map.EntryProcessor;
import java.util.Map.Entry;

// OrderStatus has a CompactSerializer registered, but serializing this class
// still fails unless OrderStatus is also Serializable, because
// MergeEntryProcessor implements Serializable (via EntryProcessor)
public class MergeEntryProcessor implements EntryProcessor<String, Order, Order> {
    // This field must be Serializable, even though OrderStatus has a CompactSerializer
    private final OrderStatus newStatus;

    public MergeEntryProcessor(OrderStatus newStatus) {
        this.newStatus = newStatus;
    }

    @Override
    public Order process(Entry<String, Order> entry) {
        Order order = entry.getValue();
        order.setStatus(newStatus);
        entry.setValue(order);
        return order;
    }
}
```

In this case, `OrderStatus` must implement `Serializable` in addition to having a
Compact serializer, because Java's `ObjectOutputStream` does not know about Hazelcast's
Compact serialization.

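The failure mode can be reproduced with the JDK alone. The following is a minimal sketch, not Hazelcast code: `FieldSerializationDemo`, `StatusHolder`, and the nested `OrderStatus` are hypothetical stand-ins, and no Compact serializer is involved, since `ObjectOutputStream` would not consult one anyway.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class FieldSerializationDemo {

    // Hypothetical stand-in for a class that only has a Compact serializer:
    // it does not implement Serializable.
    static class OrderStatus {
        final String name;

        OrderStatus(String name) {
            this.name = name;
        }
    }

    // Hypothetical stand-in for an entry processor: Serializable itself,
    // but holding a field whose type is not.
    static class StatusHolder implements Serializable {
        private static final long serialVersionUID = 1L;
        final OrderStatus status;

        StatusHolder(OrderStatus status) {
            this.status = status;
        }
    }

    // Returns true if plain Java serialization accepts the object graph.
    static boolean javaSerializable(Object candidate) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(candidate);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // prints false: the non-null OrderStatus field is rejected
        System.out.println(javaSerializable(new StatusHolder(new OrderStatus("SHIPPED"))));
        // prints true: a null field has nothing to serialize
        System.out.println(javaSerializable(new StatusHolder(null)));
    }
}
```

The second call suggests why this kind of problem can go unnoticed: the check happens at runtime on the actual field values, so it only surfaces once a non-null, non-`Serializable` value is present.
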
The same applies to lambdas that capture Compact-serializable variables:

```java
// OrderStatus has a CompactSerializer, but capturing `status` in a lambda
// requires OrderStatus to also implement Serializable
OrderStatus status = new OrderStatus("SHIPPED");
pipeline.readFrom(source)
        .filter(order -> order.getStatus().equals(status));
```

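The lambda case can also be sketched with the JDK alone. In the example below, `SerializablePredicate` is a hypothetical stand-in for a serializable functional interface such as the ones Jet pipelines accept, and `OrderStatus` is again a hypothetical class without the `Serializable` marker. It is an illustration under those assumptions, not the Hazelcast implementation.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.function.Predicate;

class LambdaCaptureDemo {

    // Hypothetical stand-in for a serializable functional interface.
    interface SerializablePredicate<T> extends Predicate<T>, Serializable {
    }

    // Hypothetical class with no Serializable marker.
    static class OrderStatus {
        final String name;

        OrderStatus(String name) {
            this.name = name;
        }
    }

    // Returns true if plain Java serialization accepts the object graph.
    static boolean javaSerializable(Object candidate) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(candidate);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The lambda captures the whole OrderStatus object.
    static SerializablePredicate<String> capturingObject() {
        OrderStatus status = new OrderStatus("SHIPPED");
        return name -> name.equals(status.name);
    }

    // The lambda captures only a String, which is Serializable.
    static SerializablePredicate<String> capturingString(String expected) {
        return name -> name.equals(expected);
    }

    public static void main(String[] args) {
        // prints false: the captured OrderStatus is serialized with the lambda and rejected
        System.out.println(javaSerializable(capturingObject()));
        // prints true: every captured value is Serializable
        System.out.println(javaSerializable(capturingString("SHIPPED")));
    }
}
```

Both lambdas behave identically when called locally; the difference only appears when the lambda itself is serialized, because captured values are written along with it.
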
=== Workarounds

To use Compact-serializable objects with pipelines:

1. **Use service factories**: For non-serializable dependencies, use `mapUsingService()`
   to create objects on the target member rather than capturing them in lambdas.

2. **Implement `Serializable` as well**: Have your classes implement `Serializable` in addition
   to registering a Compact serializer. The Compact serializer is still used when
   Hazelcast serializes objects for storage in maps or transmission between members.

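The second workaround can be sketched as follows. This is a JDK-only illustration with hypothetical names (`DualSerializationDemo`, `StatusHolder`, `OrderStatus`); registering the Compact serializer happens separately in the Hazelcast configuration and is not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class DualSerializationDemo {

    // Hypothetical data class: the Serializable marker lets Java serialization
    // handle it. A Compact serializer could still be registered for it (not shown).
    static class OrderStatus implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;

        OrderStatus(String name) {
            this.name = name;
        }
    }

    // Hypothetical stand-in for an entry processor holding an OrderStatus field.
    static class StatusHolder implements Serializable {
        private static final long serialVersionUID = 1L;
        final OrderStatus status;

        StatusHolder(OrderStatus status) {
            this.status = status;
        }
    }

    // Serializes the holder with Java serialization and reads it back.
    static StatusHolder roundTrip(StatusHolder holder) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(holder);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (StatusHolder) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        StatusHolder copy = roundTrip(new StatusHolder(new OrderStatus("SHIPPED")));
        System.out.println(copy.status.name); // prints SHIPPED
    }
}
```

With the marker interface in place, the same object graph that Java serialization would otherwise reject now survives a round trip.
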
== Serialization of Data Types

The objects you store in xref:data-structures:distributed-data-structures.adoc[Hazelcast data structures] must be serializable.

docs/modules/serialization/pages/compact-serialization.adoc

Lines changed: 6 additions & 0 deletions
@@ -1663,6 +1663,12 @@ public class Employee implements Serializable {
}
----


NOTE: When using Jet pipelines, lambda expressions and certain pipeline components (such as
`EntryProcessor` implementations) are serialized using Java serialization, which does not
delegate to Hazelcast's serialization service. In these cases, captured objects must
implement `Serializable` even if they have Compact serializers registered. For details,
see xref:pipelines:serialization.adoc#using-compact-serialization-with-pipelines[Using Compact Serialization with Pipelines].

== Compact Serialization Binary Specification

The binary specification of compact serialization is publicly available at xref:ROOT:compact-binary-specification.adoc[this page].
