diff --git a/pip/pip-394.md b/pip/pip-394.md
new file mode 100644
index 0000000000000..363758607dd4a
--- /dev/null
+++ b/pip/pip-394.md
@@ -0,0 +1,105 @@
+# PIP-394: Add two interfaces `CursorMetadataSerializerProvider` and `CursorMetadataDeSerializerProvider` to support newer or customized cursor metadata serializations
+
+# Background knowledge
+
+**1. What does cursor metadata contain**
+
+- Cursor properties.
+- The entry id that indicates where the cursor metadata was most recently persisted.
+- Information about individually acknowledged messages, which we call `individualDeletedMessages`.
+- Information about individually acknowledged batched messages, which we call `batchedEntryDeletionIndexInfo`.
+
+**2. The improvements we made to the persistence of cursor metadata**
+- https://github.com/apache/pulsar/pull/758: skip persisting information that exceeds the maximum number of ranges to persist.
+- https://github.com/apache/pulsar/issues/14529: compress the info when persisting.
+- https://github.com/apache/pulsar/pull/9292: add a new compression strategy: change Range Objects to `long[]`.
+
+# Motivation
+
+**Issue-1: Compatibility of the improvements**
+
+- The third improvement was contributed with `release:4.0`, which is a new LTS version.
+  - It changed the default serialization implementation to the one introduced by https://github.com/apache/pulsar/pull/9292.
+- Users cannot roll back to `3.0.x` once they have upgraded to `4.0.x`, because `release:3.0.x` does not contain the deserialization introduced by https://github.com/apache/pulsar/pull/9292.
+
+**Issue-2: Frequent Young GC related to cursor metadata persistence when a broker has too many active subscriptions, even after the improvements above**
+
+`individualDeletedMessages` and `batchedEntryDeletionIndexInfo` are often the largest attributes of the metadata. They are serialized into proto objects when being persisted, but these proto-typed objects cannot be recycled because they are immutable.
+
+![375661781-51d5bd6d-f5a1-48d7-921a-975875fe8bed](https://github.com/user-attachments/assets/dd1eb135-7dee-4dd1-84ba-994618a8198e)
+
+
+# Goals
+
+- Guarantee compatibility for rollback from `4.0.x` to `3.0.x`.
+  - This PIP will be cherry-picked into `branch-3.0` and `branch-3.3`.
+- Support customized cursor metadata serializers to address the issues users have encountered, such as **Issue-1** in the Motivation.
+
+# High Level Design
+
+### Design
+
+- We call the serialization implemented before `4.0.0` `V1`, and the one introduced by https://github.com/apache/pulsar/pull/9292 `V2`.
+- Add all versions of the serialization into `branch-3.0`.
+  - Set the default in `3.0.x` to `V1`, which is the same as the current status.
+  - Set the default in `4.0.x` to `V1`, which is the same as the current status.
+- Add two interfaces `CursorMetadataSerializerProvider` and `CursorMetadataDeSerializerProvider` to support newer or customized cursor metadata serializations.
+
+### Public API
+
+**CursorMetadataSerializerProvider.java**
+```java
+CursorMetadataSerializer newProvider(Name, PulsarService);
+```
+
+**CursorMetadataDeSerializerProvider.java**
+```java
+CursorMetadataDeserializer newProvider(Name, PulsarService);
+```
+
+**CursorMetadataSerializer.java**
+```java
+ManagedCursorInfo serialize(Position markDeletePosition,
+                            Map properties,
+                            LongPairRangeSet individualDeletedMessages,
+                            Map batchDeletedIndexes);
+```
+
+**CursorMetadataDeserializer.java**
+```java
+ManagedCursorInfo deserialize(ByteBuf data);
+```
+
+### Public-facing Changes & Binary protocol
+- If you use a customized `CursorMetadataSerializer`, it may break tools that read the cursor ZK node, such as `pulsar-managed-ledger-admin`.
+
+### Configuration
+
+**broker.conf**
+```properties
+cursorMetadataSerializerProvider=V2
+cursorMetadataDeserializerProvider=V1,V2
+```
+
+### In Scope and Out of Scope
+
+This PIP only adds the interfaces `CursorMetadataSerializerProvider` and `CursorMetadataDeSerializerProvider`; implementations other than `V1` and `V2` will not be provided.
+
+# Backward & Forward Compatibility
+
+## Upgrade
+
+Nothing to do.
+
+## Downgrade / Rollback
+
+- I will cherry-pick this PIP into `branch-3.0` and `branch-3.3`.
+- Because https://github.com/apache/pulsar/pull/9292 changed the cursor metadata serialization, once you have upgraded to `4.0.x` from a lower version, you can only downgrade to a version that contains this PIP.
+
+# Links
+
+
+* Mailing List discussion thread: https://lists.apache.org/thread/xy1prwcv4wdoobphcgloj7s5gxy05qq3
+* Mailing List voting thread: https://lists.apache.org/thread/x8bf9hvk1pvo0dl0q3mcjh08wg90s89k