Skip to content

Conversation

muhamadazmy
Copy link
Contributor

@muhamadazmy muhamadazmy commented Oct 2, 2025

[storage] Draft for Envelope v2

Summary

  • Separate header, keys, and other envelope related
    information from the actual payload of the envelope
  • Ability to decode envelope header without decoding the full envelope
  • Support different encoding for the payload (records) (flexbuffers, or bilrost)
  • Delay decoding of payload until needed

TODO:

  • Decode v1 envelope into v2
  • Use Envelope v2 in code

Stack created with Sapling. Best reviewed with ReviewStack.

@muhamadazmy muhamadazmy force-pushed the pr3853 branch 4 times, most recently from 75eea1e to 7464a02 Compare October 6, 2025 09:29
@muhamadazmy
Copy link
Contributor Author

@tillrohrmann in case you want to take a look the interesting parts are in the v2.rs file in this PR

}

#[derive(Debug, Clone, Copy, bilrost::Message)]
pub struct MessageIndexRecrod {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: Record.

{
$crate::storage::decode::decode_serde(buf, kind).map_err(|err| {
::tracing::error!(%err, "{} decode failure (decoding {})", kind, stringify!($name));
$crate::storage::tracing::error!(%err, "{} decode failure (decoding {})", kind, stringify!($name));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not quite sure why is change needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using this macro in a crate that doesn't have tracing as a dependency will fail. I had this issue when i was trying to use the macro in the wal-protocol crate which does not depend on tracing.

So I had 2 options:

  • add tracing as a dependency to the crate. Only for the sake of this macro. The dependency will not be "used" in the crate anywhere else.
  • use a better macro hygiene and use tracing from the declaring crate. Which what is done here.

This way users of the macros will not need to pull extra dependencies (tracing in this case)

&self,
buf: &mut ::bytes::BytesMut,
) -> Result<(), $crate::storage::StorageEncodeError> {
bytes::BufMut::put(buf, $crate::storage::encode::encode_bilrost(self));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This potentially is better served if we pass the BufMut implementation to the encoder, perhaps by adding another variant (encode_bilrost_buf(buf)). Albeit, it's hard to tell which approach would yield better performance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not all bilrost encode variants (for example encode_contiguous which is used here) can make use of the BufMut. But we can pass it anyway and leave the implementation of the encode_bilrost function to decide what to do.

pub index: MessageIndex,
}

bilrost_storage_encode_decode!(MessageIndexRecrod);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps the type should be named after the wal command (i.e. TruncateOutboxRequest) which btw needs to contain the PartitionId.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree! As for the partition-id this record message was mapped 1:1 from v1 without much thought about the payload it carries. I think the reason it doesn't have the partition id is that that's part of the envelope Header (destination)

#[bilrost(3)]
dedup: Dedup,
#[bilrost(4)]
keys: Keys,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this is actually needed. On the read path, we ship the keys directly from bifrost (one layer up from the envelope). On the write-path, the appender needs the input record to implement HasRecordKeys which could be implemented on a wrapper type used as input to the appender. Essentially making the input type different than output type to save a few bytes on each record.

impl WithPartitionKey for Header {
fn partition_key(&self) -> PartitionKey {
match self.destination {
Destination::None => unimplemented!("expect destinationt to be set"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo @ destinationt

#[bilrost(1)]
source: Source,
#[bilrost(2)]
destination: Destination,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

destination is potentially unnecessary unless you see a clear use-case for it.

{
match kind {
StorageCodecKind::FlexbuffersSerde => {
todo!("implement loading from envelop V1")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
todo!("implement loading from envelop V1")
todo!("implement loading from envelope V1")

record! {
@name=PurgeJournal,
@kind=RecordKind::PurgeJournal,
@payload=PurgeInvocationRequest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it true that it has the exact same payload as PurgeInvocation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what v1 does. Yes.

# Summary
- Separate header, keys, and other envelope related
information from the actual payload of the envelope
- Ability to decode envelope header without decoding the full envelope
- Support different encoding for the payload (records) (flexbuffers, or bilrost)
- Delay decoding of payload until needed

# TODO:
- [ ] Decode v1 envelope into v2
- [ ] Use Envelope v2 in code
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants