feature request: working draft for cloudevents spec #8768
Co-authored-by: Doug Davis <[email protected]>
datacontenttype: string;
dataschema: string;
subject: string;
time: string;
How are these optional field serialized? For example, if "subject" is missing, is there still some "subject" type of entry there? Does the word "subject" actually appear in the serialization?
======================================================================
FLATBUFFERS OPTIONAL FIELD SERIALIZATION TEST
======================================================================
Question: How are optional fields serialized?
Specifically: Does 'subject' appear in the binary when omitted?
======================================================================
CloudEvent WITH 'subject' field
======================================================================
Total size: 144 bytes
Hex dump:
0000: 18 00 00 00 00 00 12 00 18 00 14 00 10 00 0c 00 ................
0010: 08 00 00 00 00 00 04 00 12 00 00 00 14 00 00 00 ................
0020: 20 00 00 00 34 00 00 00 38 00 00 00 54 00 00 00 ...4...8...T...
0030: 0a 00 00 00 6d 79 2d 73 75 62 6a 65 63 74 00 00 ....my-subject..
0040: 11 00 00 00 63 6f 6d 2e 65 78 61 6d 70 6c 65 2e ....com.example.
0050: 65 76 65 6e 74 00 00 00 03 00 00 00 31 2e 30 00 event.......1.0.
0060: 1a 00 00 00 68 74 74 70 73 3a 2f 2f 65 78 61 6d ....https://exam
0070: 70 6c 65 2e 63 6f 6d 2f 73 6f 75 72 63 65 00 00 ple.com/source..
0080: 09 00 00 00 65 76 65 6e 74 2d 31 32 33 00 00 00 ....event-123...
The word 'subject' APPEARS in the binary data
Position: 55
----------------------------------------------------------------------
Structure Analysis: WITH subject
----------------------------------------------------------------------
Root table offset (absolute from start): 24
Table starts at: 24
VTable relative offset: 18
VTable starts at: 6
VTable size: 18 bytes
Object inline size: 24 bytes
Number of field entries: 7
Field            Required?  Offset (rel)  Absolute Offset  Present?
--------------------------------------------------------------------------------
id               True       20            44               True
source           True       16            40               True
specversion      True       12            36               True
type             True       8             32               True
datacontenttype  False      0             -                False
dataschema       False      0             -                False
subject          False      4             28               True
time             False      -             -                NO
extensions       False      -             -                NO
data             False      -             -                NO
======================================================================
CloudEvent WITHOUT 'subject' field
======================================================================
Total size: 116 bytes
Hex dump:
0000: 10 00 00 00 0c 00 14 00 10 00 0c 00 08 00 04 00 ................
0010: 0c 00 00 00 10 00 00 00 24 00 00 00 28 00 00 00 ........$...(...
0020: 44 00 00 00 11 00 00 00 63 6f 6d 2e 65 78 61 6d D.......com.exam
0030: 70 6c 65 2e 65 76 65 6e 74 00 00 00 03 00 00 00 ple.event.......
0040: 31 2e 30 00 1a 00 00 00 68 74 74 70 73 3a 2f 2f 1.0.....https://
0050: 65 78 61 6d 70 6c 65 2e 63 6f 6d 2f 73 6f 75 72 example.com/sour
0060: 63 65 00 00 09 00 00 00 65 76 65 6e 74 2d 34 35 ce......event-45
0070: 36 00 00 00 6...
✓ The word 'subject' does NOT appear in the binary data.
However, the structure of the vtable still lets readers
determine that the field is absent.
----------------------------------------------------------------------
Structure Analysis: WITHOUT subject
----------------------------------------------------------------------
Root table offset (absolute from start): 16
Table starts at: 16
VTable relative offset: 12
VTable starts at: 4
VTable size: 12 bytes
Object inline size: 20 bytes
Number of field entries: 4
Field            Required?  Offset (rel)  Absolute Offset  Present?
--------------------------------------------------------------------------------
id               True       16            32               True
source           True       12            28               True
specversion      True       8             24               True
type             True       4             20               True
datacontenttype  False      -             -                NO
dataschema       False      -             -                NO
subject          False      -             -                NO
time             False      -             -                NO
extensions       False      -             -                NO
data             False      -             -                NO
======================================================================
COMPARISON & CONCLUSIONS
======================================================================
Size with subject: 144 bytes
Size without subject: 116 bytes
Difference: 28 bytes
======================================================================
VERIFICATION: Reading back the data
======================================================================
Event WITH subject:
ID: event-123
Subject: my-subject
Event WITHOUT subject:
ID: event-456
Subject: None
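As a rough cross-check of the analysis above, the sketch below walks the vtable of the 116-byte "WITHOUT subject" buffer by hand, using only the standard library. The bytes are copied from the hex dump. The slot order in `FIELDS` is taken from the structure analysis, and `read_string_field` is a hypothetical helper written for this illustration, not part of the FlatBuffers API or of the gist.

```python
import struct

# The 116-byte buffer from the "WITHOUT subject" hex dump above.
BUF = bytes.fromhex(
    "100000000c00140010000c0008000400"
    "0c000000100000002400000028000000"
    "4400000011000000636f6d2e6578616d"
    "706c652e6576656e7400000003000000"
    "312e30001a00000068747470733a2f2f"
    "6578616d706c652e636f6d2f736f7572"
    "63650000090000006576656e742d3435"
    "36000000"
)

# Slot order assumed from the structure analysis above.
FIELDS = ["id", "source", "specversion", "type",
          "datacontenttype", "dataschema", "subject"]


def read_string_field(buf, slot):
    """Return the string stored in vtable slot `slot`, or None if it is absent."""
    (table,) = struct.unpack_from("<I", buf, 0)        # root table position
    (soffset,) = struct.unpack_from("<i", buf, table)  # signed distance back to the vtable
    vtable = table - soffset
    (vtable_size,) = struct.unpack_from("<H", buf, vtable)
    entry = vtable + 4 + 2 * slot                      # skip the two uint16 size headers
    if entry >= vtable + vtable_size:
        return None                                    # slot lies past the end of the vtable
    (rel,) = struct.unpack_from("<H", buf, entry)
    if rel == 0:
        return None                                    # slot exists but the field was not written
    ref = table + rel                                  # where the string's uoffset is stored
    (uoffset,) = struct.unpack_from("<I", buf, ref)
    start = ref + uoffset
    (length,) = struct.unpack_from("<I", buf, start)
    return buf[start + 4:start + 4 + length].decode("utf-8")


event = {name: read_string_field(BUF, i) for i, name in enumerate(FIELDS)}
print(event["id"], event["subject"])
```

Both "absent" cases return None: a slot past the end of the vtable (the NO rows) and a slot whose entry is 0 (the rows showing offset 0 in the WITH subject analysis).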
The test to reproduce this can be found in this gist:
https://gist.github.com/allen-munsch/295ab0b944ae0c2816896945e3f168a2
Thanks. You may have answered this during the previous call, but how do receivers get the fbs/schema files? Are they just known in advance or are they shared via some other mechanism?
I think cloudevents_spec.fbs would need to be known in advance.
dataschema is kinda confusing, because it's actually a URI, not just a string.
I see 2 approaches for a datacontenttype of application/cloudevents+flatbuffers:
- pre-sharing the dataschema and verifying via the dataschema URI
- dynamically compiling the dataschema URI so that the data field can be read
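A minimal sketch of the first approach (pre-sharing), assuming receivers keep a lookup from dataschema URI to a decoder built ahead of time. The registry, the function names, and the example.com URI are all hypothetical, and the decoder is a stand-in for a flatc-generated accessor. The second approach (compiling on the fly) is not shown.

```python
from typing import Callable, Dict

# Hypothetical registry: dataschema URI -> decoder built from a pre-shared .fbs file.
DECODERS: Dict[str, Callable[[bytes], object]] = {}


def register_schema(uri: str, decoder: Callable[[bytes], object]) -> None:
    """Record the decoder for a schema both sides agreed on in advance."""
    DECODERS[uri] = decoder


def decode_data(dataschema: str, data: bytes) -> object:
    """Decode `data` only if its dataschema URI matches a pre-shared schema."""
    try:
        decoder = DECODERS[dataschema]
    except KeyError:
        raise LookupError(f"no pre-shared schema for {dataschema!r}") from None
    return decoder(data)


# Stand-in decoder; real code would call a flatc-generated accessor here.
register_schema("https://example.com/schemas/order.fbs",
                lambda raw: raw.decode("utf-8"))
print(decode_data("https://example.com/schemas/order.fbs", b"payload"))
```

With this shape, an event whose dataschema is unknown fails loudly instead of being read with the wrong layout.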
FlatBuffers achieves forward and backward compatibility through an offset-based vtable (virtual table) system:

- **Forward compatibility**: Readers using older schemas can read data written with newer schemas by ignoring unknown fields
- **Backward compatibility**: Readers using newer schemas can read data written with older schemas by treating missing fields as unset
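The two rules can be illustrated with the vtable entries reported in the test output earlier in this thread. This is a simplified model, not the FlatBuffers runtime: `OLD_SCHEMA`, `NEW_SCHEMA`, and `visible_fields` are names made up for the sketch, and a vtable is reduced to a plain list of field offsets.

```python
# Hypothetical older schema: only the four required attributes.
OLD_SCHEMA = ["id", "source", "specversion", "type"]
# Hypothetical newer schema: the same fields plus three appended ones.
NEW_SCHEMA = OLD_SCHEMA + ["datacontenttype", "dataschema", "subject"]


def visible_fields(schema, vtable_entries):
    """Map each field the reader's schema knows to its offset, or None if unset."""
    result = {}
    for slot, name in enumerate(schema):
        # A slot past the end of the vtable is treated like an entry of 0.
        offset = vtable_entries[slot] if slot < len(vtable_entries) else 0
        result[name] = offset or None
    return result


newer_data = [20, 16, 12, 8, 0, 0, 4]  # vtable entries from the WITH subject dump
older_data = [16, 12, 8, 4]            # vtable entries from the WITHOUT subject dump

# Forward compatibility: the old reader never looks at slots 4 to 6.
print(visible_fields(OLD_SCHEMA, newer_data))
# Backward compatibility: the new reader sees the missing slots as unset.
print(visible_fields(NEW_SCHEMA, older_data))
```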
I can't help but wonder if extensions could be handled in a similar way to how these compatibility rules work. Meaning, what if we treat the presence (and serialization) of an extension as if it were a "newer schema" and then Readers can choose to process the extension if they know about it, or ignore it if they don't. Then we don't need a special "extensions" table.
Would this mean the cloud event envelope would be tied to each extension type, so that adding an extension might require re-compiling the envelope?
Is extensions meant as an arbitrary bucket?
I'm confused a bit, here's how I interpreted what you wrote.
cloudevents_spec.fbs envelope would stay the same across events:
{
extensions: {blah: asdf}
}
Where cloudevents.asdf.fbs has data in the envelope, like this?
{
blah: asdf
}
See the example here: https://github.com/cloudevents/spec/blob/main/cloudevents/spec.md#example
Notice that the extension attributes appear just like all other attributes. That's a key design point of CE.
While not all protocols support that (e.g. proto and apparently fb), we try when we can.
In this case, a "spec defined" fbs file could do one of two things:
1 - define just the schema for the spec defined attributes - so no extensions allowed
2 - define a bucket for extensions, like you've done - but then violate our design pattern.
In a previous comment I asked about how receivers get the fbs files. If they are expected to "just have it", then (if I'm understanding things correctly), they'll skip over any unknown fields (extensions) automatically. Which may actually be ok if they don't care about them. However, if they know about certain extensions, then they could use a newer fbs file that defines those - assuming there's only one set of extensions they might receive. Right?
In a strongly typed/defined environment where both sender and receivers MUST have prior knowledge of each other's schema, I think it would be better to just create the fbs file with the extensions as top-level fields. But if you're in a more dynamic world, then I think the "bucket" you've defined might be the only choice.
If I'm correct, the question would then be: is it too complicated to mention this choice, or is it better to just have one proposed solution?
We talked a bit about this on today's call and we're leaning towards what we have here: define a bucket for extensions. While the other approach is possible, it'll probably lead to more confusion.
If you get a chance, can you fix the PR in our repo? The text still isn't quite right
Yes, I will take a look.
I should have gotten this to you sooner, but I think the flat on root approach would be possible like this:
// cloudevents.fbs
namespace io.cloudevents;

table CloudEvent {
  // ---- REQUIRED ATTRIBUTES ----
  id: string (id: 0, required);
  source: string (id: 1, required);
  specversion: string (id: 2, required);
  type: string (id: 3, required);

  // ---- OPTIONAL ATTRIBUTES ----
  datacontenttype: string (id: 4);
  dataschema: string (id: 5);
  subject: string (id: 6);
  time: string (id: 7);
  data: [ubyte] (id: 8);

  // ---- Example APPEND-ONLY FIELDS ----
  // ---- MUST be appended, field order matters here
  // acme: string (id: 9);
  // foo: string (id: 10);
  // bar: ulong (id: 11, deprecated);
  // baz: string (id: 12);
  // foobar: string (id: 13);
  // foobaz: string (id: 14);
}

root_type CloudEvent;
However, if the consensus is extensions, then that is okay too. Personally I liked your suggestion to align with the CE spec, because FlatBuffers would certainly support it, I think.
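The append-only rule in the schema comments above could be checked mechanically before a new fbs file is published. The sketch below is a hypothetical helper, not part of flatc or any FlatBuffers tooling. It assumes a schema is described as a list of (name, id) pairs and that "append-only" means every existing pair is kept and every added field gets a higher id.

```python
# (name, id) pairs copied from the CloudEvent table above.
BASE = [("id", 0), ("source", 1), ("specversion", 2), ("type", 3),
        ("datacontenttype", 4), ("dataschema", 5), ("subject", 6),
        ("time", 7), ("data", 8)]


def is_append_only(old_fields, new_fields):
    """True if new_fields keeps every old (name, id) pair and only adds higher ids."""
    old, new = dict(old_fields), dict(new_fields)
    # Every existing field must still be present with the same id.
    if any(new.get(name) != field_id for name, field_id in old.items()):
        return False
    highest_old = max(old.values(), default=-1)
    # Every added field must take an id above all existing ones.
    return all(field_id > highest_old
               for name, field_id in new.items() if name not in old)


appended = BASE + [("acme", 9), ("foo", 10)]
reshuffled = [("acme", 0)] + [(name, field_id + 1) for name, field_id in BASE]

print(is_append_only(BASE, appended))
print(is_append_only(BASE, reshuffled))
```

The second case fails because inserting `acme` at id 0 shifts every existing attribute to a different vtable slot, so older readers would read the wrong field.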
Co-authored-by: Doug Davis <[email protected]>
Signed-off-by: allen-munsch <[email protected]>
The following is a draft PR to discuss adding a CNCF CloudEvents specification for flatbuffers.
References:
My original use case was to use flatbuffers in a zero-copy pass-through proxy in a low-latency environment.