feature request: working draft for cloudevents spec #8768
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,274 @@ | ||
| # FlatBuffers Event Format for CloudEvents - Version 1.0.0-wip | ||
|
|
||
| ## Abstract | ||
|
|
||
| The FlatBuffers Format for CloudEvents defines how CloudEvents are expressed using the FlatBuffers serialization format and its schema language. | ||
|
|
||
| ## Table of Contents | ||
|
|
||
| 1. [Introduction](#1-introduction) | ||
| 2. [Attributes](#2-attributes) | ||
| 3. [Data](#3-data) | ||
| 4. [Transport](#4-transport) | ||
| 5. [Schema Evolution](#5-schema-evolution) | ||
| 6. [Examples](#6-examples) | ||
|
|
||
| ## 1. Introduction | ||
|
|
||
| CloudEvents is a standardized and protocol-agnostic definition of the structure and metadata description of events. This specification defines how CloudEvents are to be represented using FlatBuffers. | ||
|
|
||
| The Attributes section describes the naming conventions and data type mappings for CloudEvents attributes. | ||
|
|
||
| This specification does not define an envelope format. The FlatBuffers type system is primarily intended to provide a consistent set of types for structured data serialization. | ||
|
|
||
| The FlatBuffers event format does not currently define a batch mode format. | ||
|
|
||
| ### 1.1. Conformance | ||
|
|
||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119. | ||
|
|
||
| ## 2. Attributes | ||
|
|
||
| This section defines how CloudEvents attributes are mapped to the FlatBuffers type system. This specification explicitly maps each attribute. | ||
|
|
||
| ### 2.1 Type System Mapping | ||
|
|
||
| The CloudEvents type system MUST be mapped to FlatBuffers types as follows: | ||
|
|
||
| | CloudEvents | FlatBuffers | | ||
| |----------------|------------------------------------------| | ||
| | Boolean | bool | | ||
| | Integer | int | | ||
| | String | string | | ||
| | Binary | [ubyte] (vector of unsigned bytes) | | ||
| | URI | string following RFC 3986 §4.3 | | ||
| | URI-reference | string following RFC 3986 §4.1 | | ||
| | Timestamp | string following RFC 3339 (ISO 8601) | | ||
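As an illustration of the mapping table above, the following Python sketch converts native values into the representation a FlatBuffers field would hold. The function name `to_flatbuffers_value` is hypothetical, and normalizing timestamps to UTC with a `Z` suffix is an assumption made for this example, not a requirement stated in this draft.

```python
from datetime import datetime, timezone


def to_flatbuffers_value(value):
    """Return the value in the form the mapping table assigns to its type."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return value  # Boolean -> bool
    if isinstance(value, int):
        # CloudEvents Integer is a signed 32-bit whole number.
        if not -2**31 <= value <= 2**31 - 1:
            raise ValueError("CloudEvents Integer must fit in 32 bits")
        return value  # Integer -> int
    if isinstance(value, (bytes, bytearray)):
        return bytes(value)  # Binary -> [ubyte]
    if isinstance(value, datetime):
        if value.tzinfo is None:
            raise ValueError("Timestamp needs an explicit timezone")
        # Timestamp -> RFC 3339 string (normalized to UTC in this sketch)
        return value.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")
    if isinstance(value, str):
        return value  # String, URI, URI-reference -> string
    raise TypeError(f"no CloudEvents mapping for {type(value).__name__}")


example_time = to_flatbuffers_value(datetime(2019, 6, 5, 23, 45, tzinfo=timezone.utc))
```

URI and URI-reference values are passed through as strings here; checking them against RFC 3986 is left out of the sketch.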
|
|
||
| Extension specifications MAY define secondary mapping rules for the values of attributes they define, but MUST also include the previously defined primary mapping. | ||
|
|
||
| ### 2.2 REQUIRED Attributes | ||
|
|
||
| The following attributes are REQUIRED to be present in all CloudEvents: | ||
|
|
||
| - **id**: string (required) | ||
|
||
| - **source**: string (required) | ||
| - **specversion**: string (required) | ||
| - **type**: string (required) | ||
|
|
||
| ### 2.3 OPTIONAL Attributes | ||
|
|
||
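| The CloudEvents specification defines OPTIONAL attributes. In FlatBuffers, table fields of non-scalar types such as `string` are optional unless they are marked `(required)`, so no special union syntax is needed; an unset field reads back as null. | ||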
|
|
||
| The following OPTIONAL attributes are defined: | ||
|
|
||
| - **datacontenttype**: string | ||
| - **dataschema**: string | ||
| - **subject**: string | ||
| - **time**: string | ||
|
|
||
| ### 2.4 Extension Attributes | ||
|
|
||
| CloudEvents allows arbitrary extension attributes. In FlatBuffers these are represented as a vector of key-value tables (`ExtensionAttribute`), one entry per extension attribute. | ||
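A minimal sketch of how extension attributes could be flattened into key-value pairs before they are written into the buffer, and filtered again on the reading side. The helper names are hypothetical. Rendering every value as a string is an assumption based on the `value: string` field of the `ExtensionAttribute` table in section 2.5, and the name check reflects the CloudEvents rule that attribute names consist of lower-case ASCII letters and digits.

```python
import re


def extensions_to_pairs(extensions):
    """Turn a dict of extension attributes into sorted (key, value) string pairs."""
    pairs = []
    for key in sorted(extensions):
        if not re.fullmatch(r"[a-z0-9]+", key):
            raise ValueError(f"invalid extension attribute name: {key!r}")
        value = extensions[key]
        if isinstance(value, bool):
            value = "true" if value else "false"  # canonical boolean strings
        pairs.append((key, str(value)))
    return pairs


def pairs_to_extensions(pairs, known=None):
    """Rebuild a dict from pairs, keeping only keys the reader recognizes."""
    return {k: v for k, v in pairs if known is None or k in known}


pairs = extensions_to_pairs({"traceid": "abc123", "sampled": True})
recognized = pairs_to_extensions(pairs, known={"traceid"})
```

Sorting the keys is a choice made here for deterministic output; nothing in this draft requires a particular order.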
|
|
||
| ### 2.5 Schema Definition | ||
|
|
||
| Users of FlatBuffers MUST use a schema compatible with the CloudEvents FlatBuffers schema: | ||
|
|
||
| ```fbs | ||
| namespace io.cloudevents; | ||
|
|
||
| // Key-value pair for extension attributes | ||
| table ExtensionAttribute { | ||
|
||
| key: string (required); | ||
| value: string (required); | ||
| } | ||
|
|
||
| // Main CloudEvent record | ||
| table CloudEvent { | ||
| // REQUIRED attributes | ||
| id: string (required); | ||
| source: string (required); | ||
| specversion: string (required); | ||
| type: string (required); | ||
|
|
||
| // OPTIONAL attributes | ||
| datacontenttype: string; | ||
| dataschema: string; | ||
| subject: string; | ||
| time: string; | ||
|
How are these optional fields serialized? For example, if "subject" is missing, is there still some "subject" type of entry there? Does the word "subject" actually appear in the serialization?
Author
To reproduce, the test can be found in this gist: https://gist.github.com/allen-munsch/295ab0b944ae0c2816896945e3f168a2
Thanks. You may have answered this during the previous call, but how do receivers get the fbs/schema files? Are they just known in advance or are they shared via some other mechanism?
Author
I think
I see 2 approaches for
|
||
|
|
||
| // Extension attributes | ||
| extensions: [ExtensionAttribute]; | ||
|
||
|
|
||
| // Event data payload | ||
| data: [ubyte]; | ||
| } | ||
|
|
||
| root_type CloudEvent; | ||
| ``` | ||
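For readers who want to reason about the schema without running `flatc`, the following Python sketch mirrors the `CloudEvent` table as a plain dataclass. It is only an in-memory model: the class name `CloudEventRecord` and its `validate` method are hypothetical and are not what `flatc` generates.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class CloudEventRecord:
    # REQUIRED attributes (marked `(required)` in the schema)
    id: str
    source: str
    specversion: str
    type: str
    # OPTIONAL attributes: None models an unset FlatBuffers field
    datacontenttype: Optional[str] = None
    dataschema: Optional[str] = None
    subject: Optional[str] = None
    time: Optional[str] = None
    # Extension attributes as (key, value) pairs, like [ExtensionAttribute]
    extensions: List[Tuple[str, str]] = field(default_factory=list)
    # Event data payload, like [ubyte]
    data: Optional[bytes] = None

    def validate(self):
        """Raise if any REQUIRED attribute is missing or empty."""
        for name in ("id", "source", "specversion", "type"):
            if not getattr(self, name):
                raise ValueError(f"missing REQUIRED attribute: {name}")
        return self


event = CloudEventRecord(
    id="7a0dc520-c870-4193c8",
    source="https://github.com/cloudevents",
    specversion="1.0",
    type="com.example.object.user.created.v2",
).validate()
```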
|
|
||
| ## 3. Data | ||
|
|
||
| Before encoding, the FlatBuffers serializer MUST first determine the runtime data type of the content. This can be determined by examining the data for invalid UTF-8 sequences or by consulting the `datacontenttype` attribute. | ||
|
|
||
| The `data` field is represented as a vector of unsigned bytes (`[ubyte]`), which can store: | ||
|
|
||
| - Binary data directly | ||
| - UTF-8 encoded strings | ||
| - JSON serialized as UTF-8 bytes | ||
| - Any other serialized format | ||
|
|
||
| The implementation MUST store the data value in the `data` field as a byte vector. | ||
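The rules in this section can be sketched in Python as follows. The function names are hypothetical, and falling back to compact JSON for values that are neither bytes nor strings is an assumption made for the example.

```python
import json


def encode_data(content):
    """Return the bytes that would be stored in the `data` byte vector."""
    if isinstance(content, (bytes, bytearray)):
        return bytes(content)  # binary data is stored directly
    if isinstance(content, str):
        return content.encode("utf-8")  # strings become UTF-8 bytes
    # Anything else is serialized as JSON, then UTF-8 encoded (assumption).
    return json.dumps(content, separators=(",", ":")).encode("utf-8")


def looks_like_text(payload):
    """Check for invalid UTF-8 sequences, one of the two detection options."""
    try:
        payload.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


json_payload = encode_data({"user": "alice"})
```

Inspecting the bytes is only a heuristic, since valid UTF-8 can still be binary content; consulting `datacontenttype` is the more reliable of the two options when it is set.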
|
|
||
| ## 4. Transport | ||
|
|
||
| Transports that support content identification MUST use the following designation: | ||
|
|
||
| ``` | ||
| application/cloudevents+flatbuffers | ||
| ``` | ||
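A receiver could recognize the designation above with a small check like the one below. The function name is hypothetical, and ignoring media-type parameters and letter case is an assumption about how a transport binding would compare content types.

```python
FLATBUFFERS_EVENT_TYPE = "application/cloudevents+flatbuffers"


def is_flatbuffers_cloudevent(content_type):
    """Compare the media type, ignoring parameters and letter case."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == FLATBUFFERS_EVENT_TYPE
```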
|
|
||
| ## 5. Schema Evolution | ||
|
|
||
| FlatBuffers provides robust schema evolution capabilities that are essential for event-driven architectures where producers and consumers evolve independently. | ||
|
|
||
| ### 5.1 Compatibility Guarantees | ||
|
|
||
| FlatBuffers achieves forward and backward compatibility through an offset-based vtable (virtual table) system: | ||
|
|
||
| - **Forward compatibility**: Readers using older schemas can read data written with newer schemas by ignoring unknown fields | ||
| - **Backward compatibility**: Readers using newer schemas can read data written with older schemas by treating missing fields as unset | ||
|
I can't help but wonder if extensions could be handled in a similar way to how these compatibility rules work. Meaning, what if we treat the presence (and serialization) of an extension as if it were a "newer schema" and then Readers can choose to process the extension if they know about it, or ignore it if they don't. Then we don't need a special "extensions" table.
Author
Would this mean the cloud event envelope would be tied to each extension type, so that it might require re-compiling the envelope? Is extension meant as an arbitrary bucket? I'm a bit confused; here's how I interpreted what you wrote.
Where
See the example here: https://github.com/cloudevents/spec/blob/main/cloudevents/spec.md#example In this case, a "spec defined" fbs file could do one of two things:
In a previous comment I asked about how receivers get the fbs files. If they are expected to "just have it", then (if I'm understanding things correctly), they'll skip over any unknown fields (extensions) automatically. Which may actually be ok if they don't care about them. However, if they know about certain extensions, then they could use a newer fbs file that defines those, assuming there's only one set of extensions they might receive. Right?
In a strongly typed/defined environment where both senders and receivers MUST have prior knowledge of each other's schema, I think it would be better to just create the fbs file with the extensions as top-level fields. But if you're in a more dynamic world, then I think the "bucket" you've defined might be the only choice. If I'm correct, the question would then be: is it too complicated to mention this choice, or is it better to just have one proposed solution?
We talked a bit about this on today's call and we're leaning towards what we have here: define a bucket for extensions. While the other approach is possible, it'll probably lead to more confusion. If you get a chance, can you fix the PR in our repo? The text still isn't quite right.
Author
Yes, I will take a look. I should have gotten this to you sooner, but I think the flat-on-root approach would be possible like this:
However, if the consensus is extensions, then that is okay too. Personally I liked your suggestion to align on the CE spec, because FlatBuffers certainly would support it, I think.
|
||
|
|
||
| This is accomplished through: | ||
|
|
||
| 1. **Vtable indirection**: Fields are accessed via offset lookups in a vtable rather than fixed positions | ||
| 2. **Field IDs**: Fields are identified by their declaration order (or by an explicit `id` attribute), not their position in the buffer | ||
| 3. **Optional field detection**: Readers detect missing fields when vtable entries are absent (offset = 0) | ||
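To make the three mechanisms above concrete, the following Python sketch hand-assembles a small table of string fields and reads it back using only the `struct` module. The helper names `build_table` and `read_string_field` are invented for this illustration. The byte layout is written by hand to follow the documented FlatBuffers table and vtable structure and is not produced by `flatc`, so treat it as a teaching model rather than a reference encoder.

```python
import struct


def build_table(strings):
    """Assemble a buffer holding one table of string fields (None = unset)."""
    vt_pos = 4                              # vtable follows the root offset
    vt_size = 4 + 2 * len(strings)          # two header entries + one per slot
    table_pos = vt_pos + vt_size
    table_pos += (-table_pos) % 4           # align the table to 4 bytes
    present = [s for s in strings if s is not None]
    table_size = 4 + 4 * len(present)       # vtable offset + one ref per set field
    buf = bytearray(table_pos + table_size)
    struct.pack_into("<I", buf, 0, table_pos)               # root offset
    struct.pack_into("<HH", buf, vt_pos, vt_size, table_size)
    struct.pack_into("<i", buf, table_pos, table_pos - vt_pos)
    field_off = 4
    for slot, text in enumerate(strings):
        if text is None:
            continue                        # vtable entry stays 0: field absent
        struct.pack_into("<H", buf, vt_pos + 4 + 2 * slot, field_off)
        ref_pos = table_pos + field_off
        str_pos = len(buf)
        raw = text.encode("utf-8")
        buf += struct.pack("<I", len(raw)) + raw + b"\0"
        buf += b"\0" * ((-len(buf)) % 4)    # keep later strings aligned
        struct.pack_into("<I", buf, ref_pos, str_pos - ref_pos)
        field_off += 4
    return bytes(buf)


def read_string_field(buf, slot):
    """Read a string field by slot number via the vtable, or return None."""
    table_pos = struct.unpack_from("<I", buf, 0)[0]
    vt_pos = table_pos - struct.unpack_from("<i", buf, table_pos)[0]
    vt_size = struct.unpack_from("<H", buf, vt_pos)[0]
    entry = 4 + 2 * slot
    if entry >= vt_size:
        return None                         # writer's schema did not know this slot
    field_off = struct.unpack_from("<H", buf, vt_pos + entry)[0]
    if field_off == 0:
        return None                         # slot known but left unset
    ref_pos = table_pos + field_off
    str_pos = ref_pos + struct.unpack_from("<I", buf, ref_pos)[0]
    length = struct.unpack_from("<I", buf, str_pos)[0]
    return buf[str_pos + 4:str_pos + 4 + length].decode("utf-8")


# Slot 0 is set, slot 1 is unset, slot 2 is set; slot 3 was never in the vtable.
buf = build_table(["evt-1", None, "subj"])
```

Reading slot 1 or slot 3 both return `None`, but for different reasons: slot 1 has a zero vtable entry, while slot 3 lies past the end of the writer's vtable, which is how a newer reader handles data from an older writer. A reader that only asks for slot 0 never looks at slot 2, which is the forward-compatible case.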
|
|
||
| ### 5.2 CloudEvent Envelope Schema Evolution | ||
|
|
||
| The CloudEvent envelope has required and optional fields as defined in section 2.5. | ||
|
|
||
| **Required fields** (id, source, specversion, type): | ||
| - Must always be present in all CloudEvents | ||
| - Cannot be removed or made optional | ||
|
|
||
| **Optional fields** (datacontenttype, dataschema, subject, time): | ||
| - May be present or absent | ||
| - Readers treat missing optional fields as unset/null | ||
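| - New optional fields can be appended, and existing ones omitted from individual events, without breaking compatibility; a field retired from the schema should be marked `deprecated` rather than deleted | ||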
|
|
||
| **Extension attributes:** | ||
| - Dynamic key-value pairs in the extensions array | ||
| - Keys can be added or removed freely without schema changes | ||
| - Readers process only recognized keys and ignore unknown keys | ||
|
|
||
| ### 5.3 Implementation Considerations | ||
|
|
||
| **Buffer structure:** | ||
| - FlatBuffers serializes tables with vtable headers containing field offset mappings | ||
| - Optional fields that are not set occupy no space in the table data; at most they leave a zero entry in the vtable | ||
| - Field access requires vtable lookup followed by offset-based memory access | ||
|
|
||
| **Performance characteristics:** | ||
| - Zero-copy deserialization enables direct buffer access | ||
| - Optional field access has constant-time overhead (vtable lookup) | ||
| - Missing field detection is O(1) via vtable entry check | ||
|
|
||
| **Migration best practices:** | ||
| - Deploy consumers before producers when adding optional fields | ||
| - Maintain compatibility windows when deprecating fields | ||
| - Use feature flags or configuration to control field usage during transitions | ||
| - Test compatibility with both old and new schema versions | ||
|
|
||
| ### 5.4 Cross-Architecture Compatibility | ||
|
|
||
| FlatBuffers guarantees binary compatibility across heterogeneous architectures through its wire format specification and the `dataschema` field mechanism. | ||
|
|
||
| **Wire format guarantee:** | ||
| - All FlatBuffers data MUST be serialized in little-endian byte order | ||
| - FlatBuffers implementations handle endianness conversion transparently | ||
| - Consumers using the schema from `dataschema` can deserialize data regardless of architecture | ||
|
|
||
| **Schema evolution across architectures:** | ||
| - Adding or removing optional fields is safe across all architectures | ||
| - Field presence is determined by schema-defined vtable entries, not byte patterns | ||
| - The `dataschema` URI ensures both producer and consumer use compatible schemas | ||
| - Endianness conversion is handled by FlatBuffers runtime, not application code | ||
|
|
||
| **Key principle:** | ||
| When both producer and consumer obtain their schema from the same `dataschema` URI, FlatBuffers guarantees correct deserialization regardless of underlying processor architecture (x86-64, ARM, SPARC, PowerPC) or endianness (little-endian or big-endian). Schema evolution operations (adding/removing optional fields) maintain this guarantee. | ||
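The effect of a fixed little-endian wire format can be illustrated with Python's `struct` module, where the `<` prefix forces little-endian encoding regardless of the host. The helper names below are hypothetical; FlatBuffers runtimes perform the equivalent conversion internally.

```python
import struct
import sys


def encode_uint32_le(value):
    """Encode as little-endian on every host, as the wire format requires."""
    return struct.pack("<I", value)


def decode_uint32_le(raw):
    """Decode little-endian bytes into a native integer on every host."""
    return struct.unpack("<I", raw)[0]


wire = encode_uint32_le(0x12345678)
host_order = sys.byteorder  # "little" or "big"; does not affect `wire`
```

Because the byte order is fixed by the format string rather than taken from `sys.byteorder`, the same bytes are produced and understood on both little-endian and big-endian machines.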
|
|
||
| ## 6. Examples | ||
|
|
||
| ### 6.1 Using dataschema for Dynamic Deserialization | ||
|
|
||
| The `dataschema` field enables dynamic schema resolution and runtime compilation for FlatBuffers data payloads: | ||
|
|
||
| **Producer workflow:** | ||
| 1. Compile domain schema (e.g., `UserCreated.fbs`) using `flatc` | ||
| 2. Serialize data structure into FlatBuffers binary format | ||
| 3. Set `dataschema` to URI hosting the `.fbs` schema file | ||
| - Example: `https://schemas.example.com/events/v1/UserCreated.fbs` | ||
| - Alternative: `https://registry.example.com/schemas/v2/UserCreated.fbs` | ||
| - Alternative: `fbs://internal-registry/com.example.events.UserCreated/v1` | ||
| 4. Populate `data` field with serialized bytes | ||
| 5. Set `datacontenttype` to `application/cloudevents+flatbuffers` | ||
|
|
||
| **Alternative producer workflow (JSON data):** | ||
| 1. Serialize data structure into JSON format | ||
| 2. Set `dataschema` to URI hosting the JSON Schema file | ||
| - Example: `https://schemas.example.com/events/v1/UserCreated.json` | ||
| 3. Populate `data` field with UTF-8 encoded JSON bytes | ||
| 4. Set `datacontenttype` to `application/json` | ||
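The two producer workflows above differ only in how the payload is serialized and which content type and schema URI are attached. The sketch below shows that shared shape; `build_attributes` is a hypothetical helper, and the FlatBuffers payload bytes are a placeholder standing in for the output of `flatc`-generated code, which is outside the scope of this example.

```python
import json


def build_attributes(event_id, source, event_type, payload, schema_uri, content_type):
    """Collect the attribute values a producer would write into the CloudEvent table."""
    return {
        "id": event_id,
        "source": source,
        "specversion": "1.0",
        "type": event_type,
        "datacontenttype": content_type,
        "dataschema": schema_uri,
        "data": bytes(payload),
    }


# FlatBuffers payload: the bytes below are a placeholder, not a real buffer.
fbs_event = build_attributes(
    "7a0dc520-c870-4193c8",
    "https://github.com/cloudevents",
    "com.example.object.user.created.v2",
    b"\x00\x01\x02\x03",
    "https://schemas.example.com/events/v1/UserCreated.fbs",
    "application/cloudevents+flatbuffers",
)

# JSON payload: serialized to UTF-8 bytes before being stored in `data`.
json_event = build_attributes(
    "7a0dc520-c870-4193c8",
    "https://github.com/cloudevents",
    "com.example.object.user.created.v2",
    json.dumps({"user": "alice"}).encode("utf-8"),
    "https://schemas.example.com/events/v1/UserCreated.json",
    "application/json",
)
```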
|
|
||
| **Consumer workflow (first encounter):** | ||
| 1. Extract `dataschema` URI from CloudEvent | ||
| 2. Fetch `.fbs` schema file via HTTP GET | ||
| 3. Compile schema dynamically using `flatc` or FlatBuffers reflection API | ||
| 4. Cache compiled schema artifacts keyed by URI | ||
| 5. Deserialize `data` bytes using compiled schema | ||
| 6. Process deserialized structure | ||
|
|
||
| **Consumer workflow (cached schema):** | ||
| 1. Extract `dataschema` URI | ||
| 2. Lookup cached compiled schema | ||
| 3. Deserialize `data` bytes directly | ||
| 4. Process deserialized structure | ||
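The two consumer workflows above amount to a cache keyed by the `dataschema` URI. A minimal sketch follows, in which `SchemaCache` is a hypothetical class and the `fetch` and `compile_schema` callables are injected stand-ins for an HTTP GET and for whichever compilation route (`flatc` or the reflection API) a consumer chooses.

```python
class SchemaCache:
    """Cache compiled schema artifacts keyed by `dataschema` URI."""

    def __init__(self, fetch, compile_schema):
        self._fetch = fetch              # callable: URI -> schema source text
        self._compile = compile_schema   # callable: schema text -> compiled artifact
        self._cache = {}

    def get(self, uri):
        # First encounter: fetch and compile. Later encounters: reuse the artifact.
        if uri not in self._cache:
            self._cache[uri] = self._compile(self._fetch(uri))
        return self._cache[uri]


# Stand-in callables so the sketch runs without network access or flatc.
fetch_calls = []


def fake_fetch(uri):
    fetch_calls.append(uri)
    return "table UserCreated { name: string; }"


def fake_compile(schema_text):
    return {"compiled_from": schema_text}


cache = SchemaCache(fake_fetch, fake_compile)
uri = "https://schemas.example.com/events/v1/UserCreated.fbs"
first = cache.get(uri)
second = cache.get(uri)
```

A production consumer would likely also want cache expiry and a policy for which schema hosts it trusts, since fetching and compiling schemas from arbitrary URIs has security implications.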
|
|
||
| This approach enables schema evolution without requiring consumers to update code or redeploy when new event types are introduced. The reflection API allows runtime buffer navigation without code generation, while dynamic `flatc` invocation provides strongly-typed accessors. | ||
|
|
||
| ### 6.2 Exemplary Mappings | ||
|
|
||
| The following table shows exemplary mappings: | ||
|
|
||
| | CloudEvents Attribute | Type | Exemplary FlatBuffers Value | | ||
| |-----------------------|--------|--------------------------------------------------| | ||
| | id | string | "7a0dc520-c870-4193c8" | | ||
| | source | string | "https://github.com/cloudevents" | | ||
| | specversion | string | "1.0" | | ||
| | type | string | "com.example.object.user.created.v2" | | ||
| | datacontenttype | string | "application/cloudevents+flatbuffers" | | ||
| | dataschema | string | "https://schemas.example.com/events/v2/UserCreated.fbs" | | ||
| | subject | string | "user-created-123456789.fbs" | | ||
| | time | string | "2019-06-05T23:45:00Z" | | ||
| | data | [ubyte]| [FlatBuffers serialized bytes] | | ||
|
|
||
| ## References | ||
|
|
||
| - [FlatBuffers](https://github.com/google/flatbuffers/) | ||
| - [Google PubSub Spec](https://github.com/googleapis/google-cloudevents/blob/main/docs/spec/pubsub.md) | ||
| - [CloudEvents Specification v1.0](https://github.com/cloudevents/spec/blob/v1.0/spec.md) | ||
| - [Avro Compact CloudEvents](https://github.com/cloudevents/spec/blob/main/cloudevents/working-drafts/cloudevents-compact.avsc) | ||
| - [FlatBuffers Documentation](https://google.github.io/flatbuffers/) | ||
| - [RFC 3339 - Date and Time on the Internet: Timestamps](https://tools.ietf.org/html/rfc3339) | ||
| - [RFC 3986 - Uniform Resource Identifier (URI): Generic Syntax](https://tools.ietf.org/html/rfc3986) | ||
|
|
||
| ## License | ||
|
|
||
| This specification follows the same license as the CloudEvents project (Apache 2.0). | ||
|
|
||
| ## Contributing | ||
|
|
||
| This is a work-in-progress specification. Feedback and contributions are welcome through the CloudEvents working group. | ||