Description
or "How do we describe time in asynchronous APIs?"
When messages are sent to a channel from multiple sources, it is common for the resulting messages to be slightly out of sequence. This can be because the different producers each have system clocks that are slightly out of sync, or because they take slightly different amounts of time to generate and/or send the events to the channel.
In such situations, developers who create a receiver to perform time-sensitive processing of this stream of messages need to take this into account.
There are established ways to handle this correctly. Stream processing libraries, frameworks and applications (such as Apache Flink, Apache Spark, Kafka Streams, etc.) characterise this requirement in different ways - but often as handling “late” events or handling events out of sequence. Regardless of the specific technology, the common factor is that, to process the stream of messages correctly, the consumer needs to know that events may arrive out of sequence, and needs to know the degree of lateness to expect.
We propose making this a property of channels, because it is a characteristic of the sequence of messages, not of any individual message.
One possible way of documenting this could look like this:
components:
  messages:
    transaction:
      payload:
        properties:
          id:
            type: string
            format: uuid
            description: unique id of a financial transaction
          accountnum:
            type: string
            description: identifier of the account that the transaction applied to
          txnvalue:
            type: number
            description: value of the transaction in some currency
          txntimestamp:
            type: string
            format: date-time
            description: official time that the financial transaction applied
channels:
  accountTransactions:
    messages:
      transaction:
        $ref: '#/components/messages/transaction'
    time:
      # Maximum lateness for events on the channel,
      # in milliseconds
      maxLateness: 180000
      # What is the canonical time for each message?
      # In the message metadata? Or in the payload?
      # If not specified, a reasonable default would
      # be to assume the metadata holds the
      # reference time
      location: payload
      # Which property holds the canonical time for
      # each message? (applicable when the timestamp
      # is in the payload)
      attribute: txntimestamp

Expressing the degree to which events can be out of sequence as “lateness” is a convenient way to put a numeric value to it, and is consistent with the way that it is described in many common stream processing technologies.
Putting this as a property of a “time” object will allow other time-based characteristics of the channel to be documented.
The example shows other values that could be useful to document, such as whether the time for a given event should be taken from a property in the payload or from the envelope or metadata.
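To illustrate how a consumer might use `location` and `attribute`, here is a minimal Python sketch. It is not part of the proposal: the `event_time` function, the message shape (a dict with `headers` and `payload`), and the fallback header name `timestamp` are all assumptions made for the example.

```python
from datetime import datetime


def event_time(message, time_spec):
    """Return the canonical time of a message, as described by the channel's
    proposed `time` object.

    `message` is assumed to be a dict with "headers" and "payload" dicts.
    `time_spec` is the parsed `time` object from the channel definition.
    """
    # Default suggested in the proposal: the metadata holds the reference time.
    location = time_spec.get("location", "metadata")
    # "timestamp" is a hypothetical default header name, not from the proposal.
    attribute = time_spec.get("attribute", "timestamp")

    source = message["payload"] if location == "payload" else message["headers"]
    # Timestamps are assumed to be ISO 8601 strings with an explicit offset.
    return datetime.fromisoformat(source[attribute])


time_spec = {"maxLateness": 180000, "location": "payload", "attribute": "txntimestamp"}
message = {
    "headers": {"timestamp": "2024-01-01T10:00:05+00:00"},
    "payload": {"txntimestamp": "2024-01-01T10:00:00+00:00"},
}
print(event_time(message, time_spec).isoformat())
```

With the `time` object from the example above, the payload's `txntimestamp` is used; with an empty `time` object, the sketch falls back to the header.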
For developers of message senders, documenting lateness is a contract - a design constraint and commitment to delivering events from all sources within the time they specify.
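As a rough illustration of what checking that contract could look like, the sketch below measures the lateness actually observed in a recorded sequence of events. It assumes one common definition of lateness (how far an event's time falls behind the latest event time already seen on the channel); the function name and the sample data are hypothetical.

```python
from datetime import datetime, timedelta


def max_observed_lateness_ms(event_times):
    """Return the largest lateness, in milliseconds, for event times given
    in arrival order. Lateness is measured against the latest event time
    seen so far (an assumed definition, not mandated by the proposal)."""
    high_water = None
    worst = 0
    for ts in event_times:
        if high_water is None or ts > high_water:
            high_water = ts
        else:
            late_ms = (high_water - ts) // timedelta(milliseconds=1)
            worst = max(worst, late_ms)
    return worst


# Event times in the order the messages arrived on the channel.
arrivals = [
    datetime.fromisoformat("2024-01-01T10:00:00+00:00"),
    datetime.fromisoformat("2024-01-01T10:02:00+00:00"),
    datetime.fromisoformat("2024-01-01T10:00:30+00:00"),  # 90 seconds late
]

MAX_LATENESS_MS = 180000  # value of maxLateness in the example above
within_contract = max_observed_lateness_ms(arrivals) <= MAX_LATENESS_MS
```

A sender could run a check like this against captured traffic to see whether the documented `maxLateness` is realistic.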
For developers of message receivers, the lateness will let them configure the framework or library that they use to correctly process events.
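For receivers that are not using a stream processing framework, the same value could drive a simple reordering buffer. The sketch below is a simplified take on the watermark idea used by stream processors, not the API of any particular framework; the `ReorderBuffer` class and its behaviour for events that exceed the documented lateness are assumptions.

```python
import heapq
from datetime import datetime, timedelta


class ReorderBuffer:
    """Hold events until no earlier event can still arrive, given the
    channel's documented maxLateness, then release them in time order."""

    def __init__(self, max_lateness_ms):
        self.max_lateness = timedelta(milliseconds=max_lateness_ms)
        self.heap = []         # min-heap ordered by event time
        self.watermark = None  # latest event time seen, minus max lateness
        self.too_late = []     # events that broke the lateness contract
        self._seq = 0          # tie-breaker so messages are never compared

    def push(self, ts, message):
        """Add an event and return the events that are now safe to process."""
        if self.watermark is not None and ts < self.watermark:
            # Later than the channel promised; set aside rather than reorder.
            self.too_late.append((ts, message))
            return []
        heapq.heappush(self.heap, (ts, self._seq, message))
        self._seq += 1
        candidate = ts - self.max_lateness
        if self.watermark is None or candidate > self.watermark:
            self.watermark = candidate
        released = []
        while self.heap and self.heap[0][0] <= self.watermark:
            event_ts, _, event = heapq.heappop(self.heap)
            released.append((event_ts, event))
        return released


buf = ReorderBuffer(180000)  # maxLateness from the example above
base = datetime(2024, 1, 1, 10, 0, 0)
first = buf.push(base + timedelta(minutes=2), "b")   # nothing released yet
second = buf.push(base, "a")                         # out of sequence, buffered
third = buf.push(base + timedelta(minutes=4), "c")   # "a" is now safe to release
```

The trade-off is latency: every event is held for up to `maxLateness` before it is processed, which is why receivers need the documented value rather than a guess.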
We are looking for feedback on how and where to best document this characteristic of asynchronous APIs.