169 changes: 169 additions & 0 deletions standards/application/inbox.md
@plopezlpz plopezlpz Jul 15, 2025

Am I understanding this correctly? This is the flow:

  1. I share out of band with Bob my inbox_id: one, my e and s, (also my conversation id?)
  2. I subscribe to the content topic for my inbox_id: lower_hex(blake2s("/inbox/one"))
  3. Eventually Bob sends a message to the inbox's content topic (encrypted for me).

Nobody knows that Bob is the sender. People might know that it's an inbox I'm using if I share it with others, but others might be using the same inbox.

If that is the case, I don't know what the role of the conversation_id is.

Original file line number Diff line number Diff line change
@@ -0,0 +1,169 @@
---
title: INBOX
name: Inbound message queues
category: Standards Track
status: raw
tags: chat
editor: Jazz Alyxzander<[email protected]>
contributors:
---
# Abstract

An Inbox is a declaration of where a client is listening for messages, and the protocol for sending them.
Contributor

After reading the spec, it seems like the purpose of Inbox may be more explicitly stated as the address where a client is waiting for an invite (only) to a conversation that allows it to set up a private conversation elsewhere (out of scope)?

This may indicate that this spec will make more sense as a subsection of a more complete Conversation protocol.


# Background / Rationale / Motivation

Communication protocols often face a fundamental bootstrapping problem: how can two parties establish secure, authenticated communication when they lack trusted channels?
Traditional centralized approaches rely on servers to deliver invitations using pre-agreed upon encryption mechanisms.
This process requires servers to know the recipient's identity so that messages can be delivered, which leaks metadata.

In a decentralized context, there are no centralized servers to handle delivery, so senders must know where clients are listening for messages.
Protocols have traditionally opted for a static location, which forces the protocol to choose a one-size-fits-all solution when there are inherent trade-offs between privacy and resource usage.

# Theory / Semantics

Inboxes are a standardized, receive-only communication primitive that serves as a secure entry point for initial contact establishment.
Inboxes define where entities can safely receive connection requests, invitations, and introductory messages independent of more complex protocols.

From a usage perspective, inboxes have several unique properties:
- There is no assumption of exclusive usage: many clients can use the same inbox, though they will ignore messages addressed to others.
- There is no associated keypair: messages are encrypted to existing identities, using the defined encryption mechanism.
- There is no restriction on how many inboxes a client can have (cardinality unbounded).
- Developers/Contributors can determine which contacts learn of an inbox.
Contributor

Suggested change
- Developers/Contributors can determine which contacts learn of an inbox.
- Developers/Users can determine which contacts learn of an inbox.

Contributors are developers. But I assume a developer MAY let a user have some control.

Contributor Author

Contributors are developers.

Contributors are developers who contribute and build novel conversationTypes.

Developers are consumers of the SDK.

The distinction here is that both applications and conversation specifications can determine inbox visibility.

Contributor

@fryorcraken fryorcraken Jul 18, 2025

Contributors are developers who contribute and build novel conversationTypes.

Maybe worth defining this in the spec. Or are they spec implementers?


## Parameters

To define an inbox, the following parameters must be set:
- **inbox_address:** a string value of length >=1. This value ought to be considered visible to observers.
Comment on lines +36 to +37
Contributor

Could we rephrase this in the form of directives? As it reads, I'm not sure where this inbox_address comes from.

For example:

To receive conversation invites,
a participant MUST define an `inbox_address`.
This MUST be any string value with a length of at least 1.
The `inbox_address` SHOULD be advertised to observers who may want to initiate a conversation to this participant.
The mechanism for advertising this address falls outside the scope of this spec.

This kind of directive usually follows in a "Protocol", "Syntax" or "Message Flow" section after the terms have been introduced, so may not belong under "Parameters".
Perhaps we need a separate section after Wire Protocol that is explicit about message flows and which actions are taken by which actors?


## Summary

ConversationId: `/convo/inbox/v1/<client_address>`
Contributor

Is the idea of the ConversationId to tie a single conversation state machine together in the app layer? If that is the case, perhaps it should not be mentioned in this spec (for now), as the current terms seem focused just on the strict interaction of the lower-layer action of initiating a conversation.

A more general point is that I think this spec perhaps makes more sense as a subsection for a more complete "conversations(?)" spec that explains interaction from beginning to end when Alice wants to speak to Bob, is explicit about our prior assumptions, and the message flow from setup to teardown of the conversation. You might already plan this, so not an urgent suggestion. It would be good to make sure that terms are defined either before or just after they're introduced. For example, I'm not sure what the client_address is in this context.

ContentTopic: `lower_hex(blake2s("/inbox/<inbox_address>"))`
Contributor

This does not match the recommended content topic format: https://rfc.vac.dev/waku/informational/23/topics#content-topics

I assume this would be the content-topic-name. What do we think re application-name ? I assume we would want to set one for the chat sdk.

Contributor

Does this refer to a Waku ContentTopic? I don't think the INBOX spec should know about the filtering concepts in the routing layer. In any case, we would likely not want a separate content topic for each different inbox - this would be a massive load on filter nodes and clients. We likely want to define e.g. 8 content topics for all inbox addresses, with a modulo-hash of the inbox_address more or less randomly distributing inbox addresses to those content topics.

My preference here is usually to define inbox_address in this spec and separately define an implementation spec that is explicit about the entire conversation stack, including Waku usage. We could for example define our content topic preferences in some kind of Conversations Over Waku spec, by a phrasing such as:

...conversations (over Waku) MUST implement <INBOX-SPEC>. The `inbox_address` defined there MUST be used as input to modulo-hash function that selects between `n` content topics, of format...

In other words, someone can implement INBOX without requiring Waku or understanding Waku concepts.
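
A minimal sketch of the modulo-hash idea above, assuming BLAKE2s as the hash, UTF-8 encoding of the `inbox_address`, and `n = 8` topics (the "e.g. 8" from this comment). The function names and the topic string layout are hypothetical; the actual content topic format would be fixed by a separate "Conversations over Waku" style spec.

```python
import hashlib

NUM_INBOX_TOPICS = 8  # "e.g. 8" in the comment; not a decided value


def inbox_bin(inbox_address: str, n: int = NUM_INBOX_TOPICS) -> int:
    """Map an inbox_address to one of n bins via hash-then-modulo."""
    if not inbox_address:
        raise ValueError("inbox_address must be a non-empty string")
    digest = hashlib.blake2s(inbox_address.encode("utf-8")).digest()
    # Interpret the digest as a big-endian integer and reduce it modulo n.
    return int.from_bytes(digest, "big") % n


def binned_topic_name(inbox_address: str, n: int = NUM_INBOX_TOPICS) -> str:
    """Hypothetical topic name; only the bin index is derived from the address."""
    return f"inbox-bin-{inbox_bin(inbox_address, n)}"
```

With this shape, many inbox addresses share each topic, so a client subscribes to one topic per inbox and discards messages that are not addressed to it.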

Contributor

Minor comment, especially given my comment above, but why blake2 and not something simpler like a farm hash (especially if we're not all that concerned about collisions)?

Contributor Author

@jazzz jazzz Jul 15, 2025

@jm-clius

we would likely not want a separate content topic for each different inbox - this would be a massive load on filter nodes and clients.

This seems like a blocker that is worth diving into, as this not only applies to Inboxes but also future conversation types.

Does this refer to a Waku ContentTopic

Yes(ish) - I'm assuming Waku would need to prefix the ContentTopics to align with spec. So the content topics are a generalization which can then be transformed into the required format.


Content-based filtering using 12/FILTER helps reduce resource usage on clients. However, if filtering is limited to static “bins” of message traffic, it won’t effectively alleviate resource constraints for edge clients.

In the case of Inbox, your suggestion works nicely (assuming one inbox per account), as the messages have a single destination.

However, moving to a Conversation-based-addressing approach for private Conversations, this becomes problematic. As messages are addressed to Conversations rather than individual participants, a client's messages are not isolated to a single "bin".

```mermaid
flowchart TD

B0(Bin 0)
B1(Bin 1)
B2(Bin 2)
B3(Bin 3)
B4(Bin 4)

B0 --> CA("Convo A")
B2 --> CB("Convo B")
B3 --> CC("Convo C")
B3 --> CD("Convo D")
```

In the diagram above a participant of Conversations A,B,C,D would need to subscribe to 3/5 of the network traffic.

As interesting traffic in a conversation-based-addressing approach cannot be isolated to a single "Bin", all clients end up requiring most of the traffic, which makes filtering moot.
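
The arithmetic behind the diagram can be sketched as follows. The conversation-to-bin mapping is copied from the flowchart; the assumptions that every bin carries an equal share of traffic and that conversations hash uniformly into bins are simplifications added here for illustration.

```python
NUM_BINS = 5
# Mapping taken from the diagram: Convo A -> Bin 0, B -> Bin 2, C and D -> Bin 3.
CONVO_BINS = {"A": 0, "B": 2, "C": 3, "D": 3}


def subscribed_fraction(convos, convo_bins, num_bins):
    """Fraction of bins a client must subscribe to for the given conversations."""
    bins = {convo_bins[c] for c in convos}
    return len(bins) / num_bins


def expected_fraction(num_bins, num_convos):
    """Expected fraction of bins touched when conversations are assigned
    to bins uniformly at random: 1 - (1 - 1/n)^k."""
    return 1 - (1 - 1 / num_bins) ** num_convos


diagram = subscribed_fraction("ABCD", CONVO_BINS, NUM_BINS)  # 3 of 5 bins
```

Under the uniform assumption, `expected_fraction` approaches 1 quickly as the number of conversations grows relative to the number of bins, which is the concern raised here.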


What is the limiting factor on the number of subscriptions? Any ideas for how to get around this limitation?

Contributor

The first suggestion that comes to mind is considering the Waku content topic as a delivery address (in line with Delivery Service terminology).

I think in general it is fair to limit the number of delivery addresses an application need to listen to. If you swap Waku for something like Nostr, you could imagine a Nostr relay server (+ maybe some inbox id) being a delivery address.

So now it means that to send to an inbox, we need 2 pieces of information:

  1. Inbox address
  2. Delivery address

We traditionally hashed (1) to end up in a bucket for (2). The Status protocol does:

-> secp private key -> secp public key -> hash + truncate -> topic code -> Waku content topic

The nice property of that is randomness in the bucket you end up with (derived from the private key randomness), allowing fair distribution of users across the content topic buckets (and then shards via auto-sharding).

We purposely want to move away from secp keys defining your inbox and identity. Hence, I don't think we can copy the old system.

So instead, I would suggest:

  • random generation of a delivery address for account: this should be mostly static, but can be rotated slowly. eg targeting a max of 5 in use for a given installation, and 1-2 for normal usage
  • generation of inbox address as it is

A "full" inbox address then becomes delivery address + inbox address, where only the delivery address is part of the Waku content topic.

Based on this parameter, we can then bucket delivery addresses in a flexible manner. For example, a v1 of the chat SDK would bucket all delivery addresses into 10 possible ones, and a v2 into 100.

Thanks to the flexibility of inboxes, upgrading can be easy, as a user can listen to both versions until they have fully deprecated their old inboxes.
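
One way to picture this proposal, as a rough sketch only: the bucket counts (10 for v1, 100 for v2) come from the text above, while the 128-bit random delivery address, the BLAKE2s hash, and every name below are assumptions made for illustration.

```python
import hashlib
import secrets
from dataclasses import dataclass

# Bucket counts per scheme version, as suggested above (v1: 10, v2: 100).
BUCKETS_BY_VERSION = {1: 10, 2: 100}


@dataclass(frozen=True)
class FullInboxAddress:
    delivery_address: str  # the only part that feeds the content topic
    inbox_address: str     # carried alongside, not part of the topic


def new_delivery_address() -> str:
    """Randomly generated and mostly static (assumed here: 128 random bits)."""
    return secrets.token_hex(16)


def delivery_bucket(delivery_address: str, version: int) -> int:
    """Bucket index for a delivery address under a given scheme version."""
    n = BUCKETS_BY_VERSION[version]
    digest = hashlib.blake2s(delivery_address.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % n


def buckets_to_listen(addr: FullInboxAddress, versions=(1, 2)):
    """During an upgrade a client listens on the bucket of every version in use."""
    return {v: delivery_bucket(addr.delivery_address, v) for v in versions}
```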


Finally, let's avoid re-using names that we have across the stack (eg waku content topic vs chat content topic). I think the "content topic" name was already a bad idea knowing underlying stack has "pubsub topic". Let's not repeat history.

Contributor

What is the limiting factor on the number of subscriptions?

The number of subscriptions affect the efficiency of filter and store protocols.

Any ideas for how to get around this limitation?

I don't think one should try to get around it when you see it as a delivery address.

But what would be important is for the core team to recommend the number of content topics (avg and max) a Waku node should aim to subscribe to (relay/filter/store) to help you decide on the best approach. cc @jm-clius.

Contributor Author

I think in general it is fair to limit the number of delivery addresses an application need to listen to.

So instead, I would suggest:
random generation of a delivery address for account: this should be mostly static, but can be rotated slowly. eg targeting a max of 5 in use for a given installation, and 1-2 for normal usage

My concern is that this would mean that the entire networks messaging traffic would need to be sent to the same "delivery address" to stay within that limit. A client's access pattern cannot easily be isolated to single "delivery address", assuming Conversation-based-message-routing is used.
Dividing traffic is easy in Inbox-based-message-routing (the method used in Status), but it's resource intensive on the network, as messages are copied to each inbox via client-side fanout. Expensive copies of payloads are what make querying efficient for clients.

There is a trade-off between network load and client-side resource management. If limited to ~2 subscriptions, then approaches like MLS could use orders of magnitude more bandwidth than the existing Status solution, even though there is a significant drop in overall message counts network-wide.

Contributor Author

This is all outside of this specification - Although if conversation-based-routing is not feasible then Inboxes may not be the desired solution.

I'm going to pause (if not close) this PR and perform some testing to see what the exposure is before sinking more cycles.

Contributor Author

@jazzz jazzz Jul 18, 2025

For reference:

image

Contributor

random generation of a delivery address for account: this should be mostly static, but can be rotated slowly. eg targeting a max of 5 in use for a given installation, and 1-2 for normal usage

My concern is that this would mean that the entire networks messaging traffic would need to be sent to the same "delivery address" to stay within that limit. A client's access pattern cannot easily be isolated to single "delivery address", assuming Conversation-based-message-routing is used.

Let's see what @jm-clius says here.

All I can say is that the usage of 100 content topics for Communities was problematic (cc @chaitanyaprem @plopezlpz) for filter and store.

However, in the context of store, this was coupled with time range queries, and expectation of 30 days message access. Which we are moving away from with SDS.

In terms of filter, 100 content topics did mean slow time to setup the subscriptions and getting the "app ready".

But as per @jm-clius's original comment, we still need to understand the expected usage of topics to comment here.

For example:

  • content topic per active conversation (2 or more users).
  • 5 baseline content topics for inbound inboxes

Maybe scalable enough? I can see in my chat app I have hundreds of convos, but only maybe 10 are really "active"?

Contributor

I think my main point is that this spec should exist on its own terms. It makes the assumptions:

  • chat users exist that want to be notified of invitations to new conversations
  • these chat users have access to a transport layer network (any one) that delivers them messages belonging to these inbox addresses

How we "divide up" inbox addresses, conversations, etc. into different filters (or even different routing shards) is an external concern - this spec only requires that the user has access to their messages.

As a rule of thumb, we currently suggest around 30 live content topic subscriptions per client. However, this is likely something that can be increased if we improve our implementation.

I think we could come up with various filter strategies that are compatible both with keeping bandwidth constrained and with keeping the filter cardinality reasonable.
It may even be possible to do a content topic per conversation as long as we keep the number of concurrent conversation subscriptions limited (stale conversations could e.g. require a notification to the user's inbox address to reactivate that as a "live" conversation subscription).
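
As a rough sketch of that last idea: keep the inbox subscription permanently live and cap the total number of live subscriptions (30 is the rule of thumb quoted above), evicting the least recently active conversation when the cap is exceeded. The class and method names are hypothetical, and least-recently-used eviction is an assumption; the comment does not prescribe a policy.

```python
from collections import OrderedDict


class LiveSubscriptions:
    """Bounded set of live topic subscriptions; the inbox topic is never evicted."""

    def __init__(self, inbox_topic, max_live=30):
        if max_live < 2:
            raise ValueError("need room for the inbox plus one conversation")
        self.inbox_topic = inbox_topic
        self.max_live = max_live
        self._convos = OrderedDict()  # conversation topic -> None, oldest first

    def touch(self, convo_topic):
        """Mark a conversation as active (e.g. on local activity or when an
        inbox notification reactivates it). Returns the evicted topic, if any."""
        self._convos[convo_topic] = None
        self._convos.move_to_end(convo_topic)
        # The inbox subscription counts towards the budget.
        if len(self._convos) + 1 > self.max_live:
            evicted, _ = self._convos.popitem(last=False)
            return evicted
        return None

    def topics(self):
        """Topics the client should currently be subscribed to."""
        return [self.inbox_topic, *self._convos]
```

An evicted conversation becomes "stale": its peers would need to send a notification to the user's inbox address before the client calls `touch` on it again.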

Encryption: `NoiseKNfallback`
Encoding: protobuf3

## Invitations / Initialization

Inboxes do not require coordination with other clients to be initialized. Subscribing to messages is sufficient.

Clients ultimately need to notify contacts that the inbox exists in order to receive messages, however this is the responsibility of developers/contributors.

## Content Topic Usage

The content topic that is used is defined by `lower_hex(blake2s("/inbox/<inbox_address>"))`.
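
A minimal sketch of this derivation using Python's standard `hashlib`. The text does not state a string encoding or digest length, so UTF-8 and the default 32-byte BLAKE2s digest are assumptions here, and the function name is illustrative.

```python
import hashlib


def inbox_content_topic(inbox_address: str) -> str:
    """Compute lower_hex(blake2s("/inbox/<inbox_address>"))."""
    if len(inbox_address) < 1:
        raise ValueError("inbox_address must have length >= 1")
    # Assumptions: UTF-8 encoding, unkeyed BLAKE2s with a 32-byte digest.
    preimage = f"/inbox/{inbox_address}".encode("utf-8")
    return hashlib.blake2s(preimage).digest().hex()  # bytes.hex() is lowercase
```

Because the derivation is deterministic and unkeyed, anyone who can guess an `inbox_address` can recompute the same topic.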

The hash function does not provide privacy in this context, as an observer can always enumerate candidate inbox_addresses and unmask the topic. Because of this, it's recommended that inbox_addresses be kept secret if recipient privacy is desired.
Contributor

Can they enumerate when an inbox_address can be any non-empty string?


## Conversation Id

Messages sent to the inbox MUST use the conversation_id = `/convo/inbox/v1/<client_address>` where `client_address` is the defined address for the Identity you are trying to reach.
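
The format can be sketched as a pair of helpers. The format of `client_address` is not defined in this document, so it is treated as an opaque non-empty string; the helper names are hypothetical.

```python
from typing import Optional

INBOX_CONVO_PREFIX = "/convo/inbox/v1/"


def inbox_conversation_id(client_address: str) -> str:
    """Build the conversation_id for messages sent to an inbox."""
    if not client_address:
        raise ValueError("client_address must be non-empty")
    return INBOX_CONVO_PREFIX + client_address


def parse_inbox_conversation_id(conversation_id: str) -> Optional[str]:
    """Return the client_address, or None if this is not an inbox v1 id."""
    if not conversation_id.startswith(INBOX_CONVO_PREFIX):
        return None
    client_address = conversation_id[len(INBOX_CONVO_PREFIX):]
    return client_address or None
```

A receiving client could use the parsed `client_address` to drop messages intended for other users of a shared inbox.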


where is the conversation_id? in the payload? accessible before or after decryption?

Contributor Author

It's defined in the forthcoming Waku chat spec.

It's accessible before decryption, and is used to determine which encryption state to use for decryption.

Contributor Author

^ #73


## Accepted types
Inboxes are intended to receive frames and invites from other Conversation types. They are not intended to receive content or arbitrary frames, as not all clients would know how to decode these.
To maintain interoperability each ConversationType is required to define valid payloads for Inboxes.

Older clients may be unable to process newer messages.
Sending clients SHOULD determine what conversation types the receiving client supports, however a mechanism is not provided here.

## Encryption

All Frames sent to the Inbox MUST be encrypted to maintain message confidentiality.

This protocol uses a reversed variant of the [KN noise handshake](https://noiseexplorer.com/patterns/KN/) to secure inbound messages.

```noise
KNfallback:
<- e, s
...
-> e, ee, es
```

In this case the recipient provides both `s` and `e` out of band.

The handshake’s primary purpose is to provide sender confidentiality, with some forward secrecy.
The handshake is similar to a one way N handshake with a recipient side ephemeral key.

Note that this channel does not provide sender authentication, and should only be used to implement a confidential message delivery with some forward secrecy.
This tradeoff is intentional to maintain 0-RTT encryption. As this is an inbound pathway, further messages to establish mutual authentication with identity hiding would be wasteful.

### Ciphersuite

The noise handshake is implemented with the following functions:

**DH:** X25519
**cipher:** AEAD_CHACHA20_POLY1305
**hash:** BLAKE2s

The noise protocol name would then be `Noise_KNfallback_25519_ChaChaPoly_BLAKE2s`
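
To illustrate how the protocol name is used, the sketch below assembles it from the ciphersuite above and derives the initial handshake hash following the Noise specification's `InitializeSymmetric` rule (pad with zero bytes if the name fits in HASHLEN, otherwise hash it). The pattern name `KNfallback` is taken from this document as-is; the function names are illustrative.

```python
import hashlib

HASHLEN = 32  # BLAKE2s output size in bytes


def protocol_name(pattern: str, dh: str, cipher: str, hash_name: str) -> str:
    """Assemble a Noise protocol name from its four components."""
    return f"Noise_{pattern}_{dh}_{cipher}_{hash_name}"


def initial_handshake_hash(name: str) -> bytes:
    """Initial value of h (and ck) per Noise InitializeSymmetric."""
    raw = name.encode("ascii")
    if len(raw) <= HASHLEN:
        return raw.ljust(HASHLEN, b"\x00")  # zero-pad short names
    return hashlib.blake2s(raw).digest()    # hash names longer than HASHLEN


NAME = protocol_name("KNfallback", "25519", "ChaChaPoly", "BLAKE2s")
H0 = initial_handshake_hash(NAME)
```

The name is 41 bytes long, which exceeds HASHLEN, so in this case `h` is the BLAKE2s hash of the name rather than the padded name.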

This protocol opts for 32-bit platform-optimized variants (where possible) to reduce overhead in mobile and resource-constrained environments.

### Endianness

[TODO: The Noise Protocol specification recommends big-endian length fields - need to define whether this protocol will deviate]

## Framing
```mermaid
flowchart TD
WapEnvelopeV1 --> EncryptedPayload
EncryptedPayload --> D{Decrypt}
D --> InboxV1Frame
```

Contributor

Not sure I understand what WapEnvelopeV1 is? Perhaps TBD in future version of this spec or referencing some other spec?

### EncryptedPayload

The EncryptedPayload message is a self-describing wrapper for all encrypted payloads.
This message type makes no assumptions about the encryption used and allows new conversation types to use the same messaging framework.

As this protocol uses the KN noise handshake, the encoding wrapper uses the corresponding type.

## Wire Format Specification / Syntax

The wire format is specified using protocol buffers v3.

```protobuf
syntax = "proto3";

message InboxV1Frame {
  string recipient = 1;
  oneof frame_type {
    // ... supported invite types
  }
}

// Self-describing wrapper for all encrypted payloads.
message EncryptedPayload {

  oneof encryption {
    NoiseKN noise_KN = 1;
  }

  message NoiseKN {
    bytes encrypted_bytes = 1;
    bytes ephemeral_pubkey = 2;
  }
}

```

Contributor

I think what will be great to see is a MessageFlow section that describes the protocol in a step-by-step manner, being explicit about the purpose of INBOX, the prior assumptions, the actors involved in each step, the messages they send and their direction.

## Security/Privacy Considerations

### Sender Authentication

The encryption scheme used does not provide any sender authentication.
Messages sent over this pathway need to validate the sender before trusting any of the contents.

### EncryptedPayload metadata leakage

The EncryptedPayload wrapper itself is not encrypted, so its fields are visible to all observers.
Through analytical means, observers can determine the type of message being sent by looking at which fields are present and the relative size of the payload.
This is true regardless of whether the encrypted bytes are wrapped in an EncryptedPayload object.
Wrapping the payload allows for better support into the future without meaningfully changing the metadata leakage.
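
To make the leakage concrete, the sketch below hand-encodes an `EncryptedPayload` using the standard protobuf wire format (length-delimited fields only) and then reads back field numbers and lengths without any key material. It is an illustration under stated assumptions, not a replacement for a generated protobuf codec; the helper names are hypothetical and the payload bytes are placeholders.

```python
def varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def len_field(field_number: int, payload: bytes) -> bytes:
    """Encode one length-delimited field (wire type 2)."""
    return varint((field_number << 3) | 2) + varint(len(payload)) + payload


def encode_encrypted_payload(encrypted_bytes: bytes, ephemeral_pubkey: bytes) -> bytes:
    """EncryptedPayload { noise_KN = 1 { encrypted_bytes = 1, ephemeral_pubkey = 2 } }"""
    noise_kn = len_field(1, encrypted_bytes) + len_field(2, ephemeral_pubkey)
    return len_field(1, noise_kn)


def read_varint(buf: bytes, pos: int):
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def observe(buf: bytes):
    """What a passive observer sees: (field_number, payload) pairs.
    Assumes every field is length-delimited, as in the messages above."""
    fields, pos = [], 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        length, pos = read_varint(buf, pos)
        fields.append((tag >> 3, buf[pos:pos + length]))
        pos += length
    return fields


wire = encode_encrypted_payload(b"\x00" * 200, b"\x01" * 32)
outer = observe(wire)
inner = [(num, len(data)) for num, data in observe(outer[0][1])]
```

Here `inner` exposes the ciphertext length and the presence of a 32-byte ephemeral key, which is the kind of metadata described in this section.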

## Copyright

Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).


## References

A list of references.