Reading notes. #1

cheme · 2023-06-30T09:58:45Z

Very nice documentation (I wish I had it when reviewing the prs :)

I open this pr with my reading notes, since the doc is a bit long, something may be a bit off (in later part), but may indicate a reference may be good.
Also it is from my personal scope, so I focus on some points that I find relevant but may not be.

Only thing that may really need a double check: my reading of the clamping function in convention-and-definition (and subsequently in the actual code). I may just have misunderstood the spec though (or was especially tired yesterday evening).

zdave-parity · 2023-07-25T17:51:49Z

src/blockchain-runtime-interface.md

+
+  // TODO this implies that we need to manually register nodes: should be good to define how it is
+  // done (offchain worker putting a extrinsic): this needs to be known by implementors (even if offchain
+  should be called by those correctly there may be some quirks).


What bit of this do you think implies we need to manually register nodes?

As you probably saw on the other PR, Basti has asked me to change how the registration stuff is done... I'll push that update and a spec update soon. As the registrations will be more actively driven by the node, I'll need to say a bit more in this spec...

yes +1 for registration mechanism in the spec (if it is not already, I am not the best reader, just mentioning what was lacking for my comprehension, but sometime I just read badly).

src/conventions-and-definitions.md

zdave-parity · 2023-07-25T18:15:44Z

src/conventions-and-definitions.md

    )
+
+It targets a normal distribution using [ziggurat](https://www.doornik.com/research/ziggurat.pdf) method.
+Note that implementation can diverge here. (TODO meaning using a different random function would be hard to detect and we don't attempt to).


Implementations are not allowed to diverge here. These random numbers must match between implementations or the delays computed by message senders will not match the real delays. This is why I've given a number of examples. Of course a malicious implementation can do whatever it wants, but that goes beyond just using a different random number generator.

I don't really want to go into detail on how the random number generation works here; I figured just saying it must match specific versions of the rand crates would be good enough? The rand library authors promise to provide deterministic portable random number generation where possible; see here. If they change the algorithm in a new version and we are forced to use the new version, we may need to copy the old algorithm into the mixnet crate.

Implementations are not allowed to diverge here. These random numbers must match between implementations or the delays computed by message senders will not match the real delays.

If nothing prevent this, does it need to be part of the spec? maybe something like "is expected to".

Nothing forces a node to do anything that's written in this spec really. I don't think this bit is special in any way?

zdave-parity · 2023-07-25T18:18:52Z

src/cover-traffic.md

 all non-mixnodes in a session; see the [parameters chapter](./parameters.md). Note that the rates
 should be halved in some phases; see the [sessions chapter](./sessions.md#phases).
+
+TODO maybe be explicit if `Node` is `mixnet-node` or if it includes `non-mixnet-nodes`.


I can try rewording this. I thought it was clear enough that it meant both given that it mentions both mixnodes and non-mixnodes?

I am not sure what to do if I am no t a mixnet node, problably I should generate cover traffic, but if all non mixnet node generate cover traffic, then I think we got a problem (I am thinking of non mixnet node as an unbound resource but if we consider them as bounded by the number of accepted node by validator it can be ok, but then we got a problem of node blocking access to others).
So I was more thinking that the sentence did apply only for the mixnet node.
My understanding was that non mixnet node are allowed to do cover and should probably do if they are going to send a real message, but no rules could really apply for them.

All nodes that are connected to the mixnet are expected to generate cover traffic. Ideally we want some nodes that aren't intending to send messages to connect to the mixnet anyway, but we almost certainly don't want all nodes to connect to the mixnet as this would overload it. I don't really have a solution to this at the moment. Perhaps nodes not intending to send messages could connect with some probability determined by the runtime. This would probably be quite slow to react to changes in the total number of nodes, but if that number is fairly stable it could work.

For the initial version, simply enabling the mixnet by default for nodes run with --validator could work.

FWIW the cover traffic rate for non-mixnodes is currently 10x lower than for mixnodes. N mixnodes should comfortably support 10*N non-mixnodes.

Thanks, makes sense, seems important to have those numbers on doc (even if nobody is forced to emit cover messages it can give an idea of the possible behavior).

src/message-fragmentation.md

zdave-parity · 2023-07-25T18:25:14Z

src/message-fragmentation.md

 | 20..21   | `data_size`             | Little-endian number of bytes of message data in this fragment |
 | 22       | `num_surbs`             | Number of SURBs in this fragment                               |
 | 23..2047 | `payload`               | Message data and SURBs                                         |
+// TODO explicitely state if surbs are before or after message.


The paragraph below is explicit about this I think?

yes I did understand after reading the whole spec, (not the paragraph below but the later parts), having it stated before would have facilitate my reading, but really not needed indeed.

src/message-fragmentation.md

zdave-parity · 2023-07-25T18:30:30Z

src/message-fragmentation.md

+// TODO should common parameter like this be part of the runtime api?
 - Multiple fragments with the same `message_id` and `fragment_index` may be received. It is
  recommended that only one be kept.
+// TODO maybe also acceptable to drop all if they differs in content.


Hmm yes, though I'm not sure if it's worth adding the code to detect this?

having it in spec is free (just mentioning it is ok, even if not implemented).

zdave-parity · 2023-07-25T18:33:55Z

src/message-fragmentation.md

 - Multiple fragments with the same `message_id` and `fragment_index` may be received. It is
  recommended that only one be kept.
+// TODO maybe also acceptable to drop all if they differs in content.
+// Explicitelly this means that a packet can be send twice? Or we store only per message_id (in this


Yes, packets may be sent multiple times (although as noted never with the same SURBs). See the stuff about retransmission in requests-and-replies.md. The message ID is reused when retransmitting to increase the chance that we get a complete message (if we were only missing a single fragment the first time, we only care about this fragment making it the second time).

It think it should be stated somewhere explicitly (I mean it is in the spec, but could be good to get the info without reading the whole spec, like at the start of some part).
Sending packet multiple time is a choice of the user, but probably in some circumstances he do not want to do so (this leaks info).

src/message-fragmentation.md

zdave-parity · 2023-07-25T19:20:05Z

src/networking.md

 to the [mixnet topology](./topology.md). If no external addresses have been published for a
 mixnode, or none of them work, nodes should attempt to discover addresses using the libp2p DHT.

+// Here I am not sure, do we have the Dht? or if we are talking about discovery dht, we should name


I'm just talking about the libp2p peer ID --> address DHT. At least I thought that was a thing, and that Substrate would use it if you just asked to connect to a peer ID without providing any addresses? Maybe I'm wrong about this though.

I'm just talking about the libp2p peer ID --> address DHT

that is the one I am calling the discovery dht, I think it is a substrate specific one.

src/networking.md

src/requests-and-replies.md

src/sessions.md

zdave-parity · 2023-07-25T19:40:25Z

src/sphinx.md

 ## MAC verification

 `mac` should equal the BLAKE2b hash of `actions` computed with the key `mac_key`.
+// TODO is it the first 16 byte?


Not sure what you mean sorry? mac and actions are as defined in the table at the top, mac_key is derived as in the "Secret derivation" section.

mac is only 16 byte long, but blake2b output 32bytes.

BLAKE2b actually outputs 64 bytes. The MAC key (16 bytes), actions encryption key (32 bytes), and delay seed (16 bytes) are generated by a single hash:

mac_key . actions_encryption_key . delay_seed = blake2b("sphinx-small-d-s", 0, kx_shared_secret)

Would make sense to indicate it is only the first 16 bytes just to lift any misunderstanding.

zdave-parity · 2023-07-25T19:41:19Z

src/sphinx.md


 | Action   | Description                                | Additional data              |
 |----------|--------------------------------------------|------------------------------|
+// TODO a bit unclear why we would have a Mac when there is on define outside in packet structure.


Hmm yeah. This is the MAC for the next hop, so maybe I could say "MAC for next hop" or something.

i would find "next hop mac" good.

zdave-parity · 2023-07-25T19:42:23Z

src/sphinx.md


 The artificial delay should be calculated as `exp_random(delay_seed) * mean_delay`, where
 `mean_delay` is the mean forwarding delay (see the [parameters chapter](./parameters.md)).
+// TODO state if something prevent not following this rule.


Not sure what you mean -- it should always be possible to do this?

actually "should" means that nothing enforce this.
maybe we can use uper case SHOULD that indicate expected behavior that implementors should follow but that nothing enforce.
oposed to a MUST that should make the node appears like a bad actor.

I've used "may" for optional behaviour. I'm not particularly convinced that distinguishing between must and should would be that useful. At some point there will hopefully be an incentivisation scheme, at which point such a distinction might make sense.

Putting all MAY SHOULD MUST uppercase could make change of behavior (eg when incentivisation required or defined) easier and it is pretty common in specs I think.

src/sphinx.md

zdave-parity · 2023-07-25T19:50:07Z

src/topology.md

 - From the node to itself. For loop cover traffic.

+// TODO here make it explicit if "non-mixnode" are "node".
+// if so also it implies there is no mixnode to mixnode or cover between mixnode (this should maybe


Not sure why this is implied? Mixnodes should generate cover traffic in pretty much the same way as non-mixnodes, they just don't need to worry about the gateway stuff.

as in a previous reply, I really think that non mixnode cannot generate cover traffic as mixnode (exept if their number is bounded, but then we need to define how their priority queue works).

zdave-parity · 2023-07-25T19:54:15Z

src/topology.md

  a non-mixnode is generating a loop route, if there is only a single connected gateway mixnode,
  the gateway must necessarily appear twice.
+  // TODO in this case I would find it better to return an error on route generation and put back
+  // the mixnet in a synching state.


I didn't think reusing a gateway node was a particularly big deal. Allowing it means that non-mixnodes only need to successfully connect to a single mixnode for things to work.

I mean if there is not enough mixnode to do a N hop route, then the N hop security asumption does not hold, it become N-(number of more than one peer access) security, but when sending the query we only did accept sending for a N security, it would need a message like "not enough node in network, do you still want to send your message Y/N"?

I mean my issue here is "unless this is unavoidable" part of the sentence.

The runtime enforces a minimum bound on the number of mixnodes; if there are too few registered, the get-mixnodes function will simply return an error. There are only really two cases where you should end up with a route containing the same mixnode twice:

If you are a mixnode and you're generating a loop route (obviously you will be in the route twice).

If you are not a mixnode, you're generating a loop route, and there is only one connected gateway mixnode (this is the example given, the gateway mixnode needs to appear twice).

I figured the simplest way to word this was "unless this is unavoidable". But maybe this is misleading.

I totally misunderstood the sentence here, I like the detail explanation of case better.
"except for cover loop where first mixnet node is the same as last mixnet node"
(when reading unavoidable I did think of a contextual temporary reason)

If you are not a mixnode, you're generating a loop route, and there is only one connected gateway mixnode (this is the example given, the gateway mixnode needs to appear twice).

When reading I was thinking of route as either the query route or the surb route, so this case is not that relevant to me.
Actually do we consider that surb should not use mixnode from the query route (I think not)?

zdave-parity · 2023-07-25T20:00:54Z

src/topology.md

+  // the mixnet in a synching state.
+  // and state that this rule cannot be check by relaying node (only when creating route).
 - No node should ever appear twice consecutively in a route.
+  // TODO maybe state that this rule can be enforced by relaying node by simply dropping the packet


These rules are really just to avoid poor anonymity of particular packets due to poor route choice. I don't think there's any need for mixnodes to attempt to enforce them; if a node wants to compromise the anonymity of the packets it sends then there are many ways for it to do so.

FWIW this rule is actually almost redundant given the "no mixnode should appear in a route more than once" rule.

FWIW this rule is actually almost redundant given the "no mixnode should appear in a route more than once" rule.

it is (but the "unless it is unavoidable" makes things awkward i think)

notes

90a8ca3