Apply review fixes to architecture docs

rsantacroce · rsantacroce · commit 79fd214606cb · 2026-05-17T18:54:15.000+02:00
Reviewer-confirmed corrections from evoskuil across 12 docs:

- 02: milestone allows validation bypass (not chain-fixing); chaser_block skips milestones because blocks-first has no PoW DoS guard; debug-only checks are !NDEBUG; LRU eviction on tree_ would create a new DoS vector.
- 03: get_inventory_size gates on candidate-chain currentness, with the weak-chain rationale (not "wait until caught up").
- 04: consensus is split across headers, block-receive, this chaser, and confirm; intro softened from "single source of consensus acceptance".
- 05: !NDEBUG polarity fix; expanded block_confirmable to describe strong-tx association, maturity, and relative-locktime rules.
- 06: session-template class diagram now shows all three instantiations; recent != current (max-height config for testing).
- 08: superseded_ is atomic because superseded() is protected and read non-stranded from the base.
- 09: unhandled channel messages are ignored, not protocol_violation.
- 10: sub1/add1 was a real off-by-one bug, fixed in PR libbitcoin#1007.
- 11: order-discipline is the same as headers-first; BIP130 typo fix.
- 12: chaser_storage timer runs on the chaser's strand (network threadpool), not a separate pool.
- 00, README: roll-up updates.
diff --git a/docs/architecture/00-overview.md b/docs/architecture/00-overview.md
@@ -155,9 +155,12 @@ its *own* mutations while still running in parallel with the other chasers
 on its own strand … allowing concurrent chaser operations to the extent that
 threads are available"*).
 
-This is the **central source of parallelism** in the node. The chasers form
-a pipeline; each stage runs on its own strand and they communicate by
-publishing events.
+This is one of the two main axes of parallelism in the node. The chasers
+form a pipeline; each stage runs on its own strand and they communicate
+by publishing events. The other axis, equally important, is **per-channel
+strands**: every peer connection also runs on its own strand. Peers and
+chasers therefore execute concurrently with each other, bounded only by
+the shared threadpool size.
 
 ---
 
diff --git a/docs/architecture/02-chaser-organize.md b/docs/architecture/02-chaser-organize.md
@@ -469,8 +469,13 @@ Exit paths:
 
 ### 6.2 Header milestone tracking (`chaser_header` only)
 
-A *milestone* is a configured `(hash, height)` pair that fixes the chain.
-Functionally similar to a checkpoint but mutable per node settings.
+A *milestone* is a configured `(hash, height)` pair. Unlike a
+checkpoint, a milestone does **not** fix the chain — the node can
+still reorganise around it. What a milestone *does* is allow the
+**bypass of validation** of all blocks up to the milestone height,
+*if* the milestone is found in the active candidate chain. So a
+milestone is an operational optimisation gated on the node's own
+configuration, not a consensus commitment.
 
 State: `active_milestone_height_` is the height of the *most recent
 milestone observed on the current candidate*. Initialised by
@@ -493,8 +498,17 @@ and:
 > the candidate is reorganized below the milestone (a rare event), and
 > only via `update_milestone`.
 
-`chaser_block` skips milestones entirely. The full block already carries
-enough state to validate without the heuristic.
+`chaser_block` skips milestones entirely, for a different reason than
+"the full block carries enough state". The blocks-first design has no
+PoW guard before archival: a peer can flood the node with full
+blocks, and without an upstream headers-first chain to gate which
+blocks are *worth* the work, the node has no cheap way to refuse
+malicious blocks short of validating them. (One could imagine running
+headers-first internally by downloading every block and stripping its
+txs — but that is prohibitively expensive and redundant with running
+headers-first directly.) So in blocks-first mode, **every block must
+be validated before archival**, full stop; bypass-on-milestone would
+defeat the only DoS guard the mode has.
 
 ---
 
@@ -523,8 +537,8 @@ store-corruption error). For a formal model, each is a proof obligation:
 | `organize13`       | `ipp:428-432`                       | disorganize: `get_candidate_chain_state(fork_point)` after rebuild returned null                                                   |
 | `organize14`       | `ipp:515` (in `push_block`)         | `set_organized` failed after a successful `set_code` (we archived but couldn't push to candidate)                                  |
 | `organize15`       | `ipp:521-523` (in `push_block(key)`)| Tree extract returned no handle (item was missing when expected)                                                                   |
-| `stalled_channel`  | `ipp:471-475` (in `set_organized`, NDEBUG-only check) | Candidate height isn't `top+1` (broken sequencing)                                                                    |
-| `suspended_channel`| `ipp:477-483` (in `set_organized`, NDEBUG-only check) | Parent of new candidate isn't current top (broken sequencing)                                                         |
+| `stalled_channel`  | `ipp:471-475` (in `set_organized`, debug-only `!NDEBUG` check) | Candidate height isn't `top+1` (broken sequencing). Redundant safety check; release builds skip it.          |
+| `suspended_channel`| `ipp:477-483` (in `set_organized`, debug-only `!NDEBUG` check) | Parent of new candidate isn't current top (broken sequencing). Redundant safety check; release builds skip it. |
 
 > **Spec obligation list.** A formal model should be able to discharge
 > `organize2` through `organize15` as unreachable, given:
@@ -570,8 +584,11 @@ store-corruption error). For a formal model, each is a proof obligation:
   factoring in §4.1.
 
 - `tree_` is naturally a `hash-table` keyed by header hash. The DoS
-  concern (`§6.1 TODO`) can be enforced by a size cap with a
-  least-recently-used eviction.
+  concern flagged at `§6.1 TODO` is real but the obvious mitigation
+  (an LRU eviction cap) is **not** the right answer — that would open
+  a new DoS vector where an attacker forces eviction of legitimate
+  weak branches. The right fix is more subtle and not solved here;
+  treat the unbounded-tree assumption as load-bearing.
 
 - `update_milestone` walks `tree_` by parent-hash chain — straightforward
   recursion.
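Note on the `02` milestone changes (illustration only, not part of the patch): the difference between the two organizers can be sketched as a pair of predicates. All names below are hypothetical. The sketch assumes "up to the milestone height" is inclusive and models "no milestone observed on the candidate" as an empty optional; neither detail is confirmed by the hunks above.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical model: height of the most recent milestone observed on the
// current candidate chain, or std::nullopt if none has been observed.
using milestone_height = std::optional<std::size_t>;

// Headers-first: validation may be bypassed at or below an active milestone
// (inclusive bound is an assumption of this sketch).
inline bool bypass_headers_first(std::size_t height,
    const milestone_height& active)
{
    return active.has_value() && height <= *active;
}

// Blocks-first: never bypass, because validation is the only DoS guard
// the mode has before archival.
inline bool bypass_blocks_first(std::size_t, const milestone_height&)
{
    return false;
}
```

With an active milestone at height 500, `bypass_headers_first(400, 500)` is true while `bypass_blocks_first(400, 500)` stays false, which is the asymmetry the revised text describes.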
diff --git a/docs/architecture/03-chaser-check.md b/docs/architecture/03-chaser-check.md
@@ -330,10 +330,19 @@ generation. It:
 
 `get_inventory_size` (`chaser_check.cpp:534-543`):
 
-- Returns 0 if no connections OR if the *confirmed* chain isn't current
-  (so no inventory work issues until the node is reasonably caught up).
+- Returns 0 if no connections OR if `is_current(false)` — i.e. the
+  **candidate** (header) chain isn't current.
 - Otherwise: `ceilinged_divide(unassociated_count_above(fork, step), connections)`.
 
+The candidate-current gate is *not* "wait until caught up" — issuing
+zero work until the node is caught up would just stall (you can't get
+caught up without downloading). Instead it guards against dividing
+work over a *weak* header chain: until headers are current, the
+partitioning of unassociated heights into per-peer inventory chunks
+could be meaningless or wrong, so block download is paused until
+header sync has produced a current (and therefore presumed
+near-canonical) candidate.
+
 > **Invariant (Check-Inventory-1).** `inventory_` is computed at most
 > once (latch on first nonzero result). Stored in
 > `chaser_check.cpp:496-498`. This is intentional: peer inventory size is
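Note on the `03` change (illustration only, not part of the patch): the gate and the division described for `get_inventory_size` can be sketched as below. The function names and parameters are hypothetical stand-ins; `candidate_current` models the result of `is_current(false)` and `unassociated` models `unassociated_count_above(fork, step)`.

```cpp
#include <cassert>
#include <cstddef>

// Integer division rounding up. A local stand-in for illustration,
// not the library helper itself.
inline std::size_t ceil_div(std::size_t dividend, std::size_t divisor)
{
    assert(divisor != 0);
    return dividend / divisor + (dividend % divisor != 0 ? 1 : 0);
}

// Hypothetical model of the per-peer inventory size computation.
inline std::size_t inventory_size_model(std::size_t connections,
    bool candidate_current, std::size_t unassociated)
{
    // No peers, or a candidate (header) chain that is not current:
    // issue no work rather than partition a possibly weak chain.
    if (connections == 0 || !candidate_current)
        return 0;

    return ceil_div(unassociated, connections);
}
```

For example, 100 unassociated heights over 8 connections yields 13 per peer, and the same inputs with a non-current candidate yield 0.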
diff --git a/docs/architecture/04-chaser-validate.md b/docs/architecture/04-chaser-validate.md
@@ -17,9 +17,28 @@
 > - it emits `chase::valid(height)` on success or `chase::unvalid(link)`
 >   on failure
 >
-> This is the chaser most directly relevant to **formal verification**: it
-> is the single source of consensus acceptance, and the only place script
-> execution and UTXO availability are checked before confirmation.
+> This is the chaser most directly relevant to **formal verification**:
+> it is where script execution and prevout (UTXO availability) checks
+> happen before confirmation. Consensus is **not** confined to this
+> chaser, however; it is split across several stages:
+>
+> - **Headers** (`chaser_header::validate`) — context-free header
+>   consensus (proof-of-work, version, etc.) plus limited-context checks.
+> - **Blocks at receive time** (`protocol_block_in_31800::check`,
+>   `chaser_block::validate` in blocks-first mode) — context-free
+>   block-level consensus (size, sigops, commitments) and limited
+>   context checks (no prevouts yet).
+> - **This chaser** — full block consensus with prevouts populated:
+>   `block.accept(ctx, …)` (block-wide rules) and `block.connect(ctx)`
+>   (script execution per input).
+> - **`chaser_confirm`** — block-relative order-based consensus checks
+>   via `query.block_confirmable(link)`: previous-output is confirmed in
+>   a "strong" tx, maturity, relative-locktime rules (see
+>   [`05 §10.4`](05-chaser-confirm.md#104-the-utxo-oracle)).
+> - **Transactions** (planned) — same shape, not yet implemented.
+>
+> So this chaser is the **largest single block of consensus work** and
+> the natural focal point of a formal model, but not the sole source.
 
 | File                                                                  | Role                                                                                  |
 | --------------------------------------------------------------------- | ------------------------------------------------------------------------------------- |
diff --git a/docs/architecture/05-chaser-confirm.md b/docs/architecture/05-chaser-confirm.md
@@ -442,8 +442,8 @@ All terminal (call `fault`, suspend network).
 | `confirm10`           | `chaser_confirm.cpp:298`              | `roll_back` failed                                                                                              |
 | `confirm11`           | `chaser_confirm.cpp:308`              | `set_filter_head` failed before `set_block_confirmable`                                                         |
 | `confirm12`           | `chaser_confirm.cpp:314`              | `set_block_confirmable` failed                                                                                  |
-| `suspended_channel`   | `chaser_confirm.cpp:379` (NDEBUG-only)| `confirmed_height != top+1` in `set_organized` — sequencing bug                                                  |
-| `suspended_service`   | `chaser_confirm.cpp:387` (NDEBUG-only)| `to_parent(link) != to_confirmed(previous_height)` — parent mismatch                                            |
+| `suspended_channel`   | `chaser_confirm.cpp:379` (debug-only check, `!NDEBUG`) | `confirmed_height != top+1` in `set_organized` — sequencing bug. Redundant safety check; no effect in release builds. |
+| `suspended_service`   | `chaser_confirm.cpp:387` (debug-only check, `!NDEBUG`) | `to_parent(link) != to_confirmed(previous_height)` — parent mismatch. Redundant safety check; no effect in release builds. |
 
 > **Spec obligation list.** As with organize/validate, every `confirmN`
 > is unreachable under store-consistency invariants plus the strand
@@ -528,23 +528,55 @@ chaser_confirm : Process
   no stall when `chase::valid` arrived during the in-progress
   iteration.
 
-### 10.4 The UTXO oracle
+### 10.4 What `query.block_confirmable(link)` actually checks
 
-`query.block_confirmable(link)` is the UTXO double-spend check. Its
-correctness is the responsibility of libbitcoin-database. For a formal
-model, treat it as:
+This is a **significant consensus operation**, not a narrow
+double-spend probe. It evaluates *all block-relative, order-based
+consensus constraints* on a block — everything except header chaining
+and the chain-summation rules (cumulative work, MTP). Specifically:
+
+1. **Strong-tx association for every spent prevout.** For every input,
+   the previous output must be in a *strong* transaction — i.e. a tx
+   that is associated to a block which is either (a) confirmed, or (b)
+   in the confirmable candidate fork at *lesser* height than the
+   spending block. This is the property that subsumes the "is the
+   prevout in the UTXO set?" question, expressed in a tx→block
+   association model rather than a separate UTXO snapshot.
+
+2. **Coinbase maturity** (the 100-confirmation rule for spending
+   coinbase outputs).
+
+3. **Relative locktime rules** (BIP68 sequence locks).
+
+These checks also exist on the `system::chain` objects (for
+completeness, e.g. for stand-alone validation tools), but driving them
+there requires populating each input's metadata first. The store
+optimizes by performing the queries that *would have populated that
+metadata*, directly — much more efficient than populate-then-check.
+
+So `block_confirmable` is correctly read as: *"under the assumption
+that all lower-height blocks in this fork are confirmable, is this
+block consensus-valid against all order-sensitive rules?"* Its
+correctness is the joint responsibility of the libbitcoin-database
+query implementation and the consensus rules it encodes; a formal
+model should treat the call as a non-trivial proof obligation, not a
+thin UTXO oracle.
+
+For a model:
 
 ```
-block_confirmable(link, store_state) →
-    Right(())               if every input refers to a UTXO present in store_state,
-                            and double-spend checks pass
+block_confirmable(link, fork, store_state) →
+    Right(())               if for every input in block(link):
+                              - prevout tx is strong under (store_state ∪ fork[< height])
+                              - coinbase maturity satisfied
+                              - relative-locktime constraints satisfied
+                              - no double-spends within fork[≤ height]
     Left(error_code)        otherwise
 ```
 
-The chaser sequences calls so that `store_state` at the moment of
-`block_confirmable(link)` reflects all blocks confirmed below `link`'s
-height in this fork (because `set_block_confirmable` for prior heights
-has already run by the loop ordering at `chaser_confirm.cpp:230-275`).
+The chaser sequences calls so that prior fork blocks have already had
+`set_block_confirmable` written by the time `block_confirmable(link)`
+runs for this block (loop ordering at `chaser_confirm.cpp:230-275`).
 
 ---
 
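Note on the `05` change (illustration only, not part of the patch): the revised model for `block_confirmable` can be approximated by a toy predicate over pre-computed per-input facts. Every name here is hypothetical. The sketch assumes the caller has already decided whether each prevout is strong, handles only height-based relative locks (time-based locks are omitted), and ignores same-block spends.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Toy per-input facts; hypothetical, not the store's real schema.
struct input_model
{
    bool prevout_strong;          // prevout tx is confirmed or lower in the fork
    bool prevout_coinbase;        // prevout belongs to a coinbase tx
    std::size_t prevout_height;   // height of the block holding the prevout
    std::size_t relative_blocks;  // height-based relative lock (0 = none)
    bool double_spent;            // prevout already spent within the fork
};

constexpr std::size_t coinbase_maturity = 100;

// Returns std::nullopt when confirmable, otherwise a reason string.
inline std::optional<std::string> block_confirmable_model(
    const std::vector<input_model>& inputs, std::size_t height)
{
    for (const auto& in : inputs)
    {
        // The height comparison also guards the subtraction below.
        if (!in.prevout_strong || in.prevout_height > height)
            return "prevout not strong";

        const std::size_t depth = height - in.prevout_height;

        if (in.prevout_coinbase && depth < coinbase_maturity)
            return "immature coinbase spend";

        if (depth < in.relative_blocks)
            return "relative locktime not satisfied";

        if (in.double_spent)
            return "double spend";
    }

    return std::nullopt;
}
```

A coinbase output created at height 10 fails this model at height 109 and passes at height 110, matching the 100-block maturity depth.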
diff --git a/docs/architecture/06-sessions-and-protocols.md b/docs/architecture/06-sessions-and-protocols.md
@@ -39,6 +39,17 @@ to protocols.
 
 ### 1.1 Session hierarchy
 
+`session_peer<NetworkSession>` is a class template whose
+`NetworkSession` parameter is instantiated separately for each
+concrete session: `network::session_outbound`,
+`network::session_inbound`, and `network::session_manual`. The
+template inherits *from its parameter* (the network base) and *also*
+from `node::session` (the mixin) — so each instantiation produces a
+different concrete network-base parent. The class diagram below shows
+the outbound instantiation explicitly; the inbound and manual
+instantiations are structurally identical, parameterised on their
+respective `network::session_*` base.
+
 ```mermaid
 classDiagram
     class network_session["network::session"] {
@@ -58,11 +69,13 @@ classDiagram
     class network_session_outbound["network::session_outbound"]
     class network_session_inbound["network::session_inbound"]
     class network_session_manual["network::session_manual"]
-    class session_peer["session_peer&lt;NetworkSession&gt; (template)"] {
+    class session_peer_out["session_peer&lt;network::session_outbound&gt;"] {
         +create_channel (override)
         +attach_handshake (override)
         +attach_protocols (override)
     }
+    class session_peer_in["session_peer&lt;network::session_inbound&gt;"]
+    class session_peer_man["session_peer&lt;network::session_manual&gt;"]
     class session_outbound
     class session_inbound {
         +enabled() override
@@ -72,14 +85,21 @@ classDiagram
     network_session <|-- network_session_outbound
     network_session <|-- network_session_inbound
     network_session <|-- network_session_manual
-    network_session_outbound <|-- session_peer
-    node_session <|-- session_peer
-    session_peer <|-- session_outbound
-    session_peer <|-- session_inbound
-    session_peer <|-- session_manual
-
-    note for session_peer "Multiply derived:\n• node::session for chaser/bus access\n• network::session_* for socket lifecycle"
-    note for node_session "All methods forward to full_node"
+
+    network_session_outbound <|-- session_peer_out
+    network_session_inbound  <|-- session_peer_in
+    network_session_manual   <|-- session_peer_man
+
+    node_session <|-- session_peer_out
+    node_session <|-- session_peer_in
+    node_session <|-- session_peer_man
+
+    session_peer_out <|-- session_outbound
+    session_peer_in  <|-- session_inbound
+    session_peer_man <|-- session_manual
+
+    note for session_peer_out "Three template instantiations,\nidentical except for which\nnetwork::session_* is the\nnetwork-side base."
+    note for node_session "Mixin: all methods\nforward to full_node."
 ```
 
 The `node::session` mixin (`src/sessions/session.cpp:35-160`) is **pure
@@ -115,10 +135,16 @@ bool session_inbound::enabled() const NOEXCEPT
 > **Invariant (Session-Inbound-1).** Inbound connection attempts are
 > rejected (the network layer disables the listener) until either
 > `delay_inbound == false` *or* the confirmed chain is "recent". The
-> definition of "recent" is the same as `full_node::is_recent` — top
-> equals configured max height *or* top timestamp is within the
-> `currency_window` (`src/full_node.cpp:415-425`). This prevents a
-> not-yet-caught-up node from serving stale data.
+> definition of "recent" is `full_node::is_recent` — top equals
+> configured max height *or* top timestamp is within the
+> `currency_window` (`src/full_node.cpp:415-425`). Note that **recent
+> is weaker than current**: "recent" considers the configured
+> `node.maximum_height`, so a node deliberately limited to a fixed
+> height (typically for testing) can activate inbound service at that
+> ceiling even though it would never satisfy the time-based
+> "currentness" test. This prevents a not-yet-caught-up node from
+> serving stale data in normal deployments while still allowing
+> bounded-height test deployments.
 
 This is implemented via `enabled()` rather than the bus
 `suspend`/`resume` mechanism so that the listener has independent
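Note on the `06` change (illustration only, not part of the patch): the "recent is weaker than current" distinction can be sketched with two predicates. The names and the struct are hypothetical. The sketch assumes "current" is purely the time-window test and that `maximum_height` is a real configured ceiling; how an unconfigured maximum is represented is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical snapshot of the confirmed chain top.
struct top_state
{
    std::size_t height;
    std::uint32_t timestamp;  // seconds
};

// Time-based test: the top timestamp lies within `window` seconds of `now`.
inline bool is_current_model(const top_state& top, std::uint32_t now,
    std::uint32_t window)
{
    return top.timestamp >= now || (now - top.timestamp) <= window;
}

// Recent: either the configured height ceiling has been reached, or the
// time-based test passes. Every current top is therefore also recent.
inline bool is_recent_model(const top_state& top, std::size_t maximum_height,
    std::uint32_t now, std::uint32_t window)
{
    return top.height == maximum_height || is_current_model(top, now, window);
}
```

A node capped at height 1000 with an old top timestamp is recent but not current in this model, which is the bounded-height test deployment case the invariant describes.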
diff --git a/docs/architecture/08-block-out-protocols.md b/docs/architecture/08-block-out-protocols.md
@@ -324,14 +324,15 @@ When `superseded_` flips true:
 
 ### 3.3 Why `superseded_` is `std::atomic_bool`
 
-`superseded_` is written in `handle_receive_send_headers` (channel
-strand) and read in `handle_event` (bus subscriber's strand, which
-posts back to channel strand for actual processing). In practice both
-are the channel strand — see
-[`06 §3.1`](06-sessions-and-protocols.md#31-event-subscription-protocol)
-on subscription posting back to channel strand. The atomic is
-defensive; a non-atomic `bool` would likely be sound, but the cost is
-negligible.
+`protocol_block_out_70012::superseded()` is **`protected`**, so the
+derived class (`protocol_block_out_70012`) exposes read access to
+its own base (`protocol_block_out_106::handle_event`) which uses it
+as the supersede gate. The base reads `superseded()` from its bus
+handler context; the derived class writes `superseded_` from its
+`handle_receive_send_headers` channel handler. Making the flag
+atomic allows that read to happen **without** posting through the
+channel strand — i.e. the gate is non-stranded by design, and the
+atomic is what makes that safe.
 
 ---
 
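Note on the `08` change (illustration only, not part of the patch): the pattern of a base class reading a derived-owned flag without going through a strand can be sketched with `std::atomic_bool`. The class names are hypothetical stand-ins for the two protocol classes. Making `superseded()` a virtual with a `false` default in the base is an assumption of this sketch, not something the hunk confirms.

```cpp
#include <atomic>
#include <cassert>

// Stand-in for the base protocol. should_announce() models the gate
// checked from the bus handler context.
class block_out_base
{
public:
    virtual ~block_out_base() = default;

    // Reads the gate directly; no post to the channel strand is needed
    // because the derived flag is atomic.
    bool should_announce() const noexcept { return !superseded(); }

protected:
    // Assumed virtual hook; the base alone is never superseded.
    virtual bool superseded() const noexcept { return false; }
};

// Stand-in for the derived protocol, which owns and writes the flag.
class block_out_headers : public block_out_base
{
public:
    // Models the channel handler that flips the flag.
    void handle_receive_send_headers() noexcept { superseded_.store(true); }

protected:
    bool superseded() const noexcept override { return superseded_.load(); }

private:
    std::atomic_bool superseded_{ false };
};
```

With a plain `bool` the same read from another thread would be a data race; the atomic is what allows the write and the read to occur on different execution contexts.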
diff --git a/docs/architecture/09-filter-out-70015.md b/docs/architecture/09-filter-out-70015.md
@@ -282,14 +282,19 @@ The pattern exists because of two constraints:
    create unbounded queueing.
 
 Unsubscribing for the duration of the stream **serializes** requests
-without explicit locks: a peer's second `get_client_filters` arrives
-when there is no handler for it, which the libbitcoin-network channel
-treats as a *protocol violation* and drops the peer.
-
-> **Invariant (Filter-Stream-4).** A peer that issues a second
-> `get_client_filters` before the first completes will be dropped by
-> the channel layer (not by this protocol). This effectively makes
-> `getcfilters` request/response exclusive per channel.
+without explicit locks. A peer's second `get_client_filters` arriving
+while the first is in flight has no handler registered; the
+libbitcoin-network channel currently **ignores** unhandled messages
+(this may be tightened in future). The serializing effect therefore
+comes from the protocol simply not seeing the second request until it
+re-subscribes — not from peer drops.
+
+> **Invariant (Filter-Stream-4).** While streaming, any additional
+> `get_client_filters` from this peer is dropped on the floor (no
+> handler). The first request completes; only after re-subscription
+> can another arrive. `getcfilters` is therefore *effectively*
+> serialized per channel, even though no explicit lock is held and
+> no peer-drop policy enforces it.
 
 ---
 
diff --git a/docs/architecture/10-tx-protocols.md b/docs/architecture/10-tx-protocols.md
diff --git a/docs/architecture/11-protocol-block-in-106.md b/docs/architecture/11-protocol-block-in-106.md
diff --git a/docs/architecture/12-periphery-chasers.md b/docs/architecture/12-periphery-chasers.md
diff --git a/docs/architecture/README.md b/docs/architecture/README.md