Mesh Module

This directory contains the heart of Sneaker++ — the Node implementation, peer management, and peer exchange. If the transport layer is the postal truck, the mesh layer is the dispatch center that decides which trucks go where, tracks every driver's performance, and kicks out the ones who keep losing packages.


Table of Contents

| Section | Description |
|---|---|
| How It Works (ELI5) | The mesh layer in plain English |
| NodeImpl | The pimpl target behind the public Node class |
| PeerManager | Per-peer state, scoring, eviction, banning |
| EventLoopPool | Per-peer event loop thread pool |
| ThreadPool | Worker thread pool for async tasks |
| Peer Exchange | PEX: sharing peer lists between connected nodes |

How It Works (ELI5)

Imagine you're running a neighborhood watch network of walkie-talkies. The mesh module is the person at the central desk who:

  1. Keeps a roster of every active radio operator (peer manager)
  2. Rates their reliability — do they relay messages? do they respond to pings? do they try to replay old messages? (scoring)
  3. Kicks out bad actors — if someone's score drops too low, they get disconnected and temporarily banned (eviction + banning)
  4. Introduces neighbors — "hey, you should talk to Bob at 192.168.1.5, he's reliable" (peer exchange)
  ┌─────────────────────────────────────────────────┐
  │                   NodeImpl                       │
  │                                                  │
  │  ┌──────────────────────┐  ┌──────────────────┐  │
  │  │  PeerEventLoopPool   │  │  PeerManager     │  │
  │  │  (peers per thread)  │  │  - PeerContext[] │  │
  │  │  ┌────────────────┐  │  │  - scoring       │  │
  │  │  │ Event Loop 0   │  │  │  - eviction      │  │
  │  │  │  peer A, B     │  │  │  - banning       │  │
  │  │  ├────────────────┤  │  └──────────────────┘  │
  │  │  │ Event Loop 1   │  │                        │
  │  │  │  peer C, D     │  │  ┌──────────────────┐  │
  │  │  ├────────────────┤  │  │  PeerTransport   │  │
  │  │  │ Event Loop N   │  │  │  (per peer)      │  │
  │  │  │  peer ...      │  │  │  - streams       │  │
  │  │  └────────────────┘  │  │  - congestion    │  │
  │  └──────────────────────┘  │  - loss detect   │  │
  │                            │  - ACK state     │  │
  │  ┌────────────┐            └──────────────────┘  │
  │  │ Listen     │                                  │
  │  │ Thread     │            ┌──────────────────┐  │
  │  └────────────┘            │  ThreadPool      │  │
  │                            │  (worker tasks)  │  │
  │  ┌────────────┐            └──────────────────┘  │
  │  │ Timer      │                                  │
  │  │ Thread     │                                  │
  │  └────────────┘                                  │
  │                                                  │
  │  ┌────────────┐                                  │
  │  │ Discovery  │                                  │
  │  │ Thread     │                                  │
  │  └────────────┘                                  │
  │                                                  │
  │  ┌────────────┐                                  │
  │  │ Callback   │                                  │
  │  │ Thread     │                                  │
  │  └────────────┘                                  │
  └─────────────────────────────────────────────────┘

NodeImpl

Header: node.hpp Namespace: sneaker::mesh

The pimpl implementation behind the public sneaker::Node class. Owns all internal state and manages the node's lifecycle threads.

Threads

| Thread | Function | Role |
|---|---|---|
| Event Loop Pool | PeerEventLoopPool | Per-peer event loops: receive, decrypt, build/send packets, handle transport. Peers are assigned to the least-loaded thread. |
| Listen | listen_loop() | Accepts inbound packets on the listen socket and routes them to the owning event loop thread |
| Timer | timer_loop() | Periodic tasks: ping, rekey, chaff, prune, evaluate peers, PTO timeouts |
| Discovery | discovery_loop() | Bootstrap, PEX requests, connectivity checks |
| Callback | callback_loop() | Fires user callbacks (dedicated thread, batched queue drain) |

IO Performance

The IO architecture uses per-peer event loops for high throughput:

  • Pooled per-peer event loops: PeerEventLoopPool assigns each peer to exactly one event loop thread, so a thread may serve several peers. Each thread runs epoll/kqueue (POSIX) or IOCP (Windows) and processes only its assigned peers.
  • Connected UDP sockets: Each peer gets a connected UDP socket for efficient send/recv without per-packet address lookup.
  • Inline packet processing: Received packets are decrypted and processed inline on the event loop thread. Messages are delivered directly via callback (zero-copy path).
  • PeerTransport-driven send: Each event loop builds packets via PeerTransport::build_packet(), encrypts, and sends. BBR pacing naturally throttles output.
  • Batch send: Collects packets and sends via udp_send_batch() (Linux uses sendmmsg() for a single syscall).
  • Callback batching: The callback thread swaps the entire queue under lock and fires all events outside the lock.
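
The callback batching described in the last bullet can be sketched as follows. This is a minimal illustration, not the project's code: `BatchedCallbackQueue` and its members are hypothetical names, and events are modeled as plain `std::function<void()>` rather than the real CallbackEvent type.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for the callback queue; names are illustrative only.
class BatchedCallbackQueue {
public:
    using Event = std::function<void()>;

    // Producers (event loop threads) enqueue under the lock.
    void push(Event ev) {
        std::lock_guard<std::mutex> lock(mu_);
        pending_.push_back(std::move(ev));
    }

    // Swap the whole queue out while holding the lock, then fire every
    // event with the lock released, so a slow user callback never blocks
    // producers. Returns the number of events fired.
    std::size_t drain() {
        std::vector<Event> batch;
        {
            std::lock_guard<std::mutex> lock(mu_);
            batch.swap(pending_);
        }
        for (auto &ev : batch) ev();
        return batch.size();
    }

private:
    std::mutex mu_;
    std::vector<Event> pending_;
};
```

The lock is held only for the duration of a vector swap, which is constant time regardless of how many events are queued.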

CallbackEvent

Internal event type queued for the callback thread. Encodes all possible user-facing events:

| Variant | Payload |
|---|---|
| Message | (PeerId, channel, Bytes) |
| PeerEvent | (PeerId, PeerEvent) |
| Error | (Error) |
| Backpressure | (PeerId, bool) |
| Log | (LogLevel, string) |
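
One way to model such a tagged event is a `std::variant`. The sketch below is only a guess at the shape: every type here (`PeerId`, `Bytes`, `PeerEventKind`, `LogLevel`, the `*Ev` structs, the `int` error code, and the `uint8_t` channel) is a placeholder invented for illustration, and the real header may be laid out differently.

```cpp
#include <array>
#include <cstdint>
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

// All definitions below are assumptions made for this sketch.
using PeerId = std::array<std::uint8_t, 32>;
using Bytes = std::vector<std::uint8_t>;
enum class PeerEventKind { CONNECTED, DISCONNECTED };
enum class LogLevel { Debug, Info, Warn, Fatal };

struct MessageEv      { PeerId peer; std::uint8_t channel; Bytes data; };
struct PeerEv         { PeerId peer; PeerEventKind kind; };
struct ErrorEv        { int code; };
struct BackpressureEv { PeerId peer; bool active; };
struct LogEv          { LogLevel level; std::string text; };

using CallbackEvent =
    std::variant<MessageEv, PeerEv, ErrorEv, BackpressureEv, LogEv>;

// Returns a short tag for the active alternative. A real dispatcher would
// invoke the matching user callback instead of returning a string.
inline const char *describe(const CallbackEvent &ev) {
    return std::visit([](const auto &e) -> const char * {
        using T = std::decay_t<decltype(e)>;
        if constexpr (std::is_same_v<T, MessageEv>) return "message";
        else if constexpr (std::is_same_v<T, PeerEv>) return "peer";
        else if constexpr (std::is_same_v<T, ErrorEv>) return "error";
        else if constexpr (std::is_same_v<T, BackpressureEv>) return "backpressure";
        else return "log";
    }, ev);
}
```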

Handshake Flow

When a new peer connects (either inbound or via initiate_connect()):

FUNCTION handle_new_peer(endpoint):
    // --- Step 1: Create temporary PeerContext ----
    // Before the handshake completes, we don't know the peer's
    // PeerId (static public key). So we store them under a
    // temporary key derived from their endpoint.

    temp_key = hash(endpoint)
    ctx = PeerContext(temp_key, endpoint)
    ctx.handshake = HandshakeState(role, our_static)

    // --- Step 2: Exchange 3 Noise messages ----
    // The io_recv_loop handles incoming NOISE_MSG1/2/3 packets
    // and calls the appropriate write_message/read_message.

    // --- Step 3: Handshake completes ----
    // Now we know the peer's static public key (their PeerId).
    // Re-key the PeerContext from temp_key to their real PeerId.
    // Initialize the Connection with the split CipherStates.
    // Create a PeerTransport for this peer.

    result = ctx.handshake.split()
    ctx.id = result.remote_static
    ctx.connection.init(result.send_cipher, result.recv_cipher,
                        result.handshake_hash)
    ctx.transport = PeerTransport(transport_config)
    peer_manager.rekey_peer(temp_key, ctx.id)

    // Fire PeerEvent::CONNECTED to the user callback.
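
The re-keying in Step 3 might look roughly like the sketch below. It assumes peers live in a map keyed by a 32-byte identifier; `Key`, the trimmed-down `PeerContext`, and this free-function `rekey_peer` are hypothetical. `std::map` is used here only to avoid writing a custom hash, whereas the real PeerManager presumably uses a hash map.

```cpp
#include <array>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Placeholder types for the sketch; the real PeerContext holds far more.
using Key = std::array<std::uint8_t, 32>;
struct PeerContext {
    Key id;
    std::string endpoint;
};

// Moves the entry stored under temp_key to real_id without copying the
// context. Returns false if temp_key is unknown or real_id already exists.
inline bool rekey_peer(std::map<Key, PeerContext> &peers,
                       const Key &temp_key, const Key &real_id) {
    if (peers.count(real_id) != 0) return false;
    auto node = peers.extract(temp_key);  // C++17 node handle
    if (node.empty()) return false;
    node.key() = real_id;
    node.mapped().id = real_id;
    peers.insert(std::move(node));
    return true;
}
```

Checking for an existing `real_id` before extracting matters: it keeps a second handshake from the same static key from silently replacing or losing a live context.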

Message Flow

  Application calls node.send(peer_id, data, channel)
         │
         ▼
  PeerTransport::write_stream(channel, data, len)
         │  (length-prefixed into stream buffer)
         │
         ▼
  Owning event loop thread wakes:
    PeerTransport::build_packet() → plaintext frames
         │
         ▼
    Connection::encrypt_transport() → UDP packet
         │
         ▼
    udp_send_batch() → network
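
The "length-prefixed into stream buffer" step can be illustrated with a small framing sketch. The 4-byte big-endian prefix is an assumption (the source does not state the prefix width or byte order), and `write_framed` / `read_framed` are hypothetical helpers, not PeerTransport methods.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Appends one message to the stream as [4-byte big-endian length][payload].
// The prefix width and endianness are assumptions for this sketch.
inline void write_framed(Bytes &stream, const Bytes &msg) {
    const auto n = static_cast<std::uint32_t>(msg.size());
    for (int shift = 24; shift >= 0; shift -= 8)
        stream.push_back(static_cast<std::uint8_t>(n >> shift));
    stream.insert(stream.end(), msg.begin(), msg.end());
}

// Pops the next complete message from the front of the stream, or returns
// std::nullopt if the stream does not yet hold a full frame.
inline std::optional<Bytes> read_framed(Bytes &stream) {
    if (stream.size() < 4) return std::nullopt;
    std::uint32_t n = 0;
    for (std::size_t i = 0; i < 4; ++i) n = (n << 8) | stream[i];
    if (stream.size() - 4 < n) return std::nullopt;
    const auto end = stream.begin() + 4 + static_cast<std::ptrdiff_t>(n);
    Bytes msg(stream.begin() + 4, end);
    stream.erase(stream.begin(), end);
    return msg;
}
```

Because a stream is an ordered byte sequence, the prefix is what lets the receiver recover message boundaries after the bytes have been split across packets.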

PeerManager

Header: peer_manager.hpp Namespace: sneaker::mesh

Manages the set of all connected and handshaking peers. Each peer is represented by a PeerContext that aggregates transport-layer state (connection, PeerTransport, rekey) with mesh-layer metadata (score, capabilities, RTT).

PeerContext

| Field | Type | Description |
|---|---|---|
| id | PeerId | Peer's Noise static public key |
| endpoint | Endpoint | Last-known IP:port |
| connection | Connection | Encrypted transport channel |
| transport | PeerTransport | QUIC-style stream transport state |
| rekey | RekeyManager | In-tunnel key rotation state |
| score | float | Reputation score (starts at 100.0) |
| is_inbound | bool | True if this peer connected to us |
| is_public | bool | True if peer is directly reachable |
| capabilities_bits | uint8_t | Advertised capability flags |

Scoring

Peers earn or lose reputation points based on their behavior:

| Event | Score Change | Description |
|---|---|---|
| MESSAGE_RELAYED | +1.0 | Successfully relayed a message |
| PEERS_PROVIDED | +2.0 | Responded to PEX with useful peers |
| PUBLICLY_REACHABLE | +5.0 | Node is reachable without NAT traversal |
| SUCCESSFUL_INTRODUCTION | +3.0 | Helped introduce two peers |
| SUBNET_DIVERSITY | +2.0 | From a /24 subnet not yet represented |
| FAILED_PING | -5.0 | Did not respond to a keepalive ping |
| INVALID_PACKET | -10.0 | Sent a packet that failed decryption |
| REPLAY_ATTEMPT | -20.0 | Sent a packet with a replayed nonce |
| PROTOCOL_VIOLATION | -25.0 | Violated the wire protocol |
| FAILED_HANDSHAKE | -15.0 | Handshake timed out or was rejected |
| HIGH_LATENCY | -3.0 | RTT above acceptable threshold |
| VERY_HIGH_LATENCY | -8.0 | RTT far above acceptable threshold |
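
A minimal sketch of how these deltas could be applied is shown below. The deltas, the starting score of 100.0, and the default eviction threshold of 30.0 come from this document; the `ScoreEvent` enum, `score_delta`, and `PeerScore` are hypothetical names, and the real implementation may clamp or decay scores in ways not shown here.

```cpp
// Event names mirror the scoring table; the enum itself is an assumption.
enum class ScoreEvent {
    MESSAGE_RELAYED, PEERS_PROVIDED, PUBLICLY_REACHABLE,
    SUCCESSFUL_INTRODUCTION, SUBNET_DIVERSITY, FAILED_PING,
    INVALID_PACKET, REPLAY_ATTEMPT, PROTOCOL_VIOLATION,
    FAILED_HANDSHAKE, HIGH_LATENCY, VERY_HIGH_LATENCY
};

// Maps each event to its score change from the table.
constexpr float score_delta(ScoreEvent ev) {
    switch (ev) {
        case ScoreEvent::MESSAGE_RELAYED:         return 1.0f;
        case ScoreEvent::PEERS_PROVIDED:          return 2.0f;
        case ScoreEvent::PUBLICLY_REACHABLE:      return 5.0f;
        case ScoreEvent::SUCCESSFUL_INTRODUCTION: return 3.0f;
        case ScoreEvent::SUBNET_DIVERSITY:        return 2.0f;
        case ScoreEvent::FAILED_PING:             return -5.0f;
        case ScoreEvent::INVALID_PACKET:          return -10.0f;
        case ScoreEvent::REPLAY_ATTEMPT:          return -20.0f;
        case ScoreEvent::PROTOCOL_VIOLATION:      return -25.0f;
        case ScoreEvent::FAILED_HANDSHAKE:        return -15.0f;
        case ScoreEvent::HIGH_LATENCY:            return -3.0f;
        case ScoreEvent::VERY_HIGH_LATENCY:       return -8.0f;
    }
    return 0.0f;  // unreachable for valid enumerators
}

struct PeerScore {
    float value = 100.0f;  // starting score per the PeerContext table

    void record(ScoreEvent ev) { value += score_delta(ev); }

    // True once the score falls below the eviction threshold.
    bool should_evict(float threshold = 30.0f) const {
        return value < threshold;
    }
};
```

With these numbers, a fresh peer needs three replayed nonces plus one protocol violation (100 - 60 - 25 = 15) to fall under the default threshold, while a single failed ping barely registers.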

Eviction & Banning

// Remove peers below the eviction threshold (default 30.0)
peer_manager.evict_worst_peers(target_count);

// Temporarily ban a peer (blocks reconnection)
peer_manager.ban_peer(peer_id, std::chrono::minutes(5));
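
A temporary ban can be modeled as a map from peer to expiry time. The sketch below is an assumption about the mechanism, not the PeerManager internals: `BanList` is a hypothetical class, peers are keyed by a `uint64_t` for brevity instead of a 32-byte PeerId, and the current time is passed in explicitly so the logic is easy to test.

```cpp
#include <chrono>
#include <cstdint>
#include <unordered_map>

// Hypothetical ban table: peer key -> time at which the ban expires.
class BanList {
public:
    using Clock = std::chrono::steady_clock;

    void ban(std::uint64_t peer, Clock::duration length,
             Clock::time_point now) {
        until_[peer] = now + length;
    }

    // Returns true while the ban is active. Expired entries are erased
    // lazily on lookup so the table does not grow without bound.
    bool is_banned(std::uint64_t peer, Clock::time_point now) {
        auto it = until_.find(peer);
        if (it == until_.end()) return false;
        if (now >= it->second) {
            until_.erase(it);
            return false;
        }
        return true;
    }

private:
    std::unordered_map<std::uint64_t, Clock::time_point> until_;
};
```

A monotonic clock is used so that wall-clock adjustments cannot shorten or extend a ban.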

Endpoint Index

PeerManager maintains a secondary hash index (endpoint_index_) mapping EndpointKey -> PeerIdKey for O(1) endpoint-to-peer lookups:

struct EndpointKey {
    uint8_t ip[16];
    uint16_t port;
    bool is_ipv6;
};

PeerIdKey

A hashable wrapper around uint8_t[32] for use as an unordered_map key. Uses FNV-1a for hashing:

struct PeerIdKey {
    uint8_t data[32];
    bool operator==(const PeerIdKey &other) const;
};
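
For reference, a hash functor built on FNV-1a might look like the sketch below. The 64-bit variant is an assumption (the source does not say which width is used), and `fnv1a64` and `PeerIdKeyHash` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_map>

struct PeerIdKey {
    std::uint8_t data[32];
    bool operator==(const PeerIdKey &other) const {
        return std::memcmp(data, other.data, sizeof(data)) == 0;
    }
};

// 64-bit FNV-1a: XOR each byte into the state, then multiply by the prime.
inline std::uint64_t fnv1a64(const std::uint8_t *bytes, std::size_t len) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (std::size_t i = 0; i < len; ++i) {
        h ^= bytes[i];
        h *= 1099511628211ULL;  // FNV prime
    }
    return h;
}

// Hash functor suitable as the third template argument of unordered_map.
struct PeerIdKeyHash {
    std::size_t operator()(const PeerIdKey &k) const {
        return static_cast<std::size_t>(fnv1a64(k.data, sizeof(k.data)));
    }
};
```

Usage would then be `std::unordered_map<PeerIdKey, PeerContext, PeerIdKeyHash>`. Note that FNV-1a is not a keyed hash, so it offers no protection against deliberately colliding keys.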

EventLoopPool

Header: event_loop_pool.hpp Namespace: sneaker::mesh

The PeerEventLoopPool manages a pool of event loop threads, each processing a subset of connected peers. Peers are assigned to the least-loaded thread on connect and removed on disconnect.

On POSIX, each thread runs an epoll/kqueue event loop monitoring its peers' connected sockets and notify file descriptors. On Windows, all threads share a single IOCP completion port. Received packets are decrypted and processed inline on the owning thread.

API

PeerEventLoopPool pool(0);  // 0 = auto-detect thread count

pool.set_process_fn([](PeerContext &ctx, uint64_t now_ms) {
    // Called when a peer's socket has data or notify_fd fires
});

pool.start();

// On peer connect
pool.assign_peer(ctx);

// On peer disconnect
pool.remove_peer(ctx);

pool.shutdown();
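
The least-loaded assignment policy can be sketched independently of sockets and threads. `LoopLoadTracker` below is a hypothetical model that only counts peers per loop; it assumes load means peer count, which the source does not confirm (the real pool could weigh traffic instead).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the pool: tracks how many peers each loop owns.
class LoopLoadTracker {
public:
    explicit LoopLoadTracker(std::size_t loops) : load_(loops, 0) {
        assert(loops > 0);
    }

    // Picks the loop with the fewest peers (lowest index wins ties),
    // records the new peer against it, and returns its index.
    std::size_t assign() {
        std::size_t best = 0;
        for (std::size_t i = 1; i < load_.size(); ++i)
            if (load_[i] < load_[best]) best = i;
        ++load_[best];
        return best;
    }

    // Called on disconnect with the index returned by assign().
    void remove(std::size_t loop) {
        assert(load_.at(loop) > 0);
        --load_[loop];
    }

    std::size_t load(std::size_t loop) const { return load_.at(loop); }

private:
    std::vector<std::size_t> load_;
};
```

A linear scan is fine here because the number of loops is bounded by the core count, not by the number of peers.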

ThreadPool

Header: thread_pool.hpp Namespace: sneaker::mesh

A simple worker thread pool backed by std::mutex, std::condition_variable, and std::deque<std::function<void()>>:

ThreadPool pool(0);  // 0 = hardware_concurrency - 1, minimum 1

pool.submit([]() {
    // Task runs on a worker thread
});

pool.shutdown();  // Waits for all queued tasks to finish
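
A pool with the same building blocks and the same shutdown behavior (drain the queue, then join) could be sketched as follows. `SketchThreadPool` is a hypothetical name chosen to avoid implying this is the project's class. Dropping tasks submitted after shutdown is an assumption.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

class SketchThreadPool {
public:
    // n == 0 selects hardware_concurrency - 1 workers, with a minimum of 1.
    explicit SketchThreadPool(std::size_t n = 0) {
        if (n == 0) {
            const unsigned hw = std::thread::hardware_concurrency();
            n = hw > 1 ? hw - 1 : 1;
        }
        for (std::size_t i = 0; i < n; ++i)
            workers_.emplace_back([this] { run(); });
    }
    ~SketchThreadPool() { shutdown(); }

    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mu_);
            if (stopping_) return;  // assumption: late tasks are dropped
            tasks_.push_back(std::move(task));
        }
        cv_.notify_one();
    }

    // Lets workers finish everything already queued, then joins them.
    // Safe to call more than once.
    void shutdown() {
        {
            std::lock_guard<std::mutex> lock(mu_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto &w : workers_)
            if (w.joinable()) w.join();
    }

    std::size_t size() const { return workers_.size(); }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mu_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and fully drained
                task = std::move(tasks_.front());
                tasks_.pop_front();
            }
            task();  // run outside the lock
        }
    }

    std::mutex mu_;
    std::condition_variable cv_;
    std::deque<std::function<void()>> tasks_;
    std::vector<std::thread> workers_;
    bool stopping_ = false;
};
```

Workers exit only when the queue is empty and the stop flag is set, which is what makes `shutdown()` wait for queued tasks instead of discarding them.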

Peer Exchange

Header: peer_exchange.hpp Namespace: sneaker::mesh

PEX lets connected peers share their peer lists to help the network grow. A node periodically sends PEX_REQUEST control frames to random connected peers, who respond with up to 32 PexEntry records.

PexEntry

struct PexEntry {
    uint8_t public_key[32];     // Peer's static public key
    Endpoint endpoint;          // Last-known IP:port
    uint8_t capabilities;       // Capability flags
    bool is_public;             // Directly reachable?
};

IPv4 entries are 41 bytes; IPv6 entries are 53 bytes.
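
These sizes are consistent with a packed wire layout of key, address-family tag, address, port, capabilities, and public flag. The 1-byte family tag and the field order are guesses that make the arithmetic work out; the serializer may differ, and `pex_entry_wire_size` is a hypothetical helper.

```cpp
#include <cstddef>

// Assumed wire layout (not confirmed by the source):
//   32 bytes  public key
//    1 byte   address family tag
//  4 or 16    IPv4 or IPv6 address
//    2 bytes  port
//    1 byte   capabilities
//    1 byte   is_public flag
constexpr std::size_t pex_entry_wire_size(bool is_ipv6) {
    return 32 + 1 + (is_ipv6 ? 16 : 4) + 2 + 1 + 1;
}

static_assert(pex_entry_wire_size(false) == 41, "IPv4 entry size");
static_assert(pex_entry_wire_size(true) == 53, "IPv6 entry size");
```

The 12-byte gap between the two sizes matches the difference between a 16-byte IPv6 address and a 4-byte IPv4 address, so the remaining fields must total 37 bytes in both cases.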

Rate Limiting

PEX responses are rate-limited to one per 30 seconds per peer. This prevents PEX amplification attacks where an attacker floods the network with PEX requests to harvest the entire peer graph.
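
A per-peer limiter of this kind needs little more than a map of last-response timestamps. The sketch below illustrates the idea only: `PexRateLimiter` is a hypothetical class, peers are keyed by `uint64_t` for brevity, and `now_ms` is assumed to come from a monotonic clock.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical limiter: at most one PEX response per peer per 30 seconds.
class PexRateLimiter {
public:
    // Returns true, and records now_ms, if this peer may be answered now.
    // Rejected requests do not reset the window.
    bool check(std::uint64_t peer, std::uint64_t now_ms) {
        auto it = last_ms_.find(peer);
        if (it != last_ms_.end() && now_ms - it->second < kIntervalMs)
            return false;
        last_ms_[peer] = now_ms;
        return true;
    }

private:
    static constexpr std::uint64_t kIntervalMs = 30000;
    std::unordered_map<std::uint64_t, std::uint64_t> last_ms_;
};
```

Because only accepted requests update the timestamp, a peer that floods requests still receives one response every 30 seconds rather than being starved indefinitely.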

API

PeerExchange pex;

// Build an empty PEX request
size_t req_len = pex.build_request(buffer, buf_len);

// Build a PEX response with up to 32 entries
size_t resp_len = pex.build_response(peer_manager, buffer, buf_len);

// Check if we can respond to this peer (30s rate limit)
bool allowed = pex.check_rate_limit(peer_id, now_ms);

// Parse a received PEX response (payload_len is the received payload size)
std::vector<PexEntry> entries;
size_t count = pex.parse_response(payload, payload_len, entries);