Merged
Changes from 6 commits
216 changes: 31 additions & 185 deletions Cargo.lock

Large diffs are not rendered by default.

9 changes: 6 additions & 3 deletions Cargo.toml
@@ -48,11 +48,14 @@ form_urlencoded = { version = "1.2", features = ["alloc"] }
cosmrs = { version = "0.22", features = ["rpc", "bip32"] }
tendermint = "0.40"
tendermint-rpc = { version = "0.40", features = ["http-client"] }
reqwest = { version = "0.12", features = ["json"] }
reqwest = { version = "0.12", default-features = false, features = ["json", "rustls-tls-native-roots"] }
bip32 = { version = "0.5", features = ["mnemonic", "bip39"] }
k256 = { version = "0.13", features = ["ecdsa"] }
prost = "0.13"
tonic = "0.12"
prost = "0.14.1"
tonic = "0.14.2"
tonic-prost = "0.14"
tonic-build = "0.14.2"
tonic-prost-build = "0.14"
rand = "0.8"
base64 = "0.22"
did-key = "0.2.1"
9 changes: 8 additions & 1 deletion README.md
@@ -332,9 +332,16 @@ cargo build --release

**Password for encrypted storage** (checked in order):
1. File: `~/.orbis_password`
2. Environment: `ORBIS_PASSWORD`
2. Environment: `ORBIS_PASSWORD` (development and CI only)
3. Interactive prompt

For production, provision secrets through a secrets manager or mounted file with
owner-only permissions (`0600`). Avoid `ORBIS_PASSWORD` and `ORBIS_SECRET_KEY` in
shell history, container environment blocks, compose files, or images. The sample
Docker compose files use fixed test credentials only for local development; see
[`bin/orbis-node/README.md`](bin/orbis-node/README.md#secure-secret-provisioning)
for the node hardening checklist.
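
The lookup order above (file, then environment, then interactive prompt) can be sketched as a small function. This is a minimal illustration, not the project's `get_password`: `resolve_password`, `PasswordSource`, and the injected `prompt` closure are hypothetical names, and the environment value is passed in rather than read directly so the ordering is easy to exercise.

```rust
use std::fs;
use std::path::Path;

/// Which of the three documented sources supplied the password.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
enum PasswordSource {
    File,
    Environment,
    Prompt,
}

/// Resolve the storage password in the documented order:
/// 1. the password file, 2. the environment value, 3. an interactive prompt.
fn resolve_password(
    password_file: &Path,
    env_value: Option<String>,
    prompt: impl FnOnce() -> String,
) -> (String, PasswordSource) {
    // 1. File: an unreadable or empty file falls through to the next source.
    if let Ok(contents) = fs::read_to_string(password_file) {
        let trimmed = contents.trim_end_matches(|c: char| c == '\r' || c == '\n');
        if !trimmed.is_empty() {
            return (trimmed.to_string(), PasswordSource::File);
        }
    }
    // 2. Environment: intended for development and CI only.
    if let Some(value) = env_value.filter(|v| !v.is_empty()) {
        return (value, PasswordSource::Environment);
    }
    // 3. Interactive prompt as the last resort.
    (prompt(), PasswordSource::Prompt)
}
```

A caller would presumably pass `std::env::var("ORBIS_PASSWORD").ok()` as `env_value`; whether the real implementation treats an empty file or empty variable as "absent" is an assumption of this sketch.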

### gRPC Services

- **DKG Service** - Initiate and participate in distributed key generation
2 changes: 1 addition & 1 deletion bin/cli-tool/Cargo.toml
@@ -9,7 +9,7 @@ rust-version.workspace = true
[dependencies]
clap = { version = "4.4.18", features = ["derive"] }
tokio = { workspace = true }
tonic = "0.14.2"
tonic = { workspace = true }
proto = { path = "../../crates/proto" }
anyhow = { workspace = true }
futures = { workspace = true }
2 changes: 1 addition & 1 deletion bin/orbis-node/Cargo.toml
@@ -14,7 +14,7 @@ proto = { path = "../../crates/proto" }
authn = { path = "../../crates/authn" }
authz = { path = "../../crates/authz" }
bulletin = { path = "../../crates/bulletin" }
tonic = "0.14.2"
tonic = { workspace = true }
tonic-web = "0.14.2"
tower-http = { version = "0.6", features = ["cors"] }
iroh = { workspace = true }
33 changes: 32 additions & 1 deletion bin/orbis-node/README.md
@@ -14,6 +14,11 @@ The **`orbis-node`** binary is the **ring node**: it exposes **gRPC** APIs for o

**Control plane vs data plane:** Clients talk **gRPC** to one node; nodes talk **QUIC** to each other for DKG/PRE/Sign messages. **`GenericProtocolHandler`** ([`helpers/protocol_handler.rs`](src/helpers/protocol_handler.rs)) implements the `network::ProtocolHandler` receive loop for all three MPC protocols.

Ingress limits are applied in two places:

- The gRPC server caps per-connection request concurrency and HTTP/2 streams in [`src/main.rs`](src/main.rs).
- The P2P router caps inbound concurrent streams per protocol and per-peer stream rate before DKG/PRE/Sign handlers run; values live in [`src/constants.rs`](src/constants.rs).

## Workspace crates

Depends on **`crypto`**, **`network`**, **`local-storage`**, **`proto`**, **`authn`**, **`authz`**, **`bulletin`**, **`common`**, and **`tonic`**.
@@ -46,10 +51,36 @@ From [`helpers/launch.rs`](src/helpers/launch.rs) (`clap` **`Args`**):

Password and node identity: see **`constants`**, **`get_password`**, **`get_network_key_secret`**, **`derive_secret_key_bytes`** in the same module.

## Secure secret provisioning

`orbis-node` needs two local secret classes: the password used to encrypt local
storage and the node network identity key. Treat both as production secrets.

- Prefer a secrets manager or a read-only mounted file for the storage password.
The file path checked by default is `~/.orbis_password`; set owner-only
permissions (`chmod 600 ~/.orbis_password`) and keep it out of backups that are
not also encrypted.
- Use `ORBIS_PASSWORD` only for local development, CI, or short-lived test
containers. Environment variables are commonly visible through process,
container, crash-report, and orchestration inspection paths.
- Let the node generate and persist its network identity on first start, or
restore a previously encrypted local store. Avoid raw `ORBIS_SECRET_KEY` in
production unless a secret manager injects it directly at process launch and
your runtime prevents environment inspection.
- The checked-in Docker compose files contain fixed `ORBIS_PASSWORD` and
`ORBIS_SECRET_KEY` values for deterministic local tests only. Do not reuse
them for any shared, staging, or production network.
- Back up encrypted local storage and the password together under an operational
key-management policy. Losing either the encrypted share store or its password
can permanently strand a node's DKG/PSS shares.
- Rotate node identity and storage passwords through a maintenance window. A
node identity change must be reflected in bulletin committee metadata before
peers will treat it as the same operational participant.
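
The owner-only permission advice for the password file can be checked programmatically at startup. The following is a Unix-only sketch under stated assumptions: `is_owner_only` is a hypothetical helper, not something the diff shows `orbis-node` doing.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

/// Returns `Ok(true)` when the file grants no permission bits to group or
/// others, which is what `chmod 600` produces. (Hypothetical helper.)
fn is_owner_only(path: &Path) -> io::Result<bool> {
    let mode = fs::metadata(path)?.permissions().mode();
    // 0o077 masks the group and other read/write/execute bits.
    Ok(mode & 0o077 == 0)
}
```

A node could refuse to start, or at least log a warning, when this returns `Ok(false)` for the password file. Note that `fs::metadata` follows symlinks, so the check applies to the link target.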

## In-repo docs

- [`src/dkg/PROTOCOL_FLOW.md`](src/dkg/PROTOCOL_FLOW.md) — DKG session flow (when present).
- **[`src/constants.rs`](src/constants.rs)** — JWT limits, session TTL, timeouts, limits.
- **[`src/constants.rs`](src/constants.rs)** — JWT limits, session TTL, network ingress limits, timeouts, limits.

## Running

21 changes: 21 additions & 0 deletions bin/orbis-node/src/constants.rs
@@ -172,6 +172,27 @@ pub const SIGN_EXPIRATION_CHECK_INTERVAL: Duration = Duration::from_secs(30);
/// connection closes, `open_stream()` fails, and the pool reconnects.
pub const NETWORK_IDLE_TIMEOUT_MS: u32 = 30_000;

/// Maximum concurrently executing inbound P2P handler tasks per protocol.
///
/// The iroh router accepts one QUIC stream per node-to-node request/session.
/// This cap prevents a single protocol from spawning unbounded handler tasks
/// under flood or retry storms. Excess streams are dropped before protocol
/// deserialization.
pub const NETWORK_MAX_CONCURRENT_INBOUND_STREAMS: usize = 1024;
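
A concurrency cap with "drop the excess" semantics, as the doc comment describes, can be sketched with an atomic counter and an RAII permit. This is a standard-library illustration only; `InboundLimiter` and `Permit` are hypothetical names, and the router in this PR may well use an async semaphore instead.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Counts in-flight handler tasks for one protocol. (Hypothetical type.)
struct InboundLimiter {
    active: AtomicUsize,
    max: usize,
}

/// Held for the lifetime of one handler task; releases its slot on drop.
struct Permit {
    limiter: Arc<InboundLimiter>,
}

impl InboundLimiter {
    fn new(max: usize) -> Arc<Self> {
        Arc::new(Self { active: AtomicUsize::new(0), max })
    }

    /// Returns `None` when the cap is reached, so the caller can drop the
    /// stream before doing any deserialization work.
    fn try_acquire(self: &Arc<Self>) -> Option<Permit> {
        let mut current = self.active.load(Ordering::Acquire);
        loop {
            if current >= self.max {
                return None;
            }
            match self.active.compare_exchange_weak(
                current,
                current + 1,
                Ordering::AcqRel,
                Ordering::Acquire,
            ) {
                Ok(_) => return Some(Permit { limiter: Arc::clone(self) }),
                // Another thread changed the counter; retry with the new value.
                Err(observed) => current = observed,
            }
        }
    }

    fn active(&self) -> usize {
        self.active.load(Ordering::Acquire)
    }
}

impl Drop for Permit {
    fn drop(&mut self) {
        self.limiter.active.fetch_sub(1, Ordering::AcqRel);
    }
}
```

Moving the `Permit` into the spawned handler task ties the slot to the task's lifetime, so early returns and panics release it as well.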

/// Maximum inbound P2P streams accepted from one peer per protocol per second.
///
/// DKG, PRE, and Sign traffic should stay well below this in normal operation;
/// the limit primarily protects handler allocation and downstream crypto work
/// from cheap stream-open floods.
pub const NETWORK_MAX_INBOUND_STREAMS_PER_PEER_PER_SECOND: usize = 512;
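
One simple way to enforce a per-peer, per-second stream limit is a fixed-window counter keyed by peer ID. The sketch below is an assumption about the general technique, not the limiter this PR implements; `PeerRateLimiter` is a hypothetical name, and the current time is passed in so the behavior is deterministic.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Fixed-window rate limiter: at most `max_per_window` accepted streams per
/// peer within each window. (Hypothetical type, for illustration.)
struct PeerRateLimiter {
    max_per_window: usize,
    window: Duration,
    // peer ID -> (window start, streams accepted in that window)
    windows: HashMap<String, (Instant, usize)>,
}

impl PeerRateLimiter {
    fn new(max_per_window: usize, window: Duration) -> Self {
        Self { max_per_window, window, windows: HashMap::new() }
    }

    /// Returns `true` if a new stream from `peer_id` should be accepted.
    fn allow(&mut self, peer_id: &str, now: Instant) -> bool {
        let entry = self.windows.entry(peer_id.to_string()).or_insert((now, 0));
        // Start a fresh window once the previous one has fully elapsed.
        if now.duration_since(entry.0) >= self.window {
            *entry = (now, 0);
        }
        if entry.1 >= self.max_per_window {
            return false;
        }
        entry.1 += 1;
        true
    }
}
```

A fixed window admits short bursts of up to twice the limit across a window boundary; a token bucket smooths that out at the cost of slightly more state. A production limiter would also evict idle peers so the map cannot grow without bound.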

/// Maximum in-flight gRPC requests per client connection.
pub const GRPC_CONCURRENCY_LIMIT_PER_CONNECTION: usize = 128;

/// Maximum concurrent HTTP/2 streams per gRPC client connection.
pub const GRPC_MAX_CONCURRENT_STREAMS: u32 = 256;

// ============================================================================
// Peer ID Validation Constants
// ============================================================================
6 changes: 3 additions & 3 deletions bin/orbis-node/src/dkg/PROTOCOL_FLOW.md
@@ -202,6 +202,6 @@ T10 Receive Share from Alice
✅ Phase 1 initiation from start_dkg
✅ Phase 2 initiation when commitments complete
✅ Phase 4 initiation when shares complete
⏳ Commitment deserialization (TODO)
Share deserialization (partially done)
⏳ Proper node_id mapping (currently simplified)
✅ Bounded commitment deserialization and coefficient validation
Share deserialization with recipient/session validation and pending-share replay
✅ Authenticated peer-to-node mapping for current and reshare committees
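
"Bounded" deserialization generally means rejecting a payload on cheap length checks before any expensive parsing. The sketch below illustrates that idea only: the 32-byte element size, the 64-element cap, and the names `split_commitment` and `CommitmentError` are all hypothetical and do not reflect the actual wire format in this repository.

```rust
/// Hypothetical size of one serialized commitment element, in bytes.
const ELEMENT_LEN: usize = 32;
/// Hypothetical upper bound on elements accepted from a peer.
const MAX_ELEMENTS: usize = 64;

#[derive(Debug, PartialEq)]
enum CommitmentError {
    Empty,
    TooLarge { len: usize },
    Misaligned { len: usize },
    WrongCount { expected: usize, actual: usize },
}

/// Split a raw commitment payload into fixed-size elements, rejecting
/// oversized or malformed input before any cryptographic decoding.
fn split_commitment(
    bytes: &[u8],
    expected_count: usize,
) -> Result<Vec<[u8; ELEMENT_LEN]>, CommitmentError> {
    let len = bytes.len();
    if len == 0 {
        return Err(CommitmentError::Empty);
    }
    if len > ELEMENT_LEN * MAX_ELEMENTS {
        return Err(CommitmentError::TooLarge { len });
    }
    if len % ELEMENT_LEN != 0 {
        return Err(CommitmentError::Misaligned { len });
    }
    let actual = len / ELEMENT_LEN;
    if actual != expected_count {
        return Err(CommitmentError::WrongCount { expected: expected_count, actual });
    }
    let mut elements = Vec::with_capacity(actual);
    for chunk in bytes.chunks_exact(ELEMENT_LEN) {
        let mut element = [0u8; ELEMENT_LEN];
        element.copy_from_slice(chunk);
        elements.push(element);
    }
    Ok(elements)
}
```

Coefficient validation proper (for example, checking that each element decodes to a valid group element) would follow this step and is not shown.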
@@ -7,7 +7,7 @@ use super::*;
/// generated ours), then checks whether Phase 1 is complete.
pub(in crate::dkg::coordinator) async fn handle_commitment_message<D>(
coord: &DkgCoordinator<D>,
session_id: u64,
session_id: u128,
from_node_id: u32,
commitment: Vec<u8>,
) -> Result<Option<DkgMessage>>
@@ -10,7 +10,7 @@ use super::*;
/// Returns `Ok(None)` — the caller should return this directly from `handle_message`.
pub(in crate::dkg::coordinator) async fn handle_session_init<D>(
coord: &DkgCoordinator<D>,
session_id: u64,
session_id: u128,
threshold: u32,
total_participants: u32,
peer_ids: &[String],
8 changes: 4 additions & 4 deletions bin/orbis-node/src/dkg/coordinator/message_handlers/share.rs
@@ -8,7 +8,7 @@ use crypto::error::CryptoError;
/// Phase 2 is complete.
pub(in crate::dkg::coordinator) async fn handle_share_message<D>(
coord: &DkgCoordinator<D>,
session_id: u64,
session_id: u128,
from_node_id: u32,
to_node_id: u32,
share_value: Vec<u8>,
@@ -126,7 +126,7 @@ where

pub(super) async fn receive_and_record_share<D>(
coord: &DkgCoordinator<D>,
session_id: u64,
session_id: u128,
share: DistributedShare<D::ShareValue>,
) -> Result<()>
where
@@ -146,7 +146,7 @@ where

async fn try_receive_share<D>(
coord: &DkgCoordinator<D>,
session_id: u64,
session_id: u128,
share: DistributedShare<D::ShareValue>,
) -> Result<std::result::Result<(), CryptoError>>
where
@@ -162,7 +162,7 @@ where

async fn record_accepted_share<D>(
coord: &DkgCoordinator<D>,
session_id: u64,
session_id: u128,
from_node_id: u32,
to_node_id: u32,
) -> Result<()>
26 changes: 15 additions & 11 deletions bin/orbis-node/src/dkg/coordinator/mod.rs
@@ -52,7 +52,7 @@ use std::sync::Arc;
/// return, panic), `Drop` spawns a background task that releases the entry with
/// `success = false`, so the message can be retried by a reconnecting peer.
struct MessageClaimGuard<D: Dkg + Clone + 'static> {
session_id: u64,
session_id: u128,
from_node_id: u32,
message_type: DkgMessageType,
app_state: Arc<AppState<D>>,
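
The release-on-drop behavior described in the doc comment above is an instance of the RAII guard pattern. The following standard-library sketch shows the shape of it under simplifying assumptions: `ClaimRegistry` and `ClaimGuard` are hypothetical stand-ins, and the release happens synchronously here, whereas the real `MessageClaimGuard` is documented as spawning a background task from `Drop`.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical registry that records how each claim was released,
/// as `(session_id, success)` pairs.
#[derive(Default)]
struct ClaimRegistry {
    released: Mutex<Vec<(u128, bool)>>,
}

impl ClaimRegistry {
    fn release(&self, session_id: u128, success: bool) {
        self.released.lock().unwrap().push((session_id, success));
    }
}

/// Guard that releases its claim with `success = false` unless the handler
/// explicitly completes it.
struct ClaimGuard {
    session_id: u128,
    registry: Arc<ClaimRegistry>,
    completed: bool,
}

impl ClaimGuard {
    fn new(session_id: u128, registry: Arc<ClaimRegistry>) -> Self {
        Self { session_id, registry, completed: false }
    }

    /// Mark the message as handled; the subsequent drop becomes a no-op.
    fn complete(mut self) {
        self.completed = true;
        self.registry.release(self.session_id, true);
    }
}

impl Drop for ClaimGuard {
    fn drop(&mut self) {
        // Reached on early return, `?` propagation, or panic unwinding.
        if !self.completed {
            self.registry.release(self.session_id, false);
        }
    }
}
```

Because `Drop` also runs during unwinding, a handler that panics still frees the entry, which is what lets a reconnecting peer retry the message.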
@@ -64,7 +64,7 @@ struct MessageClaimGuard<D: Dkg + Clone + 'static> {

impl<D: Dkg + Clone + 'static> MessageClaimGuard<D> {
fn new(
session_id: u64,
session_id: u128,
from_node_id: u32,
message_type: DkgMessageType,
app_state: Arc<AppState<D>>,
@@ -472,7 +472,7 @@ where
/// it. Pass `|_| {}` when no extra initialization is needed.
pub async fn create_session<F>(
&self,
session_id: u64,
session_id: u128,
node_id: u32,
threshold: usize,
total_nodes: usize,
@@ -501,15 +501,15 @@ where
}

/// Remove a DKG session from state.
pub(in crate::dkg::coordinator) async fn remove_session(&self, session_id: u64) {
pub(in crate::dkg::coordinator) async fn remove_session(&self, session_id: u128) {
self.app_state
.dkg_session_state
.remove_session(&session_id)
.await;
}

/// Store peer IDs for a session (needed for sending messages in later phases).
pub async fn set_peer_ids(&self, session_id: &u64, peer_ids: Vec<String>) {
pub async fn set_peer_ids(&self, session_id: &u128, peer_ids: Vec<String>) {
self.app_state
.dkg_session_state
.set_peer_ids(session_id, peer_ids)
@@ -529,7 +529,7 @@ where
&self,
peer_id_str: &str,
message: DkgMessage,
session_id: Option<u64>,
session_id: Option<u128>,
) -> Result<()> {
network::send_message_to_peer(self, peer_id_str, message, session_id).await
}
@@ -547,7 +547,7 @@ where
/// Called by the initiator after `StartDkg`, or by the PSS scheduler.
pub async fn initiate_phase1_commitments(
&self,
session_id: u64,
session_id: u128,
peer_ids: &[String],
) -> Result<()> {
phases::initiate_phase1_commitments(self, session_id, peer_ids).await
@@ -558,7 +558,7 @@ where
/// Called after each incoming commitment message.
pub async fn check_and_trigger_phase2(
&self,
session_id: u64,
session_id: u128,
peer_ids: &[String],
) -> Result<()> {
phases::check_and_trigger_phase2(self, session_id, peer_ids).await
@@ -567,21 +567,25 @@ where
/// Phase 2: Generate shares and send them to all peers.
///
/// Called when all commitments have been received.
pub async fn initiate_phase2_shares(&self, session_id: u64, peer_ids: &[String]) -> Result<()> {
pub async fn initiate_phase2_shares(
&self,
session_id: u128,
peer_ids: &[String],
) -> Result<()> {
phases::initiate_phase2_shares(self, session_id, peer_ids).await
}

/// Check if Phase 2 is complete (all shares received) and trigger Phase 4 if so.
///
/// Called after each incoming share message.
pub async fn check_and_trigger_phase4(&self, session_id: u64) -> Result<()> {
pub async fn check_and_trigger_phase4(&self, session_id: u128) -> Result<()> {
phases::check_and_trigger_phase4(self, session_id).await
}

/// Phase 4: Compute final secret share and aggregate public key.
///
/// If this node is node_id == 1, also posts the `RingPayload` to the bulletin.
pub async fn initiate_phase4_completion(&self, session_id: u64) -> Result<()> {
pub async fn initiate_phase4_completion(&self, session_id: u128) -> Result<()> {
phases::initiate_phase4_completion(self, session_id).await
}
}
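
The hunks above widen `session_id` from `u64` to `u128` throughout the coordinator. The diff does not show how the ID travels on the wire; since Protocol Buffers has no 128-bit integer type, one common option is a fixed 16-byte big-endian field. The sketch below assumes that encoding, and `encode_session_id` / `decode_session_id` are hypothetical helpers.

```rust
use std::convert::TryFrom;

/// Encode a 128-bit session ID as 16 big-endian bytes.
fn encode_session_id(id: u128) -> [u8; 16] {
    id.to_be_bytes()
}

/// Decode a session ID, rejecting any input that is not exactly 16 bytes.
fn decode_session_id(bytes: &[u8]) -> Option<u128> {
    let array = <[u8; 16]>::try_from(bytes).ok()?;
    Some(u128::from_be_bytes(array))
}
```

Rejecting wrong-length input outright, rather than zero-padding it, avoids silently mapping a legacy 8-byte ID onto a different 128-bit session.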
10 changes: 5 additions & 5 deletions bin/orbis-node/src/dkg/coordinator/network.rs
@@ -27,7 +27,7 @@ async fn send_on_stream(

async fn get_cached_or_open_stream<D>(
coord: &DkgCoordinator<D>,
session_id: u64,
session_id: u128,
peer_id_str: &str,
) -> Result<(Arc<dyn NetworkConnection>, bool)>
where
@@ -56,7 +56,7 @@ where

async fn ensure_session_generation<D>(
coord: &DkgCoordinator<D>,
session_id: u64,
session_id: u128,
generation: u64,
) -> Result<()>
where
@@ -96,7 +96,7 @@ pub(super) async fn send_message_to_peer<D>(
coord: &DkgCoordinator<D>,
peer_id_str: &str,
message: DkgMessage,
session_id: Option<u64>,
session_id: Option<u128>,
) -> Result<()>
where
D: Dkg<
@@ -409,7 +409,7 @@ mod tests {
let (app_state, remote_peer_id) =
make_fake_app_state("dkg_send_retry_replaces_stream", shared_state.clone()).await;
let coordinator = Arc::new(DkgCoordinator::new(app_state.clone()));
let session_id = 42_u64;
let session_id = 42_u128;

coordinator
.create_session(
@@ -508,7 +508,7 @@ mod tests {
let (app_state, remote_peer_id) =
make_fake_app_state("dkg_send_stale_session_generation", shared_state.clone()).await;
let coordinator = Arc::new(DkgCoordinator::new(app_state.clone()));
let session_id = 84_u64;
let session_id = 84_u128;

coordinator
.create_session(