Gemini summary on how to do this:
Iroh does lazy data syncing by default; it's a core feature of the system. 🤙
Iroh's Doc and Store APIs are designed to separate metadata from content. When you sync a document with a peer, you immediately get all the key names and the hashes of their corresponding values, but you don't download the actual content until you explicitly ask for it.
How It Works
The process is straightforward and happens automatically:
Initial Sync: When you join a document's sync swarm using a DocTicket, your node syncs with its peers. This initial sync is very fast because it only exchanges the document's namespace and the entry log. Each entry contains the key, a timestamp, the author, and a hash of the value's content, but not the content itself.
Listing Keys: After the initial sync, you can immediately list all keys in the document. Your node knows about every piece of data available without having downloaded it.
Fetching Content on Demand: When you need the actual data for a specific key, you look up its entry and read the content (for example, via the entry's content_bytes method). That read triggers Iroh to find a peer that has the content (identified by its hash) and download it.
This "pull-on-read" behavior is the essence of lazy syncing.
Rust Example
Here’s a practical example showing two nodes. Node A creates data, and Node B lazily syncs it.
First, add iroh to your Cargo.toml:
TOML
[dependencies]
iroh = "0.15"
tokio = { version = "1", features = ["rt-multi-thread", "macros"] }
futures = "0.3"
anyhow = "1"
Then you can run something like the following. (The snippet targets the 0.15-era client API; iroh has reorganized these modules in later releases, so check the current docs.)
Rust
use anyhow::Result;
use futures::StreamExt;
// NOTE: these import paths follow the 0.15-era iroh client; the docs/sync
// modules and some signatures differ between releases, so treat this as a
// sketch and check the current docs.
use iroh::client::docs::{LiveEvent, ShareMode};
use iroh::docs::store::Query;
use iroh::node::Node;

#[tokio::main]
async fn main() -> Result<()> {
    // 1. Create Node A (the provider)
    let node_a = Node::memory().spawn().await?;
    println!("Node A Peer ID: {}", node_a.node_id());

    // 2. Create a document and add some data on Node A
    let doc_a = node_a.docs.create().await?;
    let author_a = node_a.authors.create().await?;
    let key = b"my-lazy-key".to_vec();
    let value = b"this is the content you will fetch later".to_vec();
    // set_bytes stores the content locally and syncs the *hash* of the content.
    doc_a.set_bytes(author_a, key.clone(), value).await?;
    println!("Node A set key: 'my-lazy-key'");

    // 3. Create a ticket to share the document with Node B.
    //    (Later iroh releases also take an address-options argument here.)
    let ticket = doc_a.share(ShareMode::Read).await?;
    // --- In a real app, you'd send this ticket to another machine ---

    // 4. Create Node B and join the document using the ticket
    let node_b = Node::memory().spawn().await?;
    println!("\nNode B Peer ID: {}", node_b.node_id());
    let doc_b = node_b.docs.import(ticket).await?;
    println!("Node B joined the document sync swarm.");

    // Wait for the initial (metadata) sync to complete by watching doc events.
    let mut events = doc_b.subscribe().await?;
    while let Some(event) = events.next().await {
        if matches!(event?, LiveEvent::SyncFinished(_)) {
            break;
        }
    }
    println!("Node B finished initial metadata sync.");

    // 5. Node B can see the key immediately, without having the data.
    let keys = doc_b.get_many(Query::all()).await?.count().await;
    println!("Node B sees {} key(s) available.", keys);

    // 6. Node B now explicitly fetches the content for the key.
    //    THIS is the step that triggers the actual data download.
    println!("Node B is now fetching the content...");
    let entry = doc_b
        .get_one(Query::key_exact(&key))
        .await?
        .expect("entry was synced");
    let content = entry.content_bytes(&doc_b).await?;
    println!(
        "Node B successfully fetched content: '{}'",
        String::from_utf8(content.to_vec())?
    );
    Ok(())
}
Key Takeaways from the Example
doc.set_bytes(...): When Node A sets the data, it stores the content locally and makes the key and content hash available for syncing.
doc.import(...): When Node B imports the ticket, it connects to Node A and syncs the document's state (the keys and hashes). This is extremely lightweight.
entry.content_bytes(...): This is the lazy part. Only when the entry's content is actually read does Node B find a peer with the corresponding content hash and download the bytes.
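If you'd rather react to new entries as they arrive than poll the document, you can also subscribe to its event stream and fetch content selectively. A minimal sketch of that pattern, assuming the 0.15-era LiveEvent variants and the in-memory client's MemDoc alias (both assumptions; names have shifted between releases):
Rust
use anyhow::Result;
use futures::StreamExt;
use iroh::client::docs::LiveEvent;
use iroh::client::MemDoc;

/// Watch a joined document and fetch content only for the keys we care about.
async fn watch_for_new_entries(doc: &MemDoc) -> Result<()> {
    let mut events = doc.subscribe().await?;
    while let Some(event) = events.next().await {
        match event? {
            // A peer inserted an entry: we now know the key and content hash,
            // but the value bytes have not necessarily been downloaded.
            LiveEvent::InsertRemote { entry, .. } => {
                // Only fetch content for keys under a prefix we care about
                // (the "wanted/" prefix here is just for illustration).
                if entry.key().starts_with(b"wanted/") {
                    let bytes = entry.content_bytes(doc).await?; // triggers the download
                    println!("fetched {} bytes for {:?}", bytes.len(), entry.key());
                }
            }
            // Fired once the content behind a hash has finished downloading.
            LiveEvent::ContentReady { hash } => println!("content ready: {hash}"),
            _ => {}
        }
    }
    Ok(())
}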
This default behavior makes Iroh highly efficient for working with large collections of data where you only need to access a subset at any given time, saving significant bandwidth and storage. 💾➡️☁️
so if I want to sync machine learning models, how would I do that?
Of course. Syncing machine learning models is an excellent use case for Iroh, as it perfectly combines its strengths in handling large binary files and synchronizing metadata.
You would use a combination of Iroh's Blobs for the large model data and Iroh Documents to act as a version-managed "model registry."
This approach allows your devices to be aware of all available model versions without downloading them, and then only fetch the specific large model file they actually need.
The Strategy: Blobs for Data, Docs for Metadata
Here is the step-by-step strategy for syncing ML models with Iroh:
Step 1: Add the Model File as a Blob
First, you treat your model file (e.g., a multi-gigabyte .safetensors or .onnx file) as an Iroh blob. You add it to an Iroh node's blob store, which calculates its unique content identifier, a BLAKE3 hash.
Content-Addressed: The model is identified by the hash of its content. This gives you automatic deduplication; if you add the same model file twice, it's only stored and transferred once.
Large File Handling: Iroh is built to handle arbitrarily large files efficiently.
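For a real model file you wouldn't read gigabytes into memory; the blobs client can import a file straight from disk. A minimal sketch, assuming the 0.15-era add_from_path signature and a persistent node (the path is hypothetical, and the tag/wrap options are the parts most likely to differ between releases):
Rust
use anyhow::Result;
use iroh::blobs::util::SetTagOption;
use iroh::client::blobs::WrapOption;
use iroh::node::Node;

#[tokio::main]
async fn main() -> Result<()> {
    // A persistent node, so the imported blob survives restarts.
    let node = Node::persistent("./iroh-data").await?.spawn().await?;

    // Import the model file from disk; iroh hashes it with BLAKE3.
    // `in_place: false` copies the file into the blob store.
    let outcome = node
        .blobs
        .add_from_path(
            "./models/model.safetensors".into(), // hypothetical path
            false,                               // in_place
            SetTagOption::Auto,
            WrapOption::NoWrap,
        )
        .await?
        .finish()
        .await?;

    println!("model blob hash: {}", outcome.hash);
    Ok(())
}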
Step 2: Use a Document as a Model Registry
Next, you use an Iroh Document to track your models. A Document is a key-value store that syncs in real-time. You use it to map human-readable names or versions to the model's hash from Step 1.
The keys in your document can be:
Semantic versions: my-llm:v1.2.0
Deployment stages: image-classifier:production, image-classifier:staging
A/B testing groups: recommendation-engine:group-a, recommendation-engine:group-b
The value for each key is the hash of the corresponding model blob.
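Since keys are plain bytes, a client can also enumerate everything under a prefix to see which versions or stages exist before deciding what to fetch. A minimal sketch, assuming the 0.15-era Query builder (Query::key_prefix and the MemDoc alias are the assumptions here):
Rust
use anyhow::Result;
use futures::TryStreamExt;
use iroh::client::MemDoc;
use iroh::docs::store::Query;

/// List every registry entry for one model. This touches metadata only:
/// no value content is downloaded by this query.
async fn list_model_versions(doc: &MemDoc) -> Result<()> {
    let mut entries = doc.get_many(Query::key_prefix("image-classifier:")).await?;
    while let Some(entry) = entries.try_next().await? {
        println!(
            "{} -> {}",
            String::from_utf8_lossy(entry.key()),
            entry.content_hash()
        );
    }
    Ok(())
}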
Step 3: Sync and Fetch on Client Devices
A client device (like an edge server for inference) joins the sync swarm for your model registry Document.
Metadata First: The device instantly syncs the document, receiving all the key-value pairs. This is extremely fast because it's just a tiny amount of text (e.g., image-classifier:production -> ba7816bf...). The large model file is not transferred yet.
Look Up the Hash: Your application logic reads the document to find the hash for the model it needs. For example, it might always look for the key image-classifier:production.
Fetch the Blob: Using the hash it just retrieved, the client now explicitly tells Iroh to fetch the blob. This is the "lazy" part, where the large model file is finally downloaded from any peer on the network that has it.
Conceptual Rust Example
This example demonstrates the full workflow: a "model provider" adds a model and publishes its hash, and a "client" lazily fetches it. (Same caveat as before: this sketch targets the 0.15-era client API.)
Rust
use anyhow::Result;
use futures::StreamExt;
// NOTE: same 0.15-era API caveat as the previous example.
use iroh::base::hash::Hash;
use iroh::base::node_addr::NodeAddr;
use iroh::client::docs::{LiveEvent, ShareMode};
use iroh::docs::store::Query;
use iroh::node::Node;

#[tokio::main]
async fn main() -> Result<()> {
    // --- PROVIDER NODE (e.g., your MLOps server) ---

    // 1. Create the provider node
    let provider = Node::memory().spawn().await?;
    let provider_id = provider.node_id();
    println!("Provider Node ID: {}", provider_id);

    // 2. Add the ML model to the blob store to get its hash
    //    (simulating a large model file with simple bytes)
    let model_bytes = vec![1u8; 1024 * 1024 * 500]; // simulate a 500MB model file
    let outcome = provider.blobs.add_bytes(model_bytes).await?;
    let model_hash = outcome.hash;
    println!("Added model to blob store. Hash: {}", model_hash);

    // 3. Use a Document as a model registry
    let doc = provider.docs.create().await?;
    let author = provider.authors.create().await?;

    // 4. Set a key "stable-diffusion:latest" to point to the model's hash.
    //    The value is the 32-byte hash, stored as the entry's content.
    let key = b"stable-diffusion:latest".to_vec();
    doc.set_bytes(author, key.clone(), model_hash.as_bytes().to_vec())
        .await?;
    println!("Published model hash to the document.");

    // 5. Create a ticket to let clients subscribe to this registry
    let ticket = doc.share(ShareMode::Read).await?;

    // --- CLIENT NODE (e.g., your inference server) ---
    println!("\n--- Client Workflow ---");

    // 6. Create the client node and import the document
    let client = Node::memory().spawn().await?;
    println!("Client Node ID: {}", client.node_id());
    let doc_client = client.docs.import(ticket).await?;

    // 7. Wait for the initial, lightweight metadata sync
    let mut events = doc_client.subscribe().await?;
    while let Some(event) = events.next().await {
        if matches!(event?, LiveEvent::SyncFinished(_)) {
            break;
        }
    }
    println!("Client synced the model registry (metadata only).");

    // 8. The client reads the registry to find the hash of the model it needs.
    //    The entry's content is only 32 bytes, so this is still cheap.
    let entry = doc_client
        .get_one(Query::key_exact(&key))
        .await?
        .expect("registry entry was synced");
    let content = entry.content_bytes(&doc_client).await?;
    let target_hash = Hash::from_bytes(content.as_ref().try_into()?);
    println!(
        "Client read the key 'stable-diffusion:latest' and got hash: {}",
        target_hash
    );

    // 9. LAZY FETCH: the client now requests the actual model blob by hash.
    //    This is the step that triggers the large P2P download. The blob is
    //    not part of the document, so we point the download at a peer known
    //    to have it (here, the provider).
    println!("\nClient is now fetching the large model blob...");
    client
        .blobs
        .download(target_hash, NodeAddr::new(provider_id))
        .await?
        .finish()
        .await?;
    let model_content = client.blobs.read_to_bytes(target_hash).await?;
    println!(
        "Client successfully downloaded the model! Size: {} bytes",
        model_content.len()
    );
    Ok(())
}
Why this is a powerful approach for ML Models:
Efficiency at the Edge: Inference devices only ever download what they need, saving massive amounts of bandwidth. They can be aware of 100 model versions but only store the one they use.
Atomic Updates: To deploy a new model, you just update a single key-value pair in the document (e.g., change the hash for image-classifier:production). This is a tiny, atomic operation that propagates quickly. All clients will see the update and can decide when to fetch the new version; a minimal sketch of this promotion flow follows this list.
Resilience and Speed: The P2P nature of Iroh means a client can fetch the model data from any peer that has it—be it the original provider or another client that has already downloaded it. This can dramatically speed up distribution.
Built-in Deduplication: If you have multiple models that share layers or are incremental updates, Iroh's underlying data structures can efficiently handle the redundant parts (though this is a more advanced use case related to the bao format). The simplest blob-per-model approach already gives you file-level deduplication.
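To make the atomic-update point concrete, here's what a promotion looks like on the provider side, under the same 0.15-era API assumptions as the examples above. A single small write flips the production pointer; subscribed clients see it as an InsertRemote event (as in the earlier event-stream sketch) and can fetch the new blob whenever they choose:
Rust
use anyhow::Result;
use iroh::base::hash::Hash;
use iroh::client::MemDoc;
use iroh::docs::AuthorId;

/// Provider side: point the production key at a newly added model blob.
/// This one tiny key/value write is all that propagates to subscribers;
/// the multi-gigabyte blob only moves when a client asks for it.
async fn promote_model(doc: &MemDoc, author: AuthorId, new_model_hash: Hash) -> Result<()> {
    doc.set_bytes(
        author,
        b"image-classifier:production".to_vec(),
        new_model_hash.as_bytes().to_vec(),
    )
    .await?;
    Ok(())
}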