Notes taken for research and development purposes.
- tokio-netem, a Tokio-native helper to simulate various network conditions: https://crates.io/crates/tokio-netem
Iroh-docs comes with a built-in capability model with Write and Read operations per Namespace. By scoping the namespaces appropriately, this model is flexible enough for all our use-cases.
TODO: map out namespaces and scopes for each component
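As a starting point for that mapping, a minimal sketch of the per-namespace Read/Write idea. These types are assumptions for discussion, not the actual iroh-docs API; in particular, whether Write implies Read is an assumption made explicit below.

```rust
// Illustrative model only: NOT the iroh-docs API, just the shape of
// per-namespace capabilities for mapping components to scopes.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Capability {
    Read,
    Write,
}

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Namespace(String);

/// A grant ties one capability to one namespace, e.g. a deploy component
/// gets Write on "configs" but only Read on "audit".
struct Grant {
    namespace: Namespace,
    capability: Capability,
}

fn allows(grants: &[Grant], ns: &Namespace, needed: Capability) -> bool {
    grants.iter().any(|g| {
        &g.namespace == ns
            && (g.capability == needed
                || (g.capability == Capability::Write && needed == Capability::Read))
    })
}

fn main() {
    let grants = vec![
        Grant { namespace: Namespace("configs".into()), capability: Capability::Write },
        Grant { namespace: Namespace("audit".into()), capability: Capability::Read },
    ];
    // Assumption in this sketch: Write implies Read within a namespace.
    assert!(allows(&grants, &Namespace("configs".into()), Capability::Read));
    assert!(!allows(&grants, &Namespace("audit".into()), Capability::Write));
    println!("capability checks passed");
}
```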
A system with asynchronous processing entails consistency trade-offs. The aim of this section is to surface and document these alongside their influence on current architectural decisions.
All admin operations, such as submitting an initial or updated configuration for a device, or changing the administrator constellation, must be logged in a tamper-proof audit trail.
The audit trail requires recording all administrator operations in an append-only fashion across all coordinators. This forms an effective transaction boundary between the admin operation and the accompanying audit trail record.
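A sketch of that transaction boundary, under the assumption that the admin operation and its audit record are committed as one unit. All names here are hypothetical; in a real coordinator both writes would go through a single storage transaction.

```rust
// Sketch (not the actual implementation): an admin operation and its
// append-only audit record succeed or fail together.

#[derive(Debug, Clone)]
struct AuditRecord {
    seq: u64,
    actor: String,
    action: String,
}

#[derive(Default)]
struct Coordinator {
    config_version: u64,
    audit_log: Vec<AuditRecord>, // append-only: records are never mutated or removed
}

impl Coordinator {
    /// Apply a config update and append its audit record as one unit,
    /// returning the new config version.
    fn submit_config(&mut self, actor: &str) -> Result<u64, String> {
        let seq = self.audit_log.len() as u64;
        self.config_version += 1;
        self.audit_log.push(AuditRecord {
            seq,
            actor: actor.to_string(),
            action: format!("config updated to v{}", self.config_version),
        });
        Ok(self.config_version)
    }
}

fn main() {
    let mut c = Coordinator::default();
    let v = c.submit_config("admin-1").unwrap();
    // Exactly one audit record exists per applied operation.
    assert_eq!(v, 1);
    assert_eq!(c.audit_log.len(), 1);
    println!("audit trail sketch passed");
}
```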
Avoiding a single point of failure in the coordinator functionality of the fleet is desired in the future.
The design can be guided by the PACELC design principle, which clearly delineates the trade-offs under normal and partitioned network conditions. We can distinguish different requirements for different types of data in the system.
The following list discusses various approaches and their trade-offs:
- With the number of coordinators ≥ 3, it's feasible to use a distributed storage engine that relies on a consensus algorithm such as Raft to provide strong consistency. In this scenario, at least 2 coordinators need to be online for write operations.
- An eventual consistency model would allow even offline write operations, at the cost of intermittently inconsistent state. Whether such a model fits the architecture requirements remains to be evaluated.
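The consensus option above reduces to simple quorum arithmetic, which this small sketch makes concrete (majority quorum as used by Raft):

```rust
/// Raft-style majority quorum: writes need floor(n/2) + 1 live coordinators.
fn quorum(coordinators: usize) -> usize {
    coordinators / 2 + 1
}

/// Writes are available while a majority of coordinators is online.
fn writes_available(coordinators: usize, online: usize) -> bool {
    online >= quorum(coordinators)
}

fn main() {
    // With 3 coordinators, 2 must be online for writes, as noted above.
    assert_eq!(quorum(3), 2);
    assert!(writes_available(3, 2));
    assert!(!writes_available(3, 1));
    // With 5 coordinators, the fleet tolerates 2 being offline.
    assert_eq!(quorum(5), 3);
    println!("quorum checks passed");
}
```

This also shows why growing from 3 to 4 coordinators gains no extra fault tolerance: the quorum rises to 3 as well.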
Each fleet relies on persisted data for its operation, which is the Coordinator's responsibility.
- Agents
  - Id
  - Type
  - Status
  - History of facts
  - Artifact graph
  - Reference to current position in the artifact graph
  - Metadata
    - Name?
    - Owner?
- Enrolling Agents
  - Type
  - History of submitted facts
- Enrolled Agents
  - Type
  - History of submitted facts
  - Credentials map
- Audit logs
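The list above could translate into a data model along these lines. All names and field types are assumptions for discussion, not a fixed schema; the graph and fact types in particular are placeholders.

```rust
// Hypothetical Rust model of the Coordinator's persisted state.

use std::collections::HashMap;

struct AgentRecord {
    id: String,
    r#type: String,
    status: String,
    facts: Vec<String>,              // history of facts
    artifact_graph: Vec<String>,     // placeholder for the real graph type
    current_artifact: Option<usize>, // reference into the artifact graph
    name: Option<String>,            // metadata; marked "?" in the notes
    owner: Option<String>,           // metadata; marked "?" in the notes
}

struct EnrollingAgent {
    r#type: String,
    submitted_facts: Vec<String>,
}

struct EnrolledAgent {
    r#type: String,
    submitted_facts: Vec<String>,
    credentials: HashMap<String, String>, // credentials map
}

struct FleetState {
    agents: Vec<AgentRecord>,
    enrolling: Vec<EnrollingAgent>,
    enrolled: Vec<EnrolledAgent>,
    audit_logs: Vec<String>,
}

fn main() {
    let state = FleetState {
        agents: vec![],
        enrolling: vec![],
        enrolled: vec![],
        audit_logs: vec![],
    };
    assert!(state.agents.is_empty());
    println!("fleet state model compiles");
}
```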
At first, all artifacts are actually Nix build outputs. It's worth considering making all artifacts content-addressable, so that being a Nix build output becomes an irrelevant detail.
The artifact storage should allow attaching metadata to an artifact.
Question: is there any use-case that's not covered by a content-addressed store?
Could the content-addressed store be embedded in the coordinator process or should it run separately?
We certainly want to store Nix build outputs somewhere and need to make them retrievable by the Agent.
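A minimal sketch of the content-addressed idea: the key is derived purely from the bytes, so any producer (Nix or otherwise) stores artifacts the same way, and metadata hangs off the address. `DefaultHasher` stands in for a real cryptographic hash such as BLAKE3; the `CaStore` type is hypothetical.

```rust
// Sketch of a content-addressed store with attached metadata.
// DefaultHasher is used only so the example is dependency-free;
// a real store would use a cryptographic hash (e.g. BLAKE3).

use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

struct CaStore {
    blobs: HashMap<u64, Vec<u8>>,
    metadata: HashMap<u64, String>, // metadata attached per artifact
}

impl CaStore {
    fn new() -> Self {
        CaStore { blobs: HashMap::new(), metadata: HashMap::new() }
    }

    /// Insert bytes; the returned address is a pure function of the content.
    fn put(&mut self, bytes: &[u8], meta: &str) -> u64 {
        let mut h = DefaultHasher::new();
        bytes.hash(&mut h);
        let addr = h.finish();
        self.blobs.insert(addr, bytes.to_vec());
        self.metadata.insert(addr, meta.to_string());
        addr
    }

    fn get(&self, addr: u64) -> Option<&[u8]> {
        self.blobs.get(&addr).map(|v| v.as_slice())
    }
}

fn main() {
    let mut store = CaStore::new();
    let a1 = store.put(b"nix build output", "agent-config v1");
    let a2 = store.put(b"nix build output", "agent-config v1");
    // Identical content yields an identical address: storage deduplicates.
    assert_eq!(a1, a2);
    assert_eq!(store.get(a1), Some(&b"nix build output"[..]));
    println!("ca-store checks passed");
}
```

Deduplication falls out for free, which suggests most use-cases are covered; the open question is artifacts whose identity is not purely their content (e.g. mutable "latest" pointers), which would need a small mutable index on top.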
As it's a requirement that the Agent doesn't have to evaluate its configuration, it's feasible to drop Nix as a runtime dependency on the Agent altogether.
This frees up the choice of protocols for transferring the artifacts; e.g. there's no need to stay within the limitations of the Nix Binary Cache protocol.
- Currently write/insert only, no GC
- Could the CAStore be embedded in the Rust application and use iroh for transport?
- What is the consensus mechanism among CAStore nodes, if any?
- Snix Store Copy command impl: https://git.snix.dev/snix/snix/src/commit/6b08b3382f68417111a15721be2c79e75b0d0c23/snix/store/src/bin/snix-store.rs#L359
- nar-bridge (https://git.snix.dev/snix/snix/src/branch/canon/snix/nar-bridge) redirects from the Nix HTTP Cache protocol into the Snix store; it could be reused to redirect into iroh-docs/iroh-blobs.