A decentralized GPU-sharing system for machine learning workloads that enables peer-to-peer resource sharing, secure execution, and incentive mechanisms for GPU providers.
This project creates a network of nodes that can share GPU resources for machine learning tasks. Users can submit ML tasks to the network, and GPU providers earn rewards for executing these tasks. The system uses a decentralized architecture to ensure fairness, security, and efficiency.
- Rust (2021 edition) - Core programming language
- libp2p (v0.55.0) - Peer-to-peer networking framework
- TCP transport
- Noise protocol for encryption
- Kademlia DHT for peer discovery
- YAMUX for stream multiplexing
- Tokio (v1.44.2) - Asynchronous runtime
- Serde (v1.0.219) - Serialization/deserialization
- TOML (v0.8) - Configuration format
- PyO3 (v0.24.1, optional) - Python integration for ML workloads
- Bollard (v0.18.1, optional) - Docker API for containerized execution
- Wasmtime (v31.0.0, optional) - WebAssembly sandbox for secure execution
- Tracing (v0.1) - Logging and diagnostics
- P2P Resource Discovery - Find available GPU resources across the network
- Smart Task Scheduling - Match ML tasks to optimal GPU providers
- Economic Incentives - Reward system for GPU providers based on contribution
- Secure Execution - Sandboxed environments for ML tasks
- Distributed Ledger - Track resource contributions and payments
- Multiple Execution Backends:
- Native (Python)
- Containerized (Docker)
- WebAssembly (for lightweight tasks)
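The three execution backends correspond to the optional Cargo features used in the build commands below (`ml-python`, `ml-docker`, `sandbox-wasm`). As a rough sketch of how these might be wired up in `Cargo.toml` — using the dependency names and versions from the technology list above; the actual wiring in the repository may differ:

```toml
# Cargo.toml (sketch) - optional dependencies gated behind feature flags.
# Feature names match the build commands in this README; the exact
# layout in the repository may differ.
[dependencies]
pyo3 = { version = "0.24.1", optional = true }      # Python integration
bollard = { version = "0.18.1", optional = true }   # Docker API
wasmtime = { version = "31.0.0", optional = true }  # WebAssembly sandbox

[features]
ml-python = ["dep:pyo3"]        # Native Python execution
ml-docker = ["dep:bollard"]     # Containerized execution
sandbox-wasm = ["dep:wasmtime"] # Lightweight WebAssembly tasks
```

Gating each backend behind a feature keeps the default build small: a node that only dispatches tasks never has to compile PyO3, Bollard, or Wasmtime.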
decentralized-gpu-ml/
├── config/ # Configuration files
├── src/
│ ├── bin/ # Binary executables
│ │ ├── node.rs # Node implementation
│ │ └── cli.rs # Command-line interface
│ ├── network/ # P2P networking
│ │ ├── p2p.rs # libp2p implementation
│ │ ├── discovery.rs # Peer discovery
│ │ ├── ledger.rs # Network ledger
│ │ └── rewards.rs # Network rewards
│ ├── scheduler/ # Task scheduling
│ │ ├── dispatcher.rs # Task dispatcher
│ │ └── task.rs # Task management
│ ├── ml/ # Machine learning
│ │ ├── executor.rs # ML task execution
│ │ ├── ledger.rs # ML ledger
│ │ └── rewards.rs # ML rewards
│ ├── incentives/ # Economic incentives
│ │ ├── node.rs # Node incentives
│ │ ├── task.rs # Task incentives
│ │ └── transaction.rs # Transactions
│ ├── models/ # Data models
│ └── utils/ # Utilities
│ ├── config.rs # Configuration
│ └── logging.rs # Logging
└── examples/ # Example implementations
- Rust (latest stable)
- GPU with CUDA support (for GPU providers)
- Docker (optional, for containerized execution)
Clone the repository:
git clone https://github.com/onchana01/decentralized-gpu-ml.git
cd decentralized-gpu-ml
Build the project:
cargo build --release
Enable optional features:
# For Python ML integration
cargo build --release --features ml-python
# For Docker container support
cargo build --release --features ml-docker
# For WebAssembly sandbox
cargo build --release --features sandbox-wasm
To start a node with the default configuration:
cargo run --bin node
Or with a custom configuration:
cargo run --bin node -- --config /path/to/config.toml
The CLI allows interaction with the network:
# Submit a new ML task
cargo run --bin cli -- submit --model resnet50 --data /path/to/data
# Query available GPU resources
cargo run --bin cli -- list-resources
# Check task status
cargo run --bin cli -- status --task-id <task_id>
The system uses TOML files for configuration:
- config/node.toml - Node configuration
- config/network.toml - Network configuration
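The configuration schema is defined in `src/utils/config.rs`; the keys below are illustrative assumptions, not the repository's actual field names. A hypothetical `config/node.toml` might look like:

```toml
# config/node.toml (illustrative sketch; the real keys are defined
# in src/utils/config.rs and may differ)
[node]
listen_addr = "/ip4/0.0.0.0/tcp/4001"  # libp2p TCP multiaddr to listen on
gpu_provider = true                     # advertise this node's GPU to the network

[network]
bootstrap_peers = []                    # Kademlia bootstrap multiaddrs
```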
This project is licensed under the MIT License - see the LICENSE file for details.
- Author: Abner [email protected]
- Repository: https://github.com/onchana01/decentralized-gpu-ml

To run the node with an optional feature enabled:

# For Python ML integration
cargo run --bin node --features ml-python
# For Docker container support
cargo run --bin node --features ml-docker
# For WebAssembly sandbox
cargo run --bin node --features sandbox-wasm