RCDS is a scalable string reconciliation protocol designed for distributed systems. This Go implementation provides efficient file synchronization using set reconciliation primitives.
The RCDS algorithm breaks files into content-dependent chunks (shingles) and uses set reconciliation to synchronize data between distributed nodes. Because only the chunks that differ need to be exchanged, it transfers far less data than copying whole files, especially for large files with small differences.
- 🚀 Scalable: Logarithmic complexity with respect to file size
- 🔒 Efficient: Only transfers differences, not entire files
- 🔄 Multiple Algorithms: Supports CPI, Interactive CPI, and IBLT set reconciliation
- 🌐 Distributed: Designed for distributed systems
- 📦 Go Modules: Native Go module support
- ☸️ Kubernetes Ready: CRD support for Kubernetes deployments
- Installation
- Quick Start
- Usage
- Architecture
- Algorithms
- API Documentation
- Kubernetes Deployment
- Contributing
- References
- License
- Go 1.21 or later
- Make (optional, for using Makefile commands)
git clone https://github.com/String-Reconciliation-Ditributed-System/RCDS_GO.git
cd RCDS_GO
make build

The binary will be available at bin/rcds.
go get github.com/String-Reconciliation-Ditributed-System/RCDS_GO

package main

import (
	"github.com/String-Reconciliation-Ditributed-System/RCDS_GO/pkg/lib/genSync"
	"github.com/String-Reconciliation-Ditributed-System/RCDS_GO/pkg/set"
)

func main() {
	// Create a new sync instance
	sync := // ... initialize your sync algorithm

	// Add elements to sync
	sync.AddElement("data1")
	sync.AddElement("data2")

	// Start server
	go sync.SyncServer("127.0.0.1", 8080)

	// Connect as client
	sync.SyncClient("127.0.0.1", 8080)
}

# Build the binary
make build
# Run tests
make test
# Run tests with coverage
make test-coverage
# Format code
make fmt
# Run linter
make lint
# Run all checks
make all

# Run all tests
go test ./pkg/...
# Run with verbose output
go test -v ./pkg/...
# Run with coverage
go test -coverprofile=coverage.out ./pkg/...
go tool cover -html=coverage.out

RCDS uses a layered architecture:
- Application Layer: Command-line interface and user-facing APIs
- Reconciliation Layer: RCDS, Full Sync, and IBLT implementations
- Core Libraries: GenSync interface, hash functions, dictionaries
- Utilities: Set operations, file utilities, type conversions
For detailed architecture documentation, see docs/ARCHITECTURE.md.
RCDS supports multiple set reconciliation algorithms:
RCDS: the main algorithm, which uses content-dependent chunking and hash shingling.
- Complexity: O(log n) with respect to file size
- Best for: Large files with small differences
- Use case: File synchronization in distributed systems
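To make content-dependent chunking concrete, here is a minimal sketch in Go. It is not the RCDS implementation: the `Chunk` and `isBoundary` functions, the 4-byte window, and the FNV-1a hash are all assumptions chosen for illustration. A boundary is declared wherever the hash of the last few bytes has its low bits equal to zero, so boundaries depend on content rather than on byte offsets.

```go
package main

import "fmt"

// window is the number of trailing bytes that decide whether a chunk ends here.
// (Illustrative value, not taken from RCDS_GO.)
const window = 4

// isBoundary hashes the last `window` bytes ending at index i with FNV-1a
// and reports a boundary when the bits selected by mask are all zero.
func isBoundary(data []byte, i int, mask uint32) bool {
	if i+1 < window {
		return false
	}
	h := uint32(2166136261) // FNV-1a 32-bit offset basis
	for _, b := range data[i+1-window : i+1] {
		h ^= uint32(b)
		h *= 16777619 // FNV-1a 32-bit prime
	}
	return h&mask == 0
}

// Chunk splits data at content-defined boundaries. A larger mask gives
// fewer boundaries and therefore larger chunks on average.
func Chunk(data []byte, mask uint32) [][]byte {
	var chunks [][]byte
	start := 0
	for i := range data {
		if isBoundary(data, i, mask) {
			chunks = append(chunks, data[start:i+1])
			start = i + 1
		}
	}
	if start < len(data) {
		chunks = append(chunks, data[start:]) // trailing chunk without a boundary
	}
	return chunks
}

func main() {
	data := []byte("the quick brown fox jumps over the lazy dog, again and again and again")
	for _, c := range Chunk(data, 0x07) {
		fmt.Printf("%q\n", c)
	}
}
```

Because each boundary depends only on nearby bytes, inserting data near the start of a file changes the chunk around the edit and leaves the later chunks intact. That locality is what keeps the set difference between two versions of a file small.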
IBLT (Invertible Bloom Lookup Table): a probabilistic data structure for set reconciliation.
- Complexity: O(d) where d is the number of differences
- Best for: Sets with small symmetric difference
- Use case: Network-efficient reconciliation
Full Sync: traditional full synchronization (baseline for comparison).
- Complexity: O(n)
- Best for: Small datasets or complete synchronization
- Use case: Initial sync or fallback method
GenSync is the core interface for all synchronization algorithms:
type GenSync interface {
	SetFreezeLocal(freezeLocal bool)
	AddElement(elem interface{}) error
	DeleteElement(elem interface{}) error
	SyncClient(ip string, port int) error
	SyncServer(ip string, port int) error
	GetLocalSet() *set.Set
	GetSetAdditions() *set.Set
	GetSentBytes() int
	GetReceivedBytes() int
	GetTotalBytes() int
}

For complete API documentation, run:
godoc -http=:6060

Then visit http://localhost:6060/pkg/github.com/String-Reconciliation-Ditributed-System/RCDS_GO/
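Because Go interfaces are satisfied implicitly, code that only needs the byte counters of the GenSync interface above can depend on a narrower interface. The sketch below is illustrative: `byteCounter`, `Summary`, and `fakeSync` are hypothetical names that are not part of RCDS_GO, and the fake assumes the total is simply sent plus received.

```go
package main

import "fmt"

// byteCounter is a hypothetical subset of GenSync. Any value that implements
// the full GenSync interface also satisfies it.
type byteCounter interface {
	GetSentBytes() int
	GetReceivedBytes() int
	GetTotalBytes() int
}

// Summary formats the transfer statistics of a finished sync.
func Summary(c byteCounter) string {
	return fmt.Sprintf("sent=%d received=%d total=%d",
		c.GetSentBytes(), c.GetReceivedBytes(), c.GetTotalBytes())
}

// fakeSync is an in-memory stand-in for tests. It assumes the total is the
// sum of sent and received bytes.
type fakeSync struct{ sent, received int }

func (f fakeSync) GetSentBytes() int     { return f.sent }
func (f fakeSync) GetReceivedBytes() int { return f.received }
func (f fakeSync) GetTotalBytes() int    { return f.sent + f.received }

func main() {
	fmt.Println(Summary(fakeSync{sent: 10, received: 32}))
}
```

Depending on the narrow interface keeps reporting code testable without opening a network connection.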
RCDS can be deployed on Kubernetes using Custom Resource Definitions (CRDs).
kubectl apply -f deploy/crds/
kubectl apply -f deploy/operator.yaml

See docs/DEPLOYMENT.md for detailed deployment instructions.
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
- Fork the repository
- Create a feature branch
- Make your changes
- Run tests and linters
- Submit a pull request
If you use this work, please cite the relevant papers:
[1] B. Song and A. Trachtenberg, "Scalable String Reconciliation by Recursive Content-Dependent Shingling", 57th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2019.
[2] Y. Minsky, A. Trachtenberg, and R. Zippel, "Set Reconciliation with Nearly Optimal Communication Complexity", IEEE Transactions on Information Theory, 49:9. http://ipsit.bu.edu/documents/ieee-it3-web.pdf
[3] Y. Minsky and A. Trachtenberg, "Scalable Set Reconciliation", 40th Annual Allerton Conference on Communication, Control, and Computing, 2002. http://ipsit.bu.edu/documents/BUTR2002-01.pdf
[4] M. T. Goodrich and M. Mitzenmacher, "Invertible Bloom Lookup Tables", 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2011. Also available on arXiv.
This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.
This implementation is based on the cpisync project. The original C++ implementation is available in a fork of cpisync.
For questions, issues, or contributions, please open an issue on GitHub.
Note: This is an active research project. APIs may change as the project evolves.