bdstorage is a speed-first, local file deduplication engine designed to maximize storage efficiency using tiered BLAKE3 hashing and Copy-on-Write (CoW) reflinks. We welcome contributions from the community—whether it's reporting bugs, proposing new features, or submitting pull requests.
This document provides guidelines and instructions for contributing to this project.
- Code of Conduct
- How Can I Contribute?
- Local Development Setup
- Development Workflow
- Coding Guidelines
- Architecture Overview
By participating in this project, you agree to abide by our Code of Conduct. We expect all contributors to maintain a respectful and inclusive environment for everyone.
If you find a bug, please create an issue on our GitHub repository. When reporting a bug, please include:
- Your operating system and version.
- The filesystem you are using (e.g., Btrfs, ext4, XFS).
- The exact command you ran.
- The expected behavior vs. the actual behavior.
- Any relevant logs, panic backtraces, or error messages.
Have an idea for a new feature or a way to optimize the hashing pipeline? We'd love to hear it!
Open an issue and use the "Enhancement" label if possible. Please provide a clear description of the feature, why it's needed, and how it aligns with bdstorage's speed-first philosophy.
- Fork the repository and create your branch from `main`.
- Name your branch descriptively (e.g., `feat/add-new-hasher` or `fix/vault-transfer-bug`).
- If you've added code that should be tested, add tests.
- Ensure the test suite passes.
- Format your code using `cargo fmt` and check for lints using `cargo clippy`.
- Issue that pull request!
To build and test bdstorage locally, you will need the standard Rust toolchain.
- Install Rust: If you haven't already, install Rust using rustup.
- Clone the repository:
git clone https://github.com/Rakshat28/bdstorage.git
cd bdstorage
- Build the project:
cargo build
- Run the project:
cargo run -- --help
Note: Since bdstorage uses Linux-specific APIs (like fiemap ioctls) for sparse file optimization, developing on a Linux environment is highly recommended.
Before submitting a Pull Request, please ensure your changes pass the standard Rust quality checks:
- Formatting: We follow standard Rust formatting rules.
cargo fmt --all
- Linting: Ensure there are no Clippy warnings.
cargo clippy --all-targets --all-features -- -D warnings
- Testing: Run the test suite to ensure no existing functionality is broken.
cargo test
- Error Handling: Use the `anyhow` crate for error propagation. Always add descriptive context to errors using `.with_context(|| "description")`.
- Performance: `bdstorage` is designed to be extremely fast. Be mindful of disk I/O, memory allocations, and expensive system calls. Avoid reading full file contents unless absolutely necessary (rely on the tiered sparse-hashing pipeline).
- Atomicity: Any filesystem operations (moving, renaming, creating vault entries) must be atomic. Do not leave partial files in the `.imprint` store.
- Safety: Minimize the use of `unsafe` code blocks. When interacting with C APIs (like `ioctl`), heavily document why the `unsafe` block is required and why it is safe in that context.
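The atomicity guideline is easiest to follow with a write-then-rename pattern. The sketch below is one way to do it, not the project's actual vault code: `write_atomically` and the `.tmp` suffix are hypothetical names chosen for illustration. It returns `std::io::Result` only to stay dependency-free. In the codebase itself, the error-handling guideline would call for `anyhow` with `.with_context(...)` instead.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Write `bytes` to `dest` so that readers never observe a partial file.
/// Hypothetical helper for illustration; not part of bdstorage's API.
pub fn write_atomically(dest: &Path, bytes: &[u8]) -> io::Result<()> {
    // Stage the data next to the destination so the final rename stays on
    // one filesystem (a rename is only atomic within a single filesystem).
    let file_name = dest.file_name().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "destination has no file name")
    })?;
    let mut tmp_name = file_name.to_os_string();
    tmp_name.push(".tmp"); // assumed suffix, chosen for this example
    let tmp = dest.with_file_name(tmp_name);

    let result = (|| -> io::Result<()> {
        let mut file = File::create(&tmp)?;
        file.write_all(bytes)?;
        // Flush file contents to disk before the rename publishes the entry.
        file.sync_all()?;
        fs::rename(&tmp, dest)
    })();

    if result.is_err() {
        // Best-effort cleanup so a failed write leaves no stray temp file.
        let _ = fs::remove_file(&tmp);
    }
    result
}
```

Because the data only ever appears under its final name through `rename`, an interruption mid-write leaves the destination either absent or holding its previous contents. At worst, a crash leaves behind a stray temporary file that a later run can remove.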
If you are new to the codebase, here is a quick primer on how things are structured in src/:
- `main.rs`: The CLI entry point, argument parsing via `clap`, and concurrent coordination.
- `scanner.rs`: Logic for walking directories and initially grouping files by byte size.
- `hasher.rs`: Implementation of the tiered hashing logic (sparse hashing vs. full BLAKE3 hashing).
- `dedupe.rs`: Core logic for reflinking, hard linking, and restoring files.
- `vault.rs`: Manages the local Content-Addressable Storage (CAS) hidden in `~/.imprint/store`.
- `state.rs`: The embedded `redb` database integration for tracking file metadata and refcounts.
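To give newcomers a feel for the first stage of the pipeline, here is a minimal sketch of grouping files by byte size before any hashing happens. The function name `group_by_size` and its signature are assumptions made for this example and may not match what `scanner.rs` actually exposes. Dropping sizes that occur only once is likewise an assumed optimization.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Group (path, size) pairs by size, keeping only sizes shared by two or
/// more files. Hypothetical helper; the real scanner may be shaped differently.
pub fn group_by_size(files: Vec<(PathBuf, u64)>) -> HashMap<u64, Vec<PathBuf>> {
    let mut groups: HashMap<u64, Vec<PathBuf>> = HashMap::new();
    for (path, size) in files {
        groups.entry(size).or_default().push(path);
    }
    // A file with a unique size cannot have a duplicate, so it can be
    // skipped before any hashing work is done.
    groups.retain(|_, paths| paths.len() > 1);
    groups
}
```

Only the surviving groups would need to go on to the hashing tiers, which is consistent with the guideline to avoid reading file contents unless necessary.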
Thank you for contributing to bdstorage! Your efforts help make this tool faster, safer, and better for everyone.