Contributing to RLM-RS

Thank you for your interest in contributing to rlm-rs! This document provides guidelines and instructions for contributing.

Code of Conduct

Please be respectful and constructive in all interactions. We welcome contributors of all backgrounds and experience levels.

Getting Started

Prerequisites

Rust 1.88+ (2024 edition)
cargo-deny for supply chain security checks

# Install Rust (if needed)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Install cargo-deny
cargo install cargo-deny

Setting Up the Development Environment

# Clone the repository
git clone https://github.com/zircote/rlm-rs.git
cd rlm-rs

# Build the project
cargo build

# Run tests
cargo test

# Run the full CI check
make ci

Development Workflow

Branch Strategy

main - Stable release branch
Feature branches - feature/<description>
Bug fixes - fix/<description>

Making Changes

Fork the repository and create a branch from main
Write your code following the style guidelines below
Add tests for any new functionality
Run the full CI check before submitting:

make ci

This runs:

cargo fmt -- --check (formatting)
cargo clippy --all-targets --all-features (linting)
cargo test (tests)
cargo doc --no-deps (documentation)
cargo deny check (supply chain)

Commit Messages

Use clear, descriptive commit messages:

<type>: <short description>

<optional longer description>

Types:

feat: New feature
fix: Bug fix
docs: Documentation changes
refactor: Code refactoring
test: Adding or updating tests
chore: Maintenance tasks

Examples:

feat: add parallel chunking strategy

fix: handle UTF-8 boundary in fixed chunker

docs: update API reference for Storage trait

Code Style Guidelines

General Rules

Line length: 100 characters maximum
Edition: Rust 2024
Unsafe code: Forbidden unless explicitly justified with comments
Panics: Not allowed in library code (unwrap, expect, panic!)

Formatting

Code must pass cargo fmt:

# Check formatting
cargo fmt -- --check

# Auto-format
cargo fmt

Linting

Code must pass strict clippy lints:

cargo clippy --all-targets --all-features -- -D warnings

The project uses pedantic and nursery lints. Key rules enforced:

Lint	Rule
`unwrap_used`	Denied - use `Result` instead
`expect_used`	Denied - use `Result` instead
`panic`	Denied - handle errors gracefully
`todo`	Denied - complete implementation
`dbg_macro`	Denied - remove debug macros
`print_stdout`	Denied - use proper logging

Error Handling

Always use Result types for fallible operations:

// Good
pub fn parse(input: &str) -> Result<Value, ParseError> {
    if input.is_empty() {
        return Err(ParseError::EmptyInput);
    }
    Ok(value)
}

// Bad - panics
pub fn parse(input: &str) -> Value {
    input.parse().unwrap() // Never do this
}

Use thiserror for custom error types:

use thiserror::Error;

#[derive(Error, Debug)]
pub enum MyError {
    #[error("invalid input: {0}")]
    InvalidInput(String),

    #[error("operation failed")]
    OperationFailed {
        #[source]
        source: std::io::Error,
    },
}

Documentation

All public items must have documentation:

/// Processes the input data according to the configuration.
///
/// # Arguments
///
/// * `input` - The data to process.
/// * `config` - Processing configuration.
///
/// # Returns
///
/// The processed result.
///
/// # Errors
///
/// Returns [`Error::InvalidInput`] if the input is malformed.
///
/// # Examples
///
/// ```rust
/// use rlm_rs::{process, Config};
///
/// let result = process("data", &Config::default())?;
/// assert!(!result.is_empty());
/// # Ok::<(), rlm_rs::Error>(())
/// ```
pub fn process(input: &str, config: &Config) -> Result<Output, Error> {
    // implementation
}

Ownership and Borrowing

Prefer borrowing over ownership when possible:

// Good - borrows
pub fn process(data: &[u8]) -> Vec<u8> { ... }

// Avoid - takes ownership unnecessarily
pub fn process(data: Vec<u8>) -> Vec<u8> { ... }

Const Functions

Use const fn where possible:

#[must_use]
pub const fn new() -> Self {
    Self {
        size: DEFAULT_SIZE,
        overlap: DEFAULT_OVERLAP,
    }
}

Testing

Test Organization

Unit tests: Inside src/*.rs with #[cfg(test)] modules
Integration tests: tests/ directory
Doc tests: Examples in documentation

Writing Tests

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_success_case() {
        let result = function_under_test(valid_input);
        assert_eq!(result, expected_output);
    }

    #[test]
    fn test_error_case() {
        let result = function_under_test(invalid_input);
        assert!(matches!(result, Err(Error::InvalidInput(_))));
    }
}

Property-Based Testing

For complex invariants, use proptest:

use proptest::prelude::*;

proptest! {
    #[test]
    fn chunk_boundaries_valid(content in ".{1,1000}") {
        let chunker = FixedChunker::with_size(100);
        let chunks = chunker.chunk(1, &content, None).unwrap();
        for chunk in chunks {
            prop_assert!(chunk.byte_range.end <= content.len());
        }
    }
}

Running Tests

# Run all tests
cargo test

# Run with output
cargo test -- --nocapture

# Run specific test
cargo test test_name

# Run tests for a specific module
cargo test chunking::

Adding New Features

Adding a New Chunking Strategy

Create src/chunking/my_strategy.rs:

use crate::chunking::traits::{Chunker, ChunkMetadata};
use crate::core::Chunk;
use crate::error::Result;

pub struct MyChunker {
    chunk_size: usize,
}

impl MyChunker {
    #[must_use]
    pub const fn new() -> Self {
        Self {
            chunk_size: super::DEFAULT_CHUNK_SIZE,
        }
    }
}

impl Chunker for MyChunker {
    fn chunk(
        &self,
        buffer_id: i64,
        text: &str,
        metadata: Option<&ChunkMetadata>,
    ) -> Result<Vec<Chunk>> {
        // Implementation
    }

    fn name(&self) -> &'static str {
        "my-strategy"
    }

    fn description(&self) -> &'static str {
        "Description of my chunking strategy"
    }
}

Export in src/chunking/mod.rs:

pub mod my_strategy;
pub use my_strategy::MyChunker;

Add to create_chunker factory function.
Add tests and documentation.

Adding a New CLI Command

Add variant to Commands enum in src/cli/parser.rs:

#[derive(Subcommand, Debug)]
pub enum Commands {
    // ...existing commands...

    /// Description of my command.
    MyCommand {
        /// Argument description.
        #[arg(short, long)]
        my_arg: String,
    },
}

Implement handler in src/cli/commands.rs:

Commands::MyCommand { my_arg } => {
    // Implementation
    Ok("Output".to_string())
}

Add tests and update CLI reference documentation.

Pull Request Process

Ensure all checks pass:
```
make ci
```
Update documentation if needed:
- README.md for user-facing changes
- docs/ for detailed documentation
- Code comments for internal changes
Create a pull request with:
- Clear title describing the change
- Description of what and why
- Link to related issues (if any)
Address review feedback promptly

PR Checklist

Code follows style guidelines
All tests pass (cargo test)
Clippy passes (cargo clippy)
Format is correct (cargo fmt)
Documentation updated (if needed)
No new warnings introduced

Reporting Issues

Bug Reports

Include:

Rust version (rustc --version)
OS and version
Steps to reproduce
Expected vs actual behavior
Relevant logs or error messages

Feature Requests

Include:

Use case description
Proposed solution (if any)
Alternatives considered

Project Structure

src/
├── lib.rs           # Library entry point
├── main.rs          # CLI entry point
├── error.rs         # Error types
├── core/            # Core domain types
│   ├── buffer.rs    # Buffer type
│   ├── chunk.rs     # Chunk type
│   └── context.rs   # Context/variables
├── chunking/        # Chunking strategies
│   ├── traits.rs    # Chunker trait
│   ├── fixed.rs     # Fixed chunker
│   ├── semantic.rs  # Semantic chunker
│   └── parallel.rs  # Parallel chunker
├── storage/         # Persistence
│   ├── traits.rs    # Storage trait
│   └── sqlite.rs    # SQLite backend
├── io/              # File I/O
│   ├── reader.rs    # File reading
│   └── unicode.rs   # Unicode utilities
└── cli/             # CLI layer
    ├── parser.rs    # Argument parsing
    ├── commands.rs  # Command handlers
    └── output.rs    # Output formatting

tests/
└── integration_test.rs

docs/
├── architecture.md  # Internal architecture
├── cli-reference.md # CLI documentation
└── api.md           # Library API reference

License

By contributing, you agree that your contributions will be licensed under the MIT License.

Questions?

Open an issue for questions
Check existing issues and documentation first

Thank you for contributing!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Contributing to RLM-RS

Code of Conduct

Getting Started

Prerequisites

Setting Up the Development Environment

Development Workflow

Branch Strategy

Making Changes

Commit Messages

Code Style Guidelines

General Rules

Formatting

Linting

Error Handling

Documentation

Ownership and Borrowing

Const Functions

Testing

Test Organization

Writing Tests

Property-Based Testing

Running Tests

Adding New Features

Adding a New Chunking Strategy

Adding a New CLI Command

Pull Request Process

PR Checklist

Reporting Issues

Bug Reports

Feature Requests

Project Structure

License

Questions?

FilesExpand file tree

CONTRIBUTING.md

Latest commit

History

CONTRIBUTING.md

File metadata and controls

Contributing to RLM-RS

Code of Conduct

Getting Started

Prerequisites

Setting Up the Development Environment

Development Workflow

Branch Strategy

Making Changes

Commit Messages

Code Style Guidelines

General Rules

Formatting

Linting

Error Handling

Documentation

Ownership and Borrowing

Const Functions

Testing

Test Organization

Writing Tests

Property-Based Testing

Running Tests

Adding New Features

Adding a New Chunking Strategy

Adding a New CLI Command

Pull Request Process

PR Checklist

Reporting Issues

Bug Reports

Feature Requests

Project Structure

License

Questions?