Birda is a Rust CLI tool for analyzing audio files using BirdNET and Google Perch AI models. It uses the birdnet-onnx crate as its inference library.
- Language: Rust 1.92, Edition 2024
- Inference: birdnet-onnx (local crate at ../rust-birdnet-onnx)
- Audio Decoding: symphonia
- Resampling: rubato
- CLI: clap with derive
- Config: toml + serde
- Async: tokio
- Logging: tracing
All code MUST pass these checks before commit:
cargo fmt --check
cargo clippy -- -D warnings
cargo test
The following lints are enforced in Cargo.toml:
[lints.rust]
unsafe_code = "deny"
missing_docs = "warn"
[lints.clippy]
correctness = { level = "deny", priority = -1 }
pedantic = { level = "warn", priority = -1 }
nursery = { level = "warn", priority = -1 }
cargo = { level = "warn", priority = -1 }
# Restriction lints
unwrap_used = "warn"
expect_used = "warn"
panic = "warn"
todo = "warn"
unimplemented = "warn"
dbg_macro = "warn"
# Allow these where pedantic is too strict
module_name_repetitions = "allow"
similar_names = "allow"
too_many_lines = "allow"
must_use_candidate = "allow"
missing_errors_doc = "allow"
missing_panics_doc = "allow"
WRONG:
if sample_rate == 48000 {
    chunk_size = 144000;
}
RIGHT:
const SAMPLE_RATE_V24: u32 = 48_000;
const CHUNK_SAMPLES_V24: usize = 144_000;
if sample_rate == SAMPLE_RATE_V24 {
    chunk_size = CHUNK_SAMPLES_V24;
}
All constants MUST be defined in a dedicated constants.rs module or as associated constants on relevant types.
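The associated-constant variant of the rule above could look roughly like the following sketch. The type name ModelV24, the CHUNK_SECONDS constant, and the chunk_size_for helper are hypothetical names introduced for illustration. The 3-second figure is only an inference from 144_000 / 48_000 and is not stated elsewhere in this document.

```rust
/// Hypothetical marker type grouping the V24 input parameters.
pub struct ModelV24;

impl ModelV24 {
    /// Expected sample rate in Hz (value from the example above).
    pub const SAMPLE_RATE: u32 = 48_000;
    /// Assumed chunk duration in seconds (inferred from 144_000 / 48_000).
    pub const CHUNK_SECONDS: usize = 3;
    /// Samples per chunk, derived so it cannot drift from the sample rate.
    pub const CHUNK_SAMPLES: usize = Self::SAMPLE_RATE as usize * Self::CHUNK_SECONDS;
}

/// Returns the chunk size for a supported sample rate, or `None` otherwise.
pub fn chunk_size_for(sample_rate: u32) -> Option<usize> {
    (sample_rate == ModelV24::SAMPLE_RATE).then_some(ModelV24::CHUNK_SAMPLES)
}
```

Deriving CHUNK_SAMPLES from the other two constants keeps the relationship between them explicit, which a bare 144_000 literal would hide.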
All external inputs MUST be validated:
- CLI arguments: Use clap's built-in validation (value_parser, range)
- Config files: Validate after parsing, return descriptive errors
- Audio files: Validate format, sample rate, channel count before processing
- File paths: Check existence, permissions, validate against path traversal
Example:
fn validate_confidence(value: f32) -> Result<f32, ValidationError> {
    if !(0.0..=1.0).contains(&value) {
        return Err(ValidationError::ConfidenceOutOfRange { value });
    }
    Ok(value)
}
NEVER use:
- .unwrap() - use .ok_or() or the ? operator
- .expect() - use proper error types
- panic!() - return Result instead
- todo!() / unimplemented!() - implement or remove
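As an illustration of the replacements listed above, here is a minimal std-only sketch that turns an Option and a fallible parse into errors instead of unwrapping. ParseError and parse_setting are hypothetical names, not part of the codebase, and the error type is written by hand only to keep the sketch dependency-free.

```rust
use std::fmt;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum ParseError {
    MissingSeparator,
    InvalidValue(String),
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::MissingSeparator => write!(f, "expected 'key=value'"),
            Self::InvalidValue(raw) => write!(f, "'{raw}' is not a valid number"),
        }
    }
}

impl std::error::Error for ParseError {}

/// Parses a `key=value` setting whose value is a float.
pub fn parse_setting(input: &str) -> Result<(&str, f32), ParseError> {
    // Instead of `.unwrap()` on the Option: convert None into an error.
    let (key, raw) = input
        .split_once('=')
        .ok_or(ParseError::MissingSeparator)?;
    // Instead of `.expect()`: map the parse failure into a typed variant.
    let value = raw
        .trim()
        .parse::<f32>()
        .map_err(|_| ParseError::InvalidValue(raw.trim().to_string()))?;
    Ok((key.trim(), value))
}
```

Calling parse_setting("confidence=0.5") yields the key and value, while an input without '=' returns ParseError::MissingSeparator rather than panicking.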
ALWAYS:
- Use thiserror for error type definitions
- Provide context with error variants
- Chain errors with .map_err() or ?
- Use meaningful error messages
Example:
use std::path::PathBuf;

#[derive(Debug, thiserror::Error)]
pub enum AudioError {
    #[error("unsupported sample rate {rate} Hz, expected {expected} Hz")]
    UnsupportedSampleRate { rate: u32, expected: u32 },
    #[error("failed to open audio file '{path}'")]
    OpenFailed {
        path: PathBuf,
        #[source]
        source: std::io::Error,
    },
}
- Canonicalize paths before use
- Validate paths don't escape intended directories
- Use Path::join(), not string concatenation
- Check file permissions before operations
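The path rules above can be sketched as follows, assuming the target already exists on disk (std's canonicalize fails otherwise). The function name resolve_within is hypothetical, and returning PermissionDenied for an escaping path is a choice made for this sketch, not project policy.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Resolves `relative` under `base` and rejects results outside `base`.
pub fn resolve_within(base: &Path, relative: &Path) -> io::Result<PathBuf> {
    // Canonicalize both sides so symlinks and `..` components are resolved.
    let base = base.canonicalize()?;
    // Path::join handles separators; an absolute `relative` replaces `base`,
    // which the containment check below then rejects.
    let candidate = base.join(relative).canonicalize()?;
    if candidate.starts_with(&base) {
        Ok(candidate)
    } else {
        Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            format!(
                "path '{}' escapes '{}'",
                candidate.display(),
                base.display()
            ),
        ))
    }
}
```

Path::starts_with compares whole components, so a sibling directory such as /data-other is not mistaken for a child of /data.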
- Use atomic file creation (O_CREAT | O_EXCL)
- Always clean up locks on exit (use RAII guards)
- Store PID/hostname for debugging stale locks
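A minimal sketch of the lock-file rules above, using only the standard library. LockGuard is a hypothetical type. OpenOptions::create_new(true) is std's atomic create-or-fail option, which corresponds to O_CREAT | O_EXCL on Unix. The hostname is left out because std has no portable API for it.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Hypothetical RAII guard: the lock file is removed when the guard drops.
pub struct LockGuard {
    path: PathBuf,
}

impl LockGuard {
    /// Creates the lock file atomically; fails if it already exists.
    pub fn acquire(path: &Path) -> io::Result<Self> {
        let mut file = OpenOptions::new()
            .write(true)
            .create_new(true) // errors with AlreadyExists if the lock is held
            .open(path)?;
        // Build the guard first so a failed write still cleans up the file.
        let guard = Self {
            path: path.to_path_buf(),
        };
        // Record the PID to help debug stale locks.
        writeln!(file, "pid={}", std::process::id())?;
        Ok(guard)
    }
}

impl Drop for LockGuard {
    fn drop(&mut self) {
        // Best-effort cleanup; errors cannot be propagated from drop.
        let _ = fs::remove_file(&self.path);
    }
}
```

Drop does not run after std::process::exit or a fatal signal, which is one reason the stored PID matters for detecting stale locks.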
- Validate audio file headers before processing
- Limit resource consumption (max file size, batch size)
- Handle malformed config files gracefully
- Prefer streaming/chunked processing over loading entire files
- Reuse buffers where possible (Vec::clear() + extend)
- Use Box<[T]> for fixed-size allocations
- Avoid unnecessary clones
- Use bounded channels to prevent memory exhaustion
- Keep inference thread hot (producer stays ahead)
- Batch GPU operations appropriately
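The bounded-channel and buffer-reuse points above can be sketched with std's sync_channel. This sketch uses plain threads rather than the project's tokio runtime, and process_chunked is a hypothetical function that only counts samples where real code would run inference.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Streams `samples` in chunks through a bounded channel.
/// Returns (number of chunks, total samples consumed).
pub fn process_chunked(samples: Vec<f32>, chunk_len: usize, capacity: usize) -> (usize, usize) {
    // At most `capacity` chunks are queued; `send` blocks when the queue is
    // full, so a fast producer cannot exhaust memory.
    let (tx, rx) = sync_channel::<Vec<f32>>(capacity);

    let producer = thread::spawn(move || {
        // `max(1)` avoids the panic that `chunks(0)` would cause.
        for chunk in samples.chunks(chunk_len.max(1)) {
            if tx.send(chunk.to_vec()).is_err() {
                break; // consumer hung up
            }
        }
        // `tx` is dropped here, which ends the consumer loop below.
    });

    // One buffer reused for every chunk instead of allocating per iteration.
    let mut buffer: Vec<f32> = Vec::with_capacity(chunk_len);
    let mut chunks = 0;
    let mut total = 0;
    for chunk in rx {
        buffer.clear();
        buffer.extend_from_slice(&chunk);
        chunks += 1;
        total += buffer.len();
    }

    // A panicked producer is ignored here for brevity.
    let _ = producer.join();
    (chunks, total)
}
```

The producer still allocates one Vec per chunk; returning consumed buffers to it over a second channel would remove that allocation at the cost of more plumbing.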
Before optimizing, measure with:
cargo build --release
perf record ./target/release/birda ...
perf report
- One concept per module
- Clear public API (pub only what's needed)
- Document all public items
- Keep modules under 500 lines
- Unit tests in same file (#[cfg(test)] mod tests)
- Integration tests in tests/ directory
- Test error paths, not just happy paths
- Use property-based testing for parsers
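As a sketch of the inline-test layout described above, the following reuses the validate_confidence example from earlier in this document. The ValidationError definition is an assumption, reduced to the single variant that example needs. Property-based testing is omitted because it requires an external crate.

```rust
/// Assumed minimal definition; the real type may carry more variants.
#[derive(Debug, PartialEq)]
pub enum ValidationError {
    ConfidenceOutOfRange { value: f32 },
}

/// Accepts confidence values in the closed range 0.0..=1.0.
pub fn validate_confidence(value: f32) -> Result<f32, ValidationError> {
    if !(0.0..=1.0).contains(&value) {
        return Err(ValidationError::ConfidenceOutOfRange { value });
    }
    Ok(value)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn accepts_both_bounds() {
        assert_eq!(validate_confidence(0.0), Ok(0.0));
        assert_eq!(validate_confidence(1.0), Ok(1.0));
    }

    #[test]
    fn rejects_out_of_range_values() {
        assert_eq!(
            validate_confidence(1.5),
            Err(ValidationError::ConfidenceOutOfRange { value: 1.5 })
        );
        assert!(validate_confidence(-0.1).is_err());
    }

    #[test]
    fn rejects_nan() {
        // NaN is outside every range, so `contains` returns false.
        assert!(validate_confidence(f32::NAN).is_err());
    }
}
```

The tests compare whole Result values rather than calling unwrap, which keeps them consistent with the unwrap_used lint.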
- All public items have doc comments
- Include examples in doc comments
- Document panics (if any) and errors
- Keep README.md updated
LEANN is a local, privacy-focused vector database and RAG system optimized for low storage. It uses AST-aware chunking to maintain semantic code boundaries, making it highly effective for finding relevant logic and gathering context in large or unfamiliar codebases without keyword matching.
- Index Name: birda
- Rebuild Index: fish -c "leann build birda --docs src tests docs scripts installer README.md Cargo.toml Taskfile.yml action.yml about.toml .github/workflows --use-ast-chunking --force"
- Search: fish -c "leann search birda '<query>'" - Fast file/module location (instant)
- Ask: fish -c "leann ask birda '<question>'" - Comprehensive answers with code context (15-37s)
Prefer LEANN for:
- Semantic/exploratory searches: "How does audio resampling work?"
- Architecture questions: "What CLI commands are available?"
- Pattern discovery: "What error handling patterns does the codebase use?"
- Context gathering before implementation: "What output formats are supported?"
- Finding code without knowing exact file names or keywords
Use direct tools (Grep/Glob/Read) for:
- Exact file path reads when you know the location
- Specific symbol searches when you know the name (class definitions, function names)
- Single file content searches
- Quick syntax checks
Good queries:
- "How does the audio resampling work? What library is used?"
- "What are the main CLI commands and subcommands?"
- "What output formats are supported and how are they implemented?"
- "Where is configuration loaded from TOML files?"
- "What error handling patterns are used?"
Less effective:
- Very specific line-level questions (use Read tool instead)
- Queries about code you've already read in the current session
- File existence checks (use Glob instead)
The GUI frontend for Birda is located at ../birda-gui.
- When changing CLI output formats (JSON/CSV), ensure compatibility with the GUI.
- Cross-check birda-gui/src/api/types.ts when modifying Rust output structures in src/output/types.rs.
- LEANN also has an index for birda-gui to aid in cross-project navigation.
- Run cargo fmt && cargo clippy && cargo test before commit
- One logical change per commit
- Conventional commit messages: feat:, fix:, refactor:, test:, docs:
- Reference issue numbers if applicable
- Use snake_case for all Rust files
- Test files: tests/integration/<module>_test.rs or inline #[cfg(test)]
- Constants: in dedicated constants.rs or associated with types