High-performance CLI tool (cs) for analyzing code statistics across 460+ programming languages. Counts files, lines, and bytes by language while respecting .gitignore and offering flexible output formats.
    cargo build --release   # Build optimized binary
    cargo test              # Run fixture-based integration tests
    cargo clippy            # Lint (all warnings are errors)
    cargo fmt               # Format code

The analysis pipeline follows these stages:
- File Discovery (`analyzer.rs`) - Parallel gitignore-aware tree walking via `ignore` crate
- I/O Strategy (`file_io.rs`) - Buffered for small files, mmap for files ≥256KB
- Encoding Detection (`encoding.rs`) - UTF-8/UTF-16 detection, binary file filtering
- Line Classification (`line_classifier.rs`) - Code/comment/blank/shebang categorization
- Statistics Aggregation (`stats.rs`) - Thread-local accumulation with merge-on-drop
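
The line classification stage can be pictured with a small sketch. This is not the code in `line_classifier.rs`: the `LineKind` enum, `classify_line`, and `count_kinds` are hypothetical names, and the sketch assumes a language with only a single, non-empty line-comment token (no block comments, no string-literal handling).

```rust
/// Hypothetical categories mirroring the four named in the pipeline.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LineKind {
	Code,
	Comment,
	Blank,
	Shebang,
}

/// Classify one line. `index` is the zero-based line number and
/// `line_comment` is the language's line-comment token (assumed non-empty,
/// e.g. "//" or "#").
pub fn classify_line(index: usize, line: &str, line_comment: &str) -> LineKind {
	let trimmed = line.trim();
	if trimmed.is_empty() {
		LineKind::Blank
	} else if index == 0 && line.starts_with("#!") && !line.starts_with("#![") {
		// A shebang only counts on the first line; `#![...]` is a Rust attribute.
		LineKind::Shebang
	} else if trimmed.starts_with(line_comment) {
		LineKind::Comment
	} else {
		LineKind::Code
	}
}

/// Count lines per category, returned as [code, comment, blank, shebang].
pub fn count_kinds(source: &str, line_comment: &str) -> [usize; 4] {
	let mut counts = [0usize; 4];
	for (index, line) in source.lines().enumerate() {
		let slot = match classify_line(index, line, line_comment) {
			LineKind::Code => 0,
			LineKind::Comment => 1,
			LineKind::Blank => 2,
			LineKind::Shebang => 3,
		};
		counts[slot] += 1;
	}
	counts
}
```

The shebang check runs before the comment check so that `#!/bin/sh` in a `#`-commented language is not counted as a comment.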
| Module | Purpose |
|---|---|
| `src/cli.rs` | CLI argument definitions (clap derive) |
| `src/config.rs` | TOML config loading and CLI merging |
| `src/analysis/` | Core analysis pipeline |
| `src/langs/` | Language detection and data |
| `src/display/` | Output formatters (JSON, CSV, HTML, Markdown, etc.) |

- Tabs for indentation (not spaces)
- 120 character line limit
- Group imports: std → external → crate
- Merge derives: combine multiple `#[derive()]` attributes

All clippy warnings are errors. The codebase enables:

    #![warn(clippy::all, clippy::cargo, clippy::nursery, clippy::pedantic, clippy::perf)]
    #![deny(warnings)]

- Use `anyhow::Result<T>` for error propagation
- Use `#[must_use]` on constructors returning important values
- Use `#[inline]` sparingly, only for hot paths
- Prefer thread-local processing with merge-on-drop over shared locks
- Document complex pipelines at module level
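
As an illustration of the merge-on-drop guideline, here is a minimal standard-library sketch. It is not the code in `stats.rs`: `Totals`, `LocalStats`, and `count_in_parallel` are hypothetical names, and the real pipeline may structure its workers differently. Each worker accumulates into its own struct without locking, and the shared total is locked once per worker, when the accumulator is dropped.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct Totals {
	pub files: u64,
	pub lines: u64,
}

/// Per-thread accumulator (hypothetical): updates touch only local state.
pub struct LocalStats {
	local: Totals,
	shared: Arc<Mutex<Totals>>,
}

impl LocalStats {
	#[must_use]
	pub fn new(shared: Arc<Mutex<Totals>>) -> Self {
		Self { local: Totals::default(), shared }
	}

	pub fn record_file(&mut self, lines: u64) {
		self.local.files += 1;
		self.local.lines += lines;
	}
}

impl Drop for LocalStats {
	fn drop(&mut self) {
		// One lock acquisition per worker instead of one per file.
		if let Ok(mut total) = self.shared.lock() {
			total.files += self.local.files;
			total.lines += self.local.lines;
		}
	}
}

/// Split per-file line counts across up to `workers` threads and sum them.
pub fn count_in_parallel(line_counts: &[u64], workers: usize) -> Totals {
	let shared = Arc::new(Mutex::new(Totals::default()));
	let chunk = line_counts.len().div_ceil(workers.max(1)).max(1);
	thread::scope(|scope| {
		for part in line_counts.chunks(chunk) {
			let shared = Arc::clone(&shared);
			scope.spawn(move || {
				let mut stats = LocalStats::new(shared);
				for &lines in part {
					stats.record_file(lines);
				}
				// `stats` is dropped here, merging into the shared total.
			});
		}
	});
	let total = *shared.lock().expect("a worker panicked while merging");
	total
}
```

The trade-off is that totals are only visible after a worker finishes, which suits a batch tool that reports once at the end.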
Languages are defined in languages.json5 and compiled to src/langs/data.rs at build time.
To add a language:
- Add entry to `languages.json5` (maintain alphabetical order)
- Include patterns, optional keywords for disambiguation
- Add test fixture in `tests/fixtures/<language>/`
- Run `cargo test` to validate

Tests validate the binary output against fixture files with embedded expected counts:
tests/fixtures/
├── rust/
├── python/
├── bash/
└── ... (67+ languages)
Each fixture contains special comments with expected line counts that the test harness parses and validates against actual output.
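
To make the fixture idea concrete, the sketch below parses an expected-count comment from a fixture file. The actual comment format used by the test harness is not described here, so the `expect: code=N comment=N blank=N` marker, the `Expected` struct, and `parse_expected` are all assumptions made for illustration.

```rust
/// Hypothetical expected counts embedded in a fixture file.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct Expected {
	pub code: usize,
	pub comment: usize,
	pub blank: usize,
}

/// Parse the first line containing the assumed marker, e.g.
/// `// expect: code=3 comment=1 blank=1`.
/// Returns `None` if the marker is missing or a field is malformed.
pub fn parse_expected(source: &str) -> Option<Expected> {
	let line = source.lines().find(|l| l.contains("expect:"))?;
	let (_, fields) = line.split_once("expect:")?;
	let mut expected = Expected::default();
	for field in fields.split_whitespace() {
		let (key, value) = field.split_once('=')?;
		let value: usize = value.parse().ok()?;
		match key {
			"code" => expected.code = value,
			"comment" => expected.comment = value,
			"blank" => expected.blank = value,
			_ => return None,
		}
	}
	Some(expected)
}
```

A harness along these lines would run the binary on the fixture and compare its reported counts with the parsed `Expected` value.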
Key crates:
- `clap` - CLI parsing
- `ignore` - Gitignore-aware directory walking
- `memmap2` - Memory-mapped I/O
- `encoding_rs` - Character encoding detection
- `askama` - Template rendering (HTML/Markdown output)
- `serde`/`serde_json` - Serialization

Cargo feature flags:

    default = ["html", "markdown"]

- `html` - Enable HTML report output
- `markdown` - Enable Markdown report output
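
Optional output formats like these are typically gated with `cfg`. The sketch below shows the general pattern only: `available_formats` is a hypothetical function, and treating JSON and CSV as always-on formats is an assumption, not something stated above.

```rust
/// List the output formats compiled into this build (hypothetical helper).
pub fn available_formats() -> Vec<&'static str> {
	// Assumed to be unconditional formats.
	let mut formats = vec!["json", "csv"];
	// `cfg!` evaluates to a compile-time boolean for the named Cargo feature.
	if cfg!(feature = "html") {
		formats.push("html");
	}
	if cfg!(feature = "markdown") {
		formats.push("markdown");
	}
	formats
}
```

With Cargo, building with `--no-default-features` would omit both optional formats, and `--features html` would re-enable only the HTML one.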