Commit 198f478

Add multi-step integration and analysis modules (#43)
* Add multi-step integration and analysis modules with docs and examples
* Bump version to 1.0.0
* Refactor string formatting to use Rust's format string syntax
* Fix variable naming in examples and extend test with metadata fields
* Document multi-step analysis and fallback strategy
* Refactor error handling and remove unused code allowances
* Add fallback to sort impact scores with Ordering::Equal on None
* Fix unwrap usage by handling empty iterator cases with errors
* Refactor Args and Command structs, fix reasoning logic, minor FS change
* Add Rust cleanup and formatting rules to .cursor config
* Remove unused git_hooks_path method from Filesystem
* Document multi-step analysis and Justfile usage in CLAUDE.md
* Improve README.md with badges and project highlights
* Correct age validation to include 18-year-olds in config
* Fix diff content truncation to ensure valid UTF-8 boundaries
* Format char_indices() calls for readability in multi_step_integration.rs
* Strengthen atomic operations by updating Ordering for thread safety
* Fix division by zero when calculating avg_impact for empty file list
* Refactor token estimation heuristics for improved performance
* Add tests for Model::GPT4 token counting accuracy
* Use div_ceil for midpoint calculation in Model
* Fix UTF-8 slicing in truncate_to_fit to prevent invalid boundaries
* Add structure rule for nested Rust module enforcement
1 parent 32cf616 commit 198f478

35 files changed, +6642 -1234 lines changed

.cursor/rules/cleanup.mdc

Lines changed: 90 additions & 0 deletions
@@ -0,0 +1,90 @@
1+
---
2+
description: Full repository cleanup for Rust projects. Recursively remove dead code, unused dependencies, outdated docs, and other cruft from a Cargo application safely, with minimal impact on public APIs. Uses automated tools and best practices to produce a single clean-up commit.
3+
globs:
4+
alwaysApply: false
5+
---
6+
7+
# Rust Full Cleanup
8+
9+
This rule performs a thorough **cleanup of a Rust Cargo repository**. It traverses the entire codebase (not just recent diffs) to detect and remove any unnecessary code, dependencies, or documentation. The goal is to eliminate dead weight and outdated content, making the project leaner and more maintainable without altering the public-facing API or breaking functionality. Specifically, it targets:
10+
11+
- **Dead or unused code** – functions, structs, modules, or symbols that are never called.
12+
- **Unused dependencies** – crates listed in **Cargo.toml** that are not actually referenced in code.
13+
- **Outdated documentation** – content in `README.md`, `docs/` files, module comments, or rustdoc that no longer reflects the code.
14+
- **Obsolete tests or examples** – test cases or example code referring to removed or non-existent functionality.
15+
- **Legacy files or placeholders** – stray configuration files, stubs, or artifacts that no longer serve a purpose.
16+
- **General clutter** – anything that adds complexity without contributing to core functionality.
17+
18+
## Tools and Approach
19+
20+
To achieve this cleanup safely and effectively, the rule leverages several Rust tools and checks:
21+
22+
- **cargo-udeps** – Use `cargo udeps` (cargo unused dependencies) to scan for dependencies that can be dropped. This will highlight any libraries in Cargo.toml that the project doesn’t actually use, so they can be removed.
23+
- **cargo clippy** – Run `cargo clippy` to catch common mistakes and lint issues. Running **Clippy** also surfaces the compiler’s own `dead_code` and `unused_imports` warnings, so it helps identify unused functions or imports while improving overall code quality.
24+
- **cargo fix** – Apply `cargo fix` after reviewing warnings. **Cargo fix** will automatically remove unused imports and apply straightforward fixes for warnings. This helps eliminate trivial dead code (like unused variables or imports) in bulk.
25+
- **cargo deadlinks** – After building documentation (`cargo doc`), run `cargo deadlinks` to find broken links in docs. This flags references in documentation that point to removed code or missing pages, so we can update or remove them.
26+
- **cargo fmt** – Finally, run `cargo fmt` (Rustfmt) to format the code according to Rust’s style guide. This ensures the codebase remains well-formatted after the removals.
27+
- **Tests & Build** – Throughout the process, frequently run `cargo build` and `cargo test` to ensure that removals do not break anything. This double-checks that the code still compiles and all tests pass after cleaning up.
28+
29+
Using these tools in combination provides a safety net: we detect unused items, remove them, clean up documentation, and verify everything still works. The process is iterative and careful, focusing on one category at a time.
30+
31+
## Cleanup Procedure
32+
33+
The rule executes the following step-by-step procedure to perform the full cleanup:
34+
35+
1. **Identify Unused Dependencies:** Start by finding unused dependencies in the project. Run `cargo +nightly udeps --all-targets` to get a report of dependencies not used in production or dev builds. For each dependency flagged as unused, remove it from **Cargo.toml** and let Cargo update `Cargo.lock` on the next build rather than editing the lockfile by hand. Double-check that it’s truly unused (consider conditional features or platform-specific usage). This will slim down the dependency list, reducing build bloat.
36+
37+
2. **Remove Dead Code:** Next, locate dead code within the repository:
38+
39+
- Enable warnings for unused or dead code (ensure no `#![allow(dead_code)]` or similar is masking them).
40+
- Run `cargo clippy` and `cargo check` to gather warnings about unused functions, methods, or modules. The Rust compiler itself will warn about unused private items by default (`dead_code` warnings).
41+
- For each warning, verify that the item is not used anywhere in the repo (use an IDE “find references” or `grep` through the code to confirm). If confirmed unused, **delete the code** (functions, structs, impl blocks, etc.).
42+
- Pay special attention to public (`pub`) items. **Do not remove public exports or public API** items unless you are certain they are truly unused by any consumer. (In a library crate, something can be “dead” internally but still part of the public API. _As a rule of thumb: you can't remove part of the public API except during major version bumps._)
43+
- Remove any feature-flagged code that is tied to now-removed functionality or dependencies. For example, if an unused dependency was only used in a certain optional feature, consider removing that feature flag and related code if it’s now obsolete.
44+
- Use `cargo fix` to automatically remove trivial unused items. For instance, **cargo fix will remove unused import statements** and other machine-fixable dead code for you. Run `cargo fix` on a clean working tree (or pass `--allow-staged` or `--allow-dirty` if you have uncommitted changes) to apply these fixes en masse, then review the changes.
45+
- After removals, run `cargo build` to ensure nothing is broken. Run `cargo clippy` again to see if more issues surface after the first round of deletions.
46+
47+
3. **Update Documentation:** Once code and dependencies are cleaned up, address the documentation:
48+
49+
- Open **README.md** and any markdown files under `docs/`. Remove or update sections that describe features or modules that were removed. Outdated instructions or references to now-nonexistent code should be excised to prevent confusion.
50+
- Search for API names in docs that might have been deleted. If examples in the README or docs refer to a function that no longer exists, remove those examples or replace them with relevant ones.
51+
- Scan Rustdoc comments in the source (public item docstrings). If they mention behaviors or modules that have been eliminated, update those comments for accuracy.
52+
- Run `cargo doc` to generate the documentation and then `cargo deadlinks` on the `target/doc` output. **Cargo-deadlinks will flag broken intra-doc links** (e.g. link to a struct or module that was removed). For each broken link, either update it to a valid reference or remove it if the item no longer exists. This ensures the documentation doesn’t contain dangling references.
53+
- Also remove any inline documentation examples or tests (often in doc comments as `/// ```rust` blocks) that pertain to removed code.
54+
55+
4. **Eliminate Obsolete Tests and Examples:** Now check the tests and examples directories:
56+
57+
- Remove or update **unit tests or integration tests** that were targeting code you deleted. If a test file entirely tests a now-removed module or feature, delete that test file. If parts of tests reference removed functions, those tests should be removed or refactored accordingly.
58+
- Similarly, if the project has an `examples/` directory or example code in docs, eliminate any example code that no longer runs because the underlying functionality was removed.
59+
- Run `cargo test` after this to confirm that all remaining tests pass and no tests are failing due to missing code.
60+
61+
5. **Delete Legacy Files:** Look for any miscellaneous files that are no longer needed:
62+
63+
- Old migration or config files that are not used, placeholder files (e.g. empty module files, old feature flag toggles, deprecated scripts) – remove them to avoid confusion.
64+
- If the repository has directories or modules that have been completely deprecated (e.g. an old `v1/` API that’s replaced by `v2/` but still lingering), consider removing those entirely, after confirming they are truly unused.
65+
- Check for files of other types (JSON, YAML, etc.) that might have been related to removed features (for instance, an unused CI config or a data file not referenced anymore) and remove them.
66+
67+
6. **Final Polish (Format & Review):** After all removals:
68+
69+
- Run **cargo fmt** to format the codebase. This will tidy up any indentation or spacing affected by code removals, ensuring the project adheres to standard Rust style (rustfmt formats code according to the Rust style guide).
70+
- Run **cargo clippy** one more time to catch any new lint issues introduced (for example, if removing code left a `use` statement unused, though `cargo fix` likely handled those; `cargo fmt` only reformats and does not remove code).
71+
- Perform a full **build and test** run (`cargo build && cargo test`) to ensure the project is in a consistent, working state with all tests passing and no warnings.
72+
- Double-check that no public-facing APIs or interfaces have been unintentionally changed. At this stage, only internal implementation details should have been removed. The external behavior and documented APIs should remain the same (unless the cleanup intentionally deprecated something with proper communication).
73+
74+
7. **Single Commit Summary:** Finally, bundle all these changes into **one commit** (or a single cohesive patch). Compose a clear commit message that summarizes the cleanup actions and rationale. For example:
75+
76+
```
77+
chore: remove unused code, deps, and outdated docs
78+
79+
- Removed unused functions `foo_bar` and `unused_helper` (dead code not referenced anywhere in the project).
80+
- Dropped unused dependency "xyz" from Cargo.toml (no references in code) to slim down build.
81+
- Cleaned up README and docs: removed sections referring to the old ABC module that was deleted.
82+
- Deleted obsolete test `old_feature_test.rs` and example `legacy_demo.rs` which targeted removed code.
83+
- Removed legacy config files `old_config.yml` and placeholder module `unused_mod.rs`.
84+
85+
All changes are internal and do not affect the public API. Project builds and tests pass.
86+
```
87+
88+
Use an **imperative tone** in the commit subject (e.g. “Remove unused X…”). Be specific about what was removed and why (e.g. “dead code”, “unused dependency”). This commit acts as a record for future developers, explaining the cleanup. Ensure all modifications from the above steps are included before committing.
89+
90+
By following this rule, Cursor will systematically clean the Rust project of clutter. The repository’s maintainability improves as we delete dead code (improving readability and reducing confusion), remove unused crates (reducing compile times and potential attack surface), and keep documentation in sync with reality. The cleanup is done **safely**: no public interfaces are touched without deliberate decision, and tests/compilation guard against accidental breakage. The end result is a single comprehensive commit that makes the codebase leaner and easier to work with, without altering its external behavior or API. All changes are confined to removing unneeded elements, thereby simplifying the project in a responsible way.
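
Step 1 asks to double-check that a flagged dependency is truly unused before deleting it. One cheap sanity check is a plain text search over the sources. The sketch below is a hypothetical helper, not part of `cargo udeps`; the function name and the three search patterns are assumptions made for illustration. It errs on the side of reporting a crate as "used", and it cannot see macro-only, build-script, or feature-gated usage.

```rust
/// Hypothetical helper: returns the dependency names from `deps` that are
/// never referenced in `source`. A crate published as `foo-bar` is referred
/// to in code as `foo_bar`, so hyphens are normalised first.
fn unreferenced_deps<'a>(deps: &[&'a str], source: &str) -> Vec<&'a str> {
    deps.iter()
        .copied()
        .filter(|dep| {
            let ident = dep.replace('-', "_");
            // Three textual forms a reference usually takes (an assumption,
            // not an exhaustive list).
            let path_use = format!("{ident}::");
            let plain_use = format!("use {ident};");
            let extern_use = format!("extern crate {ident}");
            !(source.contains(&path_use)
                || source.contains(&plain_use)
                || source.contains(&extern_use))
        })
        .collect()
}
```

Anything this helper reports should still be confirmed with `cargo build` and `cargo test` after removal, as the procedure above requires.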

.cursor/rules/rust.mdc

Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
1+
---
2+
description:
3+
globs: **/*.rs,*.rs
4+
alwaysApply: false
5+
---
6+
7+
# Formatting
8+
9+
- Enforce `rustfmt.toml` with `edition = "2021"` and `max_width = 100`
10+
unless a project-local file overrides. Never push unformatted code.
11+
- Keep imports grouped: std ▸ external ▸ internal, then alphabetised.
12+
13+
# Naming
14+
15+
- snake_case for items and functions; SCREAMING_SNAKE_CASE for consts;
16+
PascalCase for types and traits; crate names are kebab-case on crates.io.
17+
- Prefer expressive verbs for functions and nouns for types.
18+
19+
# Module & file organisation
20+
21+
- One public type per file where practical; sibling modules live in a
22+
directory with `mod.rs` or the newer `foo.rs` plus `foo/` split file
23+
form. Avoid deep trees (>3 levels). Public re-exports go in
24+
`lib.rs` so the crate has a clean surface.
25+
26+
# Error handling
27+
28+
- Bubble typed errors with `thiserror`; erase at API boundaries with
29+
`anyhow::Result<T>` for binaries.
30+
- Use `?` eagerly; avoid `unwrap` and `expect` in library code.
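
A minimal sketch of the error-handling rule, using only the standard library so it compiles without dependencies. The hand-written `Display`, `Error`, and `From` impls stand in for what `thiserror` would derive, and `Box<dyn Error>` stands in for `anyhow::Result<T>` at the binary boundary. `ConfigError`, `parse_port`, and `run` are hypothetical names.

```rust
use std::fmt;
use std::num::ParseIntError;

/// Typed error for a hypothetical config parser.
#[derive(Debug)]
enum ConfigError {
    MissingKey(&'static str),
    BadNumber(ParseIntError),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::MissingKey(key) => write!(f, "missing key `{key}`"),
            ConfigError::BadNumber(err) => write!(f, "invalid number: {err}"),
        }
    }
}

impl std::error::Error for ConfigError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            ConfigError::BadNumber(err) => Some(err),
            ConfigError::MissingKey(_) => None,
        }
    }
}

impl From<ParseIntError> for ConfigError {
    fn from(err: ParseIntError) -> Self {
        ConfigError::BadNumber(err)
    }
}

/// Library-style function: returns the typed error and uses `?`, never `unwrap`.
fn parse_port(line: Option<&str>) -> Result<u16, ConfigError> {
    let raw = line.ok_or(ConfigError::MissingKey("port"))?;
    Ok(raw.trim().parse::<u16>()?)
}

/// Binary-style boundary: the typed error is erased into a boxed trait object.
fn run(line: Option<&str>) -> Result<u16, Box<dyn std::error::Error>> {
    Ok(parse_port(line)?)
}
```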
31+
32+
# Linting
33+
34+
- Clippy runs in CI with at least:
35+
deny = [clippy::correctness, clippy::needless_bool,
36+
clippy::unwrap_used, clippy::expect_used]
37+
- Allow unused code only behind `#[cfg(test)]`.
38+
39+
# Concurrency
40+
41+
- Favour message-passing (`tokio::sync::mpsc`) over shared mutability.
42+
- Keep `unsafe` blocks tiny; wrap them in safe abstractions with
43+
`// SAFETY:` comments explaining why each block is sound.
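
To illustrate the message-passing preference without pulling in `tokio`, the sketch below uses `std::sync::mpsc` as a synchronous stand-in for `tokio::sync::mpsc`. The function name `sum_of_squares` is hypothetical. Workers send results over a channel instead of mutating a shared accumulator.

```rust
use std::sync::mpsc;
use std::thread;

/// Sums the squares of `inputs`, computing each square on its own thread.
/// No state is shared between workers; results travel over a channel.
fn sum_of_squares(inputs: Vec<u64>) -> u64 {
    let (tx, rx) = mpsc::channel();

    let handles: Vec<_> = inputs
        .into_iter()
        .map(|n| {
            let tx = tx.clone();
            thread::spawn(move || {
                // A send only fails if the receiver was dropped, which
                // cannot happen here because `rx` outlives the workers.
                let _ = tx.send(n * n);
            })
        })
        .collect();

    // Drop the original sender so `rx.iter()` ends once all workers finish.
    drop(tx);

    let total: u64 = rx.iter().sum();
    for handle in handles {
        let _ = handle.join();
    }
    total
}
```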
44+
45+
# Generics & traits
46+
47+
- Keep public generics bounded (`T: Read + Send + 'static`), avoid
48+
unconstrained `impl Trait` in return positions for libraries.
49+
- Implement `From<T>`/`Into<T>` rather than ad-hoc converters.
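
As a small sketch of the `From<T>`/`Into<T>` guideline, the hypothetical `Celsius` and `Fahrenheit` types below convert through a trait impl instead of an ad-hoc `to_fahrenheit()` method, and the generic function states its bound explicitly.

```rust
/// Hypothetical newtypes used only for this illustration.
struct Celsius(f64);
struct Fahrenheit(f64);

impl From<Celsius> for Fahrenheit {
    fn from(c: Celsius) -> Self {
        Fahrenheit(c.0 * 9.0 / 5.0 + 32.0)
    }
}

/// Accepts anything convertible into `Fahrenheit`; the bound is explicit.
fn degrees_fahrenheit<T: Into<Fahrenheit>>(temperature: T) -> f64 {
    temperature.into().0
}
```

Because `From<T> for T` is implemented for every type, `degrees_fahrenheit` accepts a `Fahrenheit` value as well as a `Celsius` one.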
50+
51+
# Testing
52+
53+
- Each module gets `#[cfg(test)] mod tests { use super::*; … }`.
54+
- Integration tests live in `tests/` and use only the public API.
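
The unit-test layout named in the first bullet might look like the sketch below. `clamp_percent` is a hypothetical function invented for the example.

```rust
/// Clamps `value` into the inclusive range 0..=100.
pub fn clamp_percent(value: i32) -> u8 {
    // After clamping, the value always fits in a `u8`.
    value.clamp(0, 100) as u8
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn clamps_values_outside_the_range() {
        assert_eq!(clamp_percent(-5), 0);
        assert_eq!(clamp_percent(150), 100);
    }

    #[test]
    fn keeps_values_inside_the_range() {
        assert_eq!(clamp_percent(42), 42);
    }
}
```

The `tests` module is compiled only under `cargo test`, so it adds nothing to release builds.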
55+
56+
# Performance & build
57+
58+
- Use `cargo check` in on-save hooks; enable `-Zthreads=N` on nightly
59+
for large crates to shorten feedback loops.
60+
- Gate expensive features behind `cfg(feature = "heavy")`.
61+
62+
# Documentation
63+
64+
- Every public item has a triple-slash summary line, followed by
65+
examples guarded by `rust`, `no_run` or `compile_fail`.
66+
- Rendered docs must build without warnings: run `cargo doc` with `RUSTDOCFLAGS="-D warnings"`.
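
A sketch of the documentation shape described above: a one-line summary, then an `# Examples` section with a doctest. `word_count` and the crate name `my_crate` are placeholders. An untagged fence in a doc comment is treated as Rust, and `no_run` or `compile_fail` can be added after the opening fence when an example should not be executed or is expected not to compile.

```rust
/// Returns the number of whitespace-separated words in `text`.
///
/// # Examples
///
/// ```
/// // `my_crate` is a placeholder for the real crate name.
/// assert_eq!(my_crate::word_count("two words"), 2);
/// ```
pub fn word_count(text: &str) -> usize {
    text.split_whitespace().count()
}
```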
67+
68+
# Workspaces & dependency hygiene
69+
70+
- Use a root-level Cargo workspace to share one lockfile; commit it for
71+
apps, decide per RFC c-lock for libs.
72+
- Keep third-party deps in the fewest indirect copies; audit with
73+
`cargo deny`.
74+
75+
# Unsafe code boundaries
76+
77+
- Mark every `unsafe fn` with `#[forbid(unsafe_op_in_unsafe_fn)]`.
78+
- Document invariants and require proof obligations in comments.
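
The two bullets above could be applied as in the following sketch. The function names are hypothetical. With `unsafe_op_in_unsafe_fn` enabled, the body of an `unsafe fn` still needs its own `unsafe` block, which gives each unsafe operation a place for a `// SAFETY:` comment.

```rust
/// Reads the element at `index` without bounds checking.
///
/// # Safety
///
/// `index` must be strictly less than `slice.len()`.
#[forbid(unsafe_op_in_unsafe_fn)]
unsafe fn read_unchecked(slice: &[u32], index: usize) -> u32 {
    // SAFETY: the caller guarantees `index < slice.len()`.
    unsafe { *slice.get_unchecked(index) }
}

/// Safe wrapper that discharges the proof obligation itself.
fn first_or_zero(slice: &[u32]) -> u32 {
    if slice.is_empty() {
        0
    } else {
        // SAFETY: the slice is non-empty, so index 0 is in bounds.
        unsafe { read_unchecked(slice, 0) }
    }
}
```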
79+
80+
# EOF

.cursor/rules/structure.mdc

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
1+
---
2+
description: Enforce a nested module file structure in Rust crates. Split each underscored filename into directory-based modules, adjust imports and declarations, verify builds, and uphold the Law of Demeter by keeping modules cohesive and low-coupled. Warn on future violations.
3+
globs:
4+
alwaysApply: false
5+
---
6+
7+
- modular structure
8+
Move any `*_*.rs` file to a nested path cut at the first underscore.
9+
`animal_dog.rs` → `animal/dog.rs`. Keep suffixes like `_test.rs`
10+
(`foo_bar_test.rs` → `foo/bar_test.rs`). Multiple underscores add
11+
levels: `one_two_three.rs` → `one/two/three.rs`.
12+
13+
- update imports and mods
14+
After moving, rewrite every `mod`, `use`, and `#[path]`. Create
15+
parent files with the 2018+ style (`foo.rs` holding `pub mod bar;`)
16+
instead of `foo/mod.rs`.
17+
18+
- law of demeter / cohesion
19+
Search for deep paths (`crate::x::y::z::…`). For each:
20+
21+
- move the item nearer its callers, or
22+
- add a local façade (`pub use`) to shorten the path.
23+
Goal: most `use` lines have ≤ 2 segments. Run
24+
`cargo clippy -- -W unused_qualifications -W clippy::module_inception` and fix
25+
remaining warnings. If two modules cross-import heavily, merge them
26+
under a common parent folder.
27+
28+
- build guardrails
29+
Execute `cargo check` and `cargo test` before and after the refactor;
30+
stop or prompt if either fails. Clippy must pass with `-D warnings`.
31+
32+
- ignore generated files
33+
Skip everything in `target/` or paths matched by `.gitignore`.
34+
35+
- enforce convention
36+
Flag any new `.rs` file containing an underscore (except `_test.rs`);
37+
advise creating a folder plus file instead.
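
The renaming in the "modular structure" rule can be sketched as a pure function. This follows the rule's three examples, in which every underscore adds a directory level and a trailing `_test` stays attached to the final segment. `nested_path` is a hypothetical name, and the sketch assumes file names without leading or doubled underscores.

```rust
/// Maps a flat file name such as `animal_dog.rs` to its nested path.
/// Returns `None` for names that do not end in `.rs`.
fn nested_path(file_name: &str) -> Option<String> {
    let stem = file_name.strip_suffix(".rs")?;

    // Keep a trailing `_test` glued to the last path segment.
    let (body, suffix) = match stem.strip_suffix("_test") {
        Some(rest) if !rest.is_empty() => (rest, "_test"),
        _ => (stem, ""),
    };

    // Every remaining underscore becomes a directory separator.
    let nested = body.split('_').collect::<Vec<_>>().join("/");
    Some(format!("{nested}{suffix}.rs"))
}
```

A name like `foo_test.rs` is returned unchanged, which matches the exemption for `_test.rs` files in the "enforce convention" item.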

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -8,3 +8,5 @@ bin/
88
tmp/
99
finetune_verify.jsonl
1010
finetune_train.jsonl
11+
12+
**/.claude/settings.local.json
