Skip to content

Conversation

@Byron
Copy link

@Byron Byron commented Nov 24, 2025

This PR was created with Copilot as a basis, but I refactored it. Overall, I think it's very close to the original implementation, and there are many possible improvements beyond that if there was interest.

Overall, switching to gix reduced compile times for release builds to 20s from 25s, and it's only compiling what it needs.
Besides that, gix is more portable and safer by being pure Rust.

And of course, if this is merged and there are questions, I am happy to help :).

Summary by CodeRabbit

  • Refactor

    • Git backend replaced with a new implementation, rebuilding commit traversal, diff extraction and hunk generation; public repository and file-status surfaces updated to the new backend.
  • Bug Fixes

    • Safer handling of large/binary file contents and more reliable commit metadata extraction to reduce diff/lookup errors.
  • Behavior Change

    • Context lines no longer affect animation cursor movement or per-line pacing; only additions/deletions influence cursor progression.
  • Tests

    • Added snapshot testing dependency and expanded diff/hunk generation test coverage.

✏️ Tip: You can customize this high-level summary in your review settings.

@Byron Byron requested a review from unhappychoice as a code owner November 24, 2025 09:12
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 24, 2025

Walkthrough

Replaces git2 with gix (ObjectId-based APIs), rewrites repository open/commit lookup/rev-walk/diff/blob handling and hunk generation, removes LineChangeType::Context handling in animation steps, and drops the catch-all arm in file-tree hunk matching. Adds insta dev-dependency.

Changes

Cohort / File(s) Summary
Dependency Migration
Cargo.toml
Replace git2 with gix = { version = "0.75", default-features = false, features = ["revision","blob-diff","max-performance"] }; add dev-dependency insta = "1.44.1".
Git backend refactor
src/git.rs
Full migration from git2gix: types switched to ObjectId; repo open, rev-parse, commit lookup, rev-walk, commit_cache/commit_range updated; new blob helpers (is_blob_binary, get_blob_content), MAX_BLOB_SIZE guards, generate_hunks, DiffHunkCollector/Sink, and reworked diff/patch extraction; removed FileStatus::Unmodified; added FileStatus::from_change.
Animation hunk processing
src/animation.rs
Removed handling of LineChangeType::Context in generate_steps_for_hunk; context lines no longer move cursor or advance buffer_line.
File tree hunk loop
src/panes/file_tree.rs
Removed default catch-all (_ => {}) arm in match over LineChangeType, relying on exhaustive matching.
Tests / helpers
...tests/*, src/...helpers
Tests and helpers updated to gix-based imports and new utilities; added/adjusted tests for hunks/diffs and blob handling.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant UI as UI / CLI
    participant Repo as GitRepository (gix)
    participant Walk as RevWalk
    participant Store as Object Storage
    participant Hunker as Hunk Generator

    UI->>Repo: open(path) / parse_commit_range(range)
    Repo->>Walk: rev_parse_single(range endpoints) / walk()
    Walk->>Repo: yields ObjectId
    Repo->>Store: object(id) -> decode commit -> tree
    Store-->>Repo: tree / blob ids
    Repo->>Store: is_blob_binary(id) / get_blob_content(id)
    Store-->>Repo: bytes or binary marker
    Repo->>Hunker: generate_hunks(old_content, new_content, algo)
    Hunker-->>Repo: DiffHunk[]
    Repo-->>UI: commits + hunks (apply exclusions)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

  • Inspect commit range traversal, rev-parse usage, and exclusion/filtering logic.
  • Verify blob binary detection, MAX_BLOB_SIZE guards, encoding handling, and get_blob_content edge cases.
  • Review DiffHunkCollector / Sink and hunk generation for correctness (renames, copies, large/binary blobs).
  • Confirm public API type changes (OidObjectId) are propagated to callers/tests.

Possibly related PRs

Poem

🐇 I swapped my git2 cloak for gix today,
I chase ObjectIds and nibble diffs away.
Context lines I hop past with nimble feet,
I stitch my hunks and gossip to each commit—so sweet.
Tiny paws, big changes, carrots all the way!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Use gix instead of git2' directly and precisely summarizes the main change in the PR, which involves replacing the git2 dependency with gix across all related code.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@Byron Byron mentioned this pull request Nov 24, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/git.rs (1)

686-785: Consider adding edge case tests for the unified diff parser.

The manual unified diff parser implementation appears correct and handles the standard format properly:

  • Hunk headers: @@ -old_start,old_count +new_start,new_count @@
  • Addition lines: + (excluding +++ headers)
  • Deletion lines: - (excluding --- headers)
  • Context lines: (space prefix)

Manual parsing can be fragile. Consider adding unit tests that cover edge cases such as:

  • Empty files (no hunks)
  • Files with only additions or only deletions
  • Hunks at the start/end of files
  • Unusual line counts or ranges
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 714357e and d10b204.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
  • Cargo.toml (1 hunks)
  • src/git.rs (13 hunks)
🔇 Additional comments (16)
src/git.rs (15)

1-11: LGTM!

The import changes correctly map git2 types to their gix equivalents. The imports cover all necessary functionality for repository operations, commit handling, and diff generation.


126-133: LGTM!

The struct fields correctly updated from git2's Oid to gix's ObjectId type. The RefCell wrappers remain appropriate for interior mutability.


156-166: LGTM!

The from_change method correctly maps gix's Change enum variants to FileStatus. The distinction between Rewrite { copy: false } (renamed) and Rewrite { copy: true } (copied) is handled appropriately.


247-255: LGTM!

The repository opening logic correctly migrated to gix::open. Error handling with context is appropriate.


257-267: LGTM!

The commit retrieval logic correctly adapted to gix's API. The two-step process of rev_parse_single followed by object()?.id and find_commit is the appropriate pattern for gix.


269-301: LGTM!

The random commit selection correctly migrated to gix's API. The logic to filter non-merge commits and handle empty repositories is preserved.


303-356: LGTM!

Both next_asc_commit and next_desc_commit correctly maintain their ordering logic while using the updated ObjectId type. The ascending/descending index calculations are correct.


362-430: LGTM!

The range commit methods (next_range_commit_asc, next_range_commit_desc, random_range_commit) correctly adapted to use ObjectId. The ordering and random selection logic is preserved.


493-516: LGTM!

The cache population logic correctly migrated to gix's API. The filtering of merge commits and empty repository handling is preserved.


518-546: LGTM!

The metadata extraction correctly adapted to gix's API. The time parsing logic appropriately handles the git timestamp format (seconds + timezone) by extracting the seconds component. Fallbacks are in place for parsing failures.


548-645: LGTM!

The change extraction logic is comprehensively rewritten for gix's tree diff API. The implementation correctly:

  • Handles initial commits (no parent tree)
  • Uses the callback pattern with for_each_to_obtain_tree
  • Extracts object IDs for all change types
  • Performs binary detection and content retrieval
  • Maintains the exclusion logic for large/generated files
  • Returns the appropriate Action::Continue signal

647-655: LGTM!

The binary detection helper correctly identifies binary files by checking for the null byte (\0) or excessive size. Returning false on error (blob not found) is a reasonable default.


657-666: LGTM!

The blob content retrieval helper correctly handles binary/large files by returning None. The use of from_utf8_lossy is appropriate for potentially non-UTF8 git content.


668-684: LGTM!

The hunk generation correctly uses gix's blob diff API. The approach of generating unified diff text and then parsing it back into structured hunks is dictated by gix's API design. Both the diff text and parsed hunks are returned for different use cases.


432-491: Based on my verification findings, I've identified a critical issue with the TODO comment:

The revision::walk::Platform API was updated to add hide() method, which "finally makes safe traversals possible and is what most people would want to use instead of boundary()." This indicates the method referenced in the TODO may have been unreliable in older versions.

However, the current implementation uses a two-pass manual exclusion approach that is functionally correct, even if less efficient than using the newer hide() API. Without access to the specific gix version used in this codebase, I cannot definitively determine if the TODO is outdated or still applicable.

Update or remove the TODO comment at line 465 once the gix version is confirmed stable.

The manual exclude_set logic is correct but could be simplified if gix version supports the hide() method reliably. Verify your Cargo.toml to confirm the gix version, and if it supports hide(), consider replacing the two-pass approach with a single traversal using .with_hidden([start_oid]) or .hide([start_oid]) depending on the API available.

Cargo.toml (1)

27-27: Version 0.75 is valid, current, and free from known vulnerabilities.

Verification confirms gix version 0.75 (0.75.0) is the latest stable release on crates.io, released November 22, 2025. No security advisories or known vulnerabilities were found in the Rust advisory database.

@Byron Byron force-pushed the git2-to-gix branch 2 times, most recently from f517a1f to cd639a9 Compare November 24, 2025 17:53
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
src/git.rs (1)

676-775: Refactor to use gix's custom Sink pattern instead of string round-trip parsing.

gix-diff supports implementing a custom Sink trait to collect structured hunks directly, eliminating the inefficient workflow of generating a unified-diff string and re-parsing it. The current generate_hunks function (lines 658-674) uses UnifiedDiffBuilder to produce string output, which is then manually parsed back into DiffHunk structures by parse_unified_diff (lines 676-775).

Implement a custom Sink that collects DiffHunk and LineChange objects directly during the diff operation. This removes the string serialization overhead, the manual parsing logic, and reduces maintenance burden.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d10b204 and cd639a9.

📒 Files selected for processing (1)
  • src/git.rs (13 hunks)
🔇 Additional comments (7)
src/git.rs (7)

156-166: LGTM!

The mapping from gix's Change enum to FileStatus is correct and handles all variants appropriately, including distinguishing between renames (copy: false) and copies (copy: true) in the Rewrite variant.


258-267: LGTM!

The migration to gix's rev_parse_single API is correct. The chain of rev_parse_single(hash)?.object()?.id properly extracts the commit ID from the revision specification.


518-541: LGTM!

The commit metadata extraction correctly uses gix's API with proper error handling and safe UTF-8 conversions with fallback values.


637-656: LGTM!

The binary detection using null byte presence (data.contains(&0)) is a reasonable heuristic for this use case. Combined with the size check, this provides good protection against processing large or binary files.


699-711: LGTM!

The use of unwrap_or with sensible defaults (1 for line numbers) provides defensive error handling. While this could mask malformed diff output, it's acceptable for this use case since gix should generate well-formed unified diffs.


1-11: Well-executed migration with one major issue to address.

The migration from git2 to gix is comprehensive and appears to maintain the original functionality. The new helper methods for binary detection, blob content retrieval, and diff parsing are well-structured. However, please address the missing old_path extraction for renamed/copied files (see comment on line 565), which represents a functional regression.


3-6: The gix version in use is current and secure.

Verification confirms that gix 0.75.0 is the latest available version on crates.io and contains all security patches. The identified security advisories (HIGH and MODERATE severity) affect only older versions (< 0.71.0), all of which have been addressed in the current version.

- avoid `find_object()` in favor of more direct methods.
- Remove `dead_code` in `FileStatus`
- use parse timestamp
- better use of `gix`
- extract `old_path`
- make it work in case of the first commit
…et the hunks.

Implemented by Sonnet 4.5 via Copilot.
That way it should be a bit clearer that it works during review even,
while protecting from regressions.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 74bf24f and f148afa.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
  • Cargo.toml (2 hunks)
  • src/git.rs (12 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: Byron
Repo: unhappychoice/gitlogue PR: 97
File: src/git.rs:464-475
Timestamp: 2025-11-24T19:06:43.768Z
Learning: In the gix library (as of version 0.75), the `with_hidden()` method on rev-walk Platform is broken and doesn't always work as expected, so it should be avoided. Use manual exclusion set workarounds instead when filtering commits during revision walks.
📚 Learning: 2025-11-24T19:06:43.768Z
Learnt from: Byron
Repo: unhappychoice/gitlogue PR: 97
File: src/git.rs:464-475
Timestamp: 2025-11-24T19:06:43.768Z
Learning: In the gix library (as of version 0.75), the `with_hidden()` method on rev-walk Platform is broken and doesn't always work as expected, so it should be avoided. Use manual exclusion set workarounds instead when filtering commits during revision walks.

Applied to files:

  • src/git.rs
🔇 Additional comments (11)
Cargo.toml (2)

26-27: gix dependency configuration and features look appropriate for the new usage.

gix = { version = "0.75", default-features = false, features = ["revision", "blob-diff", "max-performance"] } matches the APIs used in src/git.rs (rev-walk, tree diff, blob diff). The added insta dev-dependency is appropriate for the new snapshot tests.

Please run the full test suite (including the new snapshot tests) to confirm there are no missing gix feature flags across all platforms you care about.


84-85: insta dev-dependency is well-scoped to tests.

Using insta as a dev-dependency for inline debug snapshots in tests keeps it out of the runtime dependency tree and is an appropriate choice here.

src/git.rs (9)

126-166: FileStatus::from_change correctly maps gix tree diff variants.

The mapping from Change to FileStatus (Addition/Deletion/Modification/Renamed/Copied) aligns with gix’s diff model, cleanly distinguishing Rewrite { copy: false } as rename and Rewrite { copy: true } as copy.

It’s worth sanity‑checking this against a couple of real commits involving renames and copies to ensure the UI displays the expected status codes.


245-329: DiffHunkCollector implementation matches the intended line-based diff model (no context lines).

The sink:

  • Starts hunks at before.start + 1 / after.start + 1 and tracks old_line_no/new_line_no correctly.
  • Emits only deletions for before and additions for after, with per-line numbers and content as plain strings.
  • Does not preserve context lines, which matches the concurrent removal of Context in the rest of the app.

This is consistent and should integrate cleanly with the animation/file-tree logic.


331-386: Random commit selection via gix rev-walk and non-merge filter looks sound.

Using head_id() plus rev_walk([head]) and filtering to commits with at most one parent preserves the previous “non-merge commits only” behavior. Caching the resulting ObjectIds and randomly sampling from them is straightforward and efficient.

Please verify interactively that the screensaver still picks a diverse set of non‑merge commits (no obvious bias towards very recent or very old commits) across a few different repositories.


517-576: Commit range parsing and manual exclusion set are reasonable; keep avoiding with_hidden().

  • The validation around ... and .. formats gives clear error messages for unsupported or malformed ranges.
  • start/end are resolved via rev_parse_single(...).object()?.id and head_id() when open-ended, which is consistent with typical git semantics.
  • The exclusion set built via rev_walk([start_oid]) and a gix::hashtable::HashSet of reachable ids, then filtered out when walking from end, correctly implements “commits reachable from end but not from start”.
  • The explicit TODO about with_hidden() plus the manual exclude set is the right call given the known issues with with_hidden() in this gix version. Based on learnings, this workaround is intentional and should stay.

One nuance: ranges where start is a descendant of end (e.g. HEAD..HEAD~5) will currently yield an empty set, which may or may not be worth documenting for users.

If you rely on reversed ranges in practice, consider adding a small test (or doc comment) to clarify the behavior for start not ancestrally preceding end.


578-601: populate_cache mirrors random_commit’s traversal logic and is efficient enough.

The cache is built via a single rev-walk from head, skipping commits where find_commit fails and filtering out merges, then stored as Vec<ObjectId>. This matches the selection logic used in random_commit, and the early bail when no candidates exist gives a clear error.


603-626: Commit metadata extraction via gix::Commit decode looks correct.

  • hash from commit.id.to_string() matches the canonical ObjectId.
  • Author name and message are decoded via ByteSlice::to_str() with graceful fallbacks.
  • Timestamp uses author_sig.time()?.seconds and converts to DateTime<Utc> with a safe fallback on invalid timestamps.
  • Changes are populated through the new extract_changes() pipeline.

This should maintain the previous user-facing metadata behavior.

Confirm that author names, timestamps, and messages render identically to the pre‑gix version on a few commits with non‑ASCII data.


727-746: Binary/size heuristics for blobs are consistent and defensive.

is_blob_binary and get_blob_content both:

  • Read blob data once via .data.as_slice().
  • Treat blobs larger than MAX_BLOB_SIZE (500KB) or containing NUL bytes as binary/unsupported for animation.
  • Use from_utf8_lossy for text blobs, avoiding panics on invalid UTF‑8.

This is a good trade‑off for a TUI visualizer.


748-759: generate_hunks correctly bridges gix’s blob diff to internal DiffHunk.

  • old_content/new_content Options are normalized to empty strings when None, which matches the intended semantics (“diff against empty”).
  • The InternedInput + DiffHunkCollector path is straightforward and uses the provided Algorithm enum, allowing caller-controlled algorithms (Myers, Histogram).

This is a clean integration layer and matches the added tests.

Worth verifying that the default algorithm from repo.diff_algorithm() matches the one you expect to use for blob diffs (e.g., Myers vs Histogram) to keep UI output consistent with prior behavior.


884-1488: generate_hunks snapshot tests provide excellent coverage; just ensure snapshot stability.

The new generate_hunks tests:

  • Cover additions, deletions, modifications, mixed operations, middle insert/delete, whitespace-only changes, both-sides-empty/None, and real-code examples.
  • Exercise both Algorithm::Myers and Algorithm::Histogram.
  • Use inline insta::assert_debug_snapshot! snapshots, avoiding external snapshot files.

This gives strong confidence in the diff/hunk translation logic. Just be aware that any change in upstream gix diff behavior or algorithm defaults will update these snapshots.

When updating gix in the future, re-run these tests and carefully inspect any snapshot diffs to distinguish intended upstream changes from regressions in your hunk mapping.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/git.rs (1)

549-560: Exclusion set workaround is appropriate.

Based on the retrieved learning, the with_hidden() method in gix 0.75 is unreliable, so the manual exclusion set approach is the correct solution. The TODO comment appropriately documents this for future revisiting.

🧹 Nitpick comments (1)
src/git.rs (1)

727-746: Consider consolidating blob fetching to avoid redundant lookups.

Both is_blob_binary and get_blob_content fetch the blob independently. In extract_changes, both are called for the same blob ID, resulting in duplicate repository lookups. While gix likely caches these internally, you could optimize by combining them into a single helper that returns (is_binary, Option<String>).

-    fn is_blob_binary(repo: &Repository, id: ObjectId) -> bool {
-        repo.find_blob(id)
-            .ok()
-            .map(|blob| {
-                let data = blob.data.as_slice();
-                data.len() > MAX_BLOB_SIZE || data.contains(&0)
-            })
-            .unwrap_or(false)
-    }
-
-    fn get_blob_content(repo: &Repository, id: ObjectId) -> Result<Option<String>> {
-        let blob = repo.find_blob(id)?;
-        let data = blob.data.as_slice();
-
-        if data.len() > MAX_BLOB_SIZE || data.contains(&0) {
-            Ok(None)
-        } else {
-            Ok(Some(String::from_utf8_lossy(data).to_string()))
-        }
-    }
+    fn get_blob_info(repo: &Repository, id: ObjectId) -> (bool, Option<String>) {
+        let Ok(blob) = repo.find_blob(id) else {
+            return (false, None);
+        };
+        let data = blob.data.as_slice();
+        let is_binary = data.len() > MAX_BLOB_SIZE || data.contains(&0);
+        let content = if is_binary {
+            None
+        } else {
+            Some(String::from_utf8_lossy(data).to_string())
+        };
+        (is_binary, content)
+    }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f148afa and 86cb47e.

📒 Files selected for processing (1)
  • src/git.rs (12 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: Byron
Repo: unhappychoice/gitlogue PR: 97
File: src/git.rs:464-475
Timestamp: 2025-11-24T19:06:43.768Z
Learning: In the gix library (as of version 0.75), the `with_hidden()` method on rev-walk Platform is broken and doesn't always work as expected, so it should be avoided. Use manual exclusion set workarounds instead when filtering commits during revision walks.
📚 Learning: 2025-11-24T19:06:43.768Z
Learnt from: Byron
Repo: unhappychoice/gitlogue PR: 97
File: src/git.rs:464-475
Timestamp: 2025-11-24T19:06:43.768Z
Learning: In the gix library (as of version 0.75), the `with_hidden()` method on rev-walk Platform is broken and doesn't always work as expected, so it should be avoided. Use manual exclusion set workarounds instead when filtering commits during revision walks.

Applied to files:

  • src/git.rs
🔇 Additional comments (9)
src/git.rs (9)

1-76: LGTM!

The imports are appropriate for the gix migration, and the constants for blob size limits and exclusion patterns are well-documented with sensible defaults.


156-166: LGTM!

The from_change mapping correctly distinguishes between rename (copy: false) and copy (copy: true) operations from gix's Change::Rewrite variant.


245-329: LGTM!

The DiffHunkCollector implementation correctly converts gix diff events into the internal DiffHunk structure. The 1-indexed line numbering is handled properly, and the Sink trait implementation correctly processes deletions before additions within each change block.


342-352: LGTM!

The commit lookup correctly uses gix's rev_parse_single to resolve the reference, then retrieves the ObjectId and finds the commit. Error handling with ? and .context() is appropriate.


354-441: LGTM!

The commit caching and iteration logic correctly filters out merge commits and handles ascending/descending order. The RefCell borrows are properly dropped before calling extract_metadata_with_changes.


603-626: LGTM!

The commit metadata extraction correctly decodes the gix commit object, extracts author information and timestamp, and handles potential UTF-8 conversion issues with appropriate fallbacks.


628-725: Implementation correctly handles all change types.

The extract_changes function properly:

  • Skips directory entries (line 644-646)
  • Extracts old_path for rename/copy operations (line 650-657)
  • Generates hunks for non-binary files including pure additions and deletions (line 686-690)
  • Applies conservative binary detection for modifications/rewrites (line 677)

The past review concerns about old_path extraction and hunk generation for additions/deletions have been addressed.


748-759: LGTM!

The generate_hunks function cleanly handles all cases (additions, deletions, modifications) by treating None content as empty strings. The gix diff internals with InternedInput and custom Sink implementation provide efficient diff computation.


884-1488: LGTM!

Excellent test coverage for generate_hunks using insta snapshots. The tests comprehensively cover:

  • Pure additions and deletions
  • Modifications and mixed operations
  • Edge cases (empty content, None inputs, whitespace changes)
  • Real-world code examples
  • Multiple diff algorithms (Myers, Histogram)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
src/git.rs (3)

77-124: Ignore pattern plumbing and tests look solid; only potential path-separator edge case.

init_ignore_patterns + should_exclude_file are wired cleanly with OnceLock<GlobSet>, and the tests cover lockfiles, minified/bundled assets, snapshots, and user patterns end‑to‑end. One small edge case: should_exclude_file splits on '/' to derive the filename, which is perfect for git-style paths (as used from change.location()), but would miss \ separators if this function ever started receiving raw OS paths on Windows. If that’s a possible call‑site in the future, consider normalizing to / or using Path/MAIN_SEPARATOR for the filename extraction; otherwise this is good as-is.

Also applies to: 761-881


245-329: DiffHunkCollector + generate_hunks + insta tests provide a solid, well-validated diff core.

The DiffHunkCollector implementation against gix::diff::blob::Sink is straightforward: 1‑based line numbering, deletions from before and additions from after only (no context lines), and one hunk per process_change call. generate_hunks wraps this neatly with InternedInput. The insta tests in mod generate_hunks exercise a wide range of scenarios (pure add/delete, insert/remove in the middle, mixed edits, whitespace, empty/None sides, full replace, and multiple algorithms), which makes the behavior very transparent and regression‑resistant. Given the MAX_BLOB_SIZE and MAX_CHANGE_LINES guards elsewhere, the per‑line to_string() allocations are acceptable; you only need to revisit this if you ever decide to support much larger diffs.

Also applies to: 747-758, 883-1487


628-745: Change extraction and binary/content guards look correct; you can optionally avoid double blob fetches for binaries.

The extract_changes logic is well structured:

  • Parent vs current tree selection is clear (empty tree for root commits).
  • Change::Addition/Deletion/Modification/Rewrite are mapped to old_id/new_id as expected, and FileStatus::from_change plus old_path from Rewrite { source_location, .. } preserves rename/copy information.
  • Binary/large detection via is_blob_binary (size > MAX_BLOB_SIZE or contains NUL) and get_blob_content (returning None for those cases) ensures that you don’t try to diff huge or obviously binary blobs, and you only feed text into the hunk generator.
  • Hunk generation is now correctly gated only on !is_binary, so additions and deletions also get hunks, and MAX_CHANGE_LINES applies to the count of actual changed lines (no context), which matches the animation goals.

One small optimization: for blobs already classified as binary/too large by is_blob_binary, you still call get_blob_content (which re‑fetches the blob and immediately returns Ok(None)), so each side is loaded twice. If you ever care about that extra I/O, you could skip content loading entirely when is_binary is true:

-                let old_content =
-                    old_id.and_then(|id| Self::get_blob_content(repo, id).ok().flatten());
-                let new_content =
-                    new_id.and_then(|id| Self::get_blob_content(repo, id).ok().flatten());
+                let (old_content, new_content) = if !is_binary {
+                    let old = old_id
+                        .and_then(|id| Self::get_blob_content(repo, id).ok().flatten());
+                    let new = new_id
+                        .and_then(|id| Self::get_blob_content(repo, id).ok().flatten());
+                    (old, new)
+                } else {
+                    (None, None)
+                };

Behavior stays the same (binary files still have no content/hunks), but large/binary blobs are fetched only once per side.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 86cb47e and 8ab4e74.

📒 Files selected for processing (1)
  • src/git.rs (12 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: Byron
Repo: unhappychoice/gitlogue PR: 97
File: src/git.rs:464-475
Timestamp: 2025-11-24T19:06:43.768Z
Learning: In the gix library (as of version 0.75), the `with_hidden()` method on rev-walk Platform is broken and doesn't always work as expected, so it should be avoided. Use manual exclusion set workarounds instead when filtering commits during revision walks.
📚 Learning: 2025-11-24T19:06:43.768Z
Learnt from: Byron
Repo: unhappychoice/gitlogue PR: 97
File: src/git.rs:464-475
Timestamp: 2025-11-24T19:06:43.768Z
Learning: In the gix library (as of version 0.75), the `with_hidden()` method on rev-walk Platform is broken and doesn't always work as expected, so it should be avoided. Use manual exclusion set workarounds instead when filtering commits during revision walks.

Applied to files:

  • src/git.rs
🔇 Additional comments (3)
src/git.rs (3)

135-167: FileStatus::from_change cleanly mirrors gix Change variants.

The new FileStatus::from_change mapping handles Addition, Deletion, Modification, and both Rewrite { copy: false } (rename) and Rewrite { copy: true } (copy), which lines up with gix’s tree diff variants and keeps the status enum simple. Since the match is exhaustive over Change, any future additional variants will fail to compile, so this looks robust.


331-376: Commit navigation and range parsing look consistent; confirm that supported range syntaxes match your CLI expectations.

The switch to ObjectId in commit_cache/commit_range and the various next_*/random_* methods preserves the “non‑merge only” behavior and correctly derives ascending vs descending order from the underlying rev_walk. In parse_commit_range, the manual exclude_set built from rev_walk([start_oid]) and the main walk from rev_walk([end]) give you standard A..B semantics (“commits reachable from B but not from A”), and you then filter out merges in the same way as the cache-based playback. Given the known with_hidden() issues in gix, the explicit exclusion set is a reasonable workaround for now. Based on learnings, this is preferable to trying to use with_hidden() until it’s fixed.

One thing to double‑check is that the range strings your CLI actually produces are exactly the ones you intend to support here (e.g., HEAD~5..HEAD and abc123.. as in the error message), since an empty left side like "..HEAD" is treated as “all ancestors of HEAD” rather than a shorthand for HEAD..HEAD.

Also applies to: 388-516, 517-576, 578-601


603-626: Metadata extraction is clean; verify that using author time (not committer time) is intentional.

extract_metadata_with_changes decodes the commit once, pulls the author name via to_str_lossy, derives a UTC timestamp from author_sig.time()?.seconds, and falls back to Utc::now() only if from_timestamp can’t represent the value, which is defensive and non‑panicking. The only semantic question is that you now base date on the author’s time rather than on a separate “committer” or commit time; if the previous git2-based version used committer time, that’s a subtle behavior change in how commits are ordered/rendered. If author time is what you want for the screensaver, this implementation looks good—just worth confirming against prior behavior.

@unhappychoice
Copy link
Owner

Thank you for the PR! I agree with the direction of migrating from git2 to gix - the benefits of pure Rust implementation, better portability, and improved compile times are compelling.

However, since this is primarily a refactoring change, I'd like to prioritize other pending bug fixes and feature requests first. I'll revisit this PR once those are addressed.

Thanks for your patience and willingness to help if questions arise! 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants