feat(indexing): respect root .gitignore patterns during indexing #607

Merged
vitali87 merged 2 commits into main from feat/respect-gitignore on Jul 4, 2026

vitali87 commented Jul 4, 2026

Owner

Summary

Dogfooding dead-code on cgr's own repo indexed evals/results/l3_workspace -- a GITIGNORED directory of generated eval fixtures -- and reported 30 of its symbols as dead code. cgr honors .cgrignore and --exclude (#495) but never read .gitignore, so build artifacts and generated output pollute the graph for every user.

  • load_ignore_patterns(repo_path): merges root .gitignore into the exclude/unignore set alongside .cgrignore (shared _load_ignore_file parser; both files use gitwildmatch semantics, ! negations map to unignores).
  • .cgrignore remains the override channel: a !pattern there re-includes something .gitignore excludes (indexing generated code on purpose).
  • All three indexing entry points (start, index, MCP) switch to the merged loader.
  • Root .gitignore only; nested .gitignore files are out of scope until a real repo needs them.
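The merge described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `IgnorePatterns`, `parse_ignore_text`, `merge_patterns`, and `load_ignore_patterns_sketch` are hypothetical names, the parser skips gitignore details such as escaped `#` and trailing-space rules, and the merge expression mirrors the lines quoted in the review thread on codebase_rag/config.py.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class IgnorePatterns:
    """Hypothetical stand-in for the project's exclude/unignore container."""

    exclude: frozenset[str] = frozenset()
    unignore: frozenset[str] = frozenset()


def parse_ignore_text(text: str) -> IgnorePatterns:
    """Split ignore-file text into exclude patterns and `!` negations."""
    exclude: set[str] = set()
    unignore: set[str] = set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("!"):
            unignore.add(line[1:])  # a negation becomes an unignore entry
        else:
            exclude.add(line)
    return IgnorePatterns(frozenset(exclude), frozenset(unignore))


def merge_patterns(cgr: IgnorePatterns, git: IgnorePatterns) -> IgnorePatterns:
    """Merge both sources; any negation cancels a matching .gitignore exclude."""
    negations = cgr.unignore | git.unignore
    return IgnorePatterns(
        exclude=cgr.exclude | (git.exclude - negations),
        unignore=negations,
    )


def load_ignore_patterns_sketch(repo_path: Path) -> IgnorePatterns:
    """Read the root .gitignore and .cgrignore only; missing files count as empty."""

    def read(name: str) -> IgnorePatterns:
        path = repo_path / name
        if not path.is_file():
            return IgnorePatterns()
        return parse_ignore_text(path.read_text(encoding="utf-8"))

    return merge_patterns(cgr=read(".cgrignore"), git=read(".gitignore"))
```

With a `.gitignore` containing `dist/` and a `.cgrignore` containing `!dist/`, this sketch leaves `dist/` out of `exclude` and lists it in `unignore`, which is the override behavior the second bullet describes.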

Tests (RED -> GREEN)

4 tests in test_gitignore_patterns.py: gitignore excludes loaded, ! negations map to unignores, cgrignore+gitignore merge with override, empty default. Existing cgrignore/CLI tests updated to the new patch target.

Full suite: 4408 passed, 14 skipped. ruff + format clean; ty at the pre-existing baseline.

gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces support for loading and merging .gitignore patterns alongside .cgrignore patterns to prevent indexing build artifacts and generated files. It refactors the loading logic in config.py and updates references across the CLI, main module, and tests, while adding a new test suite for gitignore patterns. The review feedback highlights an issue with the merging logic: using a simple union (|) fails to correctly respect .cgrignore as the authoritative override channel when conflicting exclude/unignore patterns exist. Actionable suggestions were provided to adjust the merge logic in config.py and update the test assertions accordingly.
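To make the merge concern concrete, here is a small sketch with plain Python sets. The pattern values are invented for illustration, and how the indexer resolves a pattern that appears in both sets is not shown in this thread, so the comments only describe the set contents.

```python
# Hypothetical pattern sets, as they might be parsed from the two files.
git_exclude = {"dist/", "build/"}
git_unignore: set[str] = set()
cgr_exclude = {"secrets/"}
cgr_unignore = {"dist/"}  # .cgrignore asks to re-include dist/

# Plain union: dist/ lands in both the exclude set and the unignore set,
# so the override intent is ambiguous for whatever consumes the result.
naive_exclude = git_exclude | cgr_exclude
naive_unignore = git_unignore | cgr_unignore
conflicts = naive_exclude & naive_unignore

# Override-aware merge: negations cancel matching .gitignore excludes.
negations = cgr_unignore | git_unignore
exclude = cgr_exclude | (git_exclude - negations)

print(sorted(conflicts))
print(sorted(exclude))
```

The second form matches the expression quoted later in the thread from codebase_rag/config.py, where `.gitignore` excludes are reduced by the combined negations before the union.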

Comment thread codebase_rag/config.py
Comment thread codebase_rag/tests/test_gitignore_patterns.py Outdated
greptile-apps Bot commented Jul 4, 2026

Contributor

Greptile Summary

This PR adds root .gitignore support to the indexing ignore pipeline. The main changes are:

  • Merges root .gitignore patterns with .cgrignore patterns.
  • Switches CLI indexing entry points to the merged ignore loader.
  • Updates interactive exclude prompting to use merged ignore patterns.
  • Adds tests for .gitignore excludes, negations, and .cgrignore override behavior.

Confidence Score: 4/5

The change is narrowly scoped, but the ignore-pattern merge can mis-handle ordered negation cases and index files that Git would ignore.

The review is based on the touched ignore-loading and indexing call paths plus the new tests around .gitignore and .cgrignore behavior.

codebase_rag/config.py

T-Rex Logs

What T-Rex did

  • Created a focused pytest repro harness for gitignore order preservation.
  • Tried running the harness with the system Python, but the run was blocked by Python 3.12 syntax incompatibilities.
  • Tried a full editable install to prepare the environment, but the attempt failed due to missing runtime dependencies and a cmake-related pymgclient build issue.
  • Compared the base and head ignore-loader results; the head run includes the root .gitignore and shows updated exclude/unignore patterns.
  • Validated the runtime behavior of the graph-sync CLI entry points; the base run produced only cli/ and local/, while the head run produced cli/, generated/, and local/, confirming that root .gitignore is considered in the head while CLI and .cgrignore semantics are preserved.

Ran code and verified through T-Rex

Reviews (3): Last reviewed commit: "fix(indexing): cancel gitignore excludes..."

Comment thread codebase_rag/config.py
codecov-commenter commented Jul 4, 2026

⚠️ Please install the Codecov app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

✅ All modified and coverable lines are covered by tests.

vitali87 commented Jul 4, 2026

Owner Author

@greptile review

vitali87 merged commit ddbb81b into main on Jul 4, 2026
19 of 20 checks passed
vitali87 commented Jul 4, 2026

Owner Author

@greptile review

Comment thread codebase_rag/config.py
Comment on lines +421 to +424
    negations = cgr.unignore | git.unignore
    return CgrignorePatterns(
        exclude=cgr.exclude | (git.exclude - negations),
        unignore=negations,

Contributor

P1 Preserve gitignore order
load_ignore_patterns() collapses .gitignore into sets, so dist/\n!dist/\ndist/ returns dist/ only in unignore and indexes dist/, while Git's documented rule is that within one precedence level the last matching pattern decides the outcome. Repos that re-ignore after a negation will now include generated files that Git ignores; the loader needs to preserve pattern order or compile an ordered PathSpec for .gitignore instead of subtracting all negations globally.
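The ordering problem can be reproduced with a stdlib-only sketch. Everything here is an assumption made for illustration: `simple_match` is a deliberately crude matcher (a trailing-slash pattern matches by path prefix, anything else goes through `fnmatchcase`), not real gitwildmatch, and `ignored_set_based` is only a stand-in for a consumer that subtracts all negations globally. A real fix would more likely compile the ordered lines with a gitignore-aware library such as pathspec, as the comment suggests.

```python
from fnmatch import fnmatchcase
from posixpath import basename


def simple_match(pattern: str, path: str) -> bool:
    """Crude matcher for this sketch only; not full gitwildmatch semantics."""
    if pattern.endswith("/"):
        return path.startswith(pattern)  # directory pattern: prefix match
    return fnmatchcase(path, pattern) or fnmatchcase(basename(path), pattern)


def _clean(lines: list[str]) -> list[str]:
    """Drop blank lines and comments, keeping the original order."""
    stripped = (line.strip() for line in lines)
    return [line for line in stripped if line and not line.startswith("#")]


def ignored_last_match_wins(lines: list[str], path: str) -> bool:
    """Walk patterns in file order; the last matching pattern decides."""
    ignored = False
    for line in _clean(lines):
        negated = line.startswith("!")
        pattern = line[1:] if negated else line
        if simple_match(pattern, path):
            ignored = not negated
    return ignored


def ignored_set_based(lines: list[str], path: str) -> bool:
    """Collapse to sets and subtract every negation, losing the order."""
    patterns = _clean(lines)
    unignore = {p[1:] for p in patterns if p.startswith("!")}
    exclude = {p for p in patterns if not p.startswith("!")} - unignore
    return any(simple_match(p, path) for p in exclude)


lines = ["dist/", "!dist/", "dist/"]
print(ignored_last_match_wins(lines, "dist/app.js"))  # True
print(ignored_set_based(lines, "dist/app.js"))        # False
```

For the `dist/`, `!dist/`, `dist/` sequence from the comment, the ordered walk ends on the final `dist/` and treats the path as ignored, while the set-based version has already cancelled `dist/` and would let it be indexed.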

Rule Used: ## Technical Requirements

Agentic Framework

-... (source)

Projects

Status: Done

2 participants