fix: properly clean up pygit2 Repository objects to prevent symlink/fd accumulation (#634) #881

Open
janisag07 wants to merge 1 commit into permitio:master from janisag07:fix/cleanup-symlinks-634

@janisag07

Problem

When the OPAL server has GitHub policy sources configured and GitHub becomes unavailable, repo.remotes["origin"].fetch() raises pygit2.GitError. The Repository objects remain in the class-level GitPolicyFetcher.repos cache, holding native C-level file descriptors open. Each failed fetch attempt accumulates FDs that appear as symbolic links in /proc/{pid}/fd/, making the server look like it has spawned zombie processes.
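
One way to observe the accumulation described above is to count the entries under `/proc/{pid}/fd/` before and after a run of failed fetches. This is a minimal sketch and not part of the PR: the helper name `count_open_fds` is hypothetical, and it assumes a Linux-style `/proc` filesystem (it returns `None` elsewhere).

```python
import os
from typing import Optional


def count_open_fds(pid: Optional[int] = None) -> Optional[int]:
    """Return the number of open file descriptors for `pid` (default: this process).

    Hypothetical diagnostic helper. Each entry in /proc/<pid>/fd is a symbolic
    link to the open file, which is why leaked descriptors show up as symlinks.
    Returns None when /proc/<pid>/fd is not available (e.g. non-Linux systems).
    """
    pid = os.getpid() if pid is None else pid
    fd_dir = f"/proc/{pid}/fd"
    if not os.path.isdir(fd_dir):
        return None
    return len(os.listdir(fd_dir))


if __name__ == "__main__":
    print("open fds:", count_open_fds())
```

If the count keeps growing across repeated fetch failures and never drops, that is consistent with the leak reported in #634.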

Root Cause

The existing code never calls repo.free() on cached pygit2.Repository objects when errors occur. Simply deleting references from the dict is insufficient because Python's garbage collector timing is unpredictable — the C-level resources may not be released promptly.

Fix

Added a _cleanup_repo_from_cache() class method that:

  1. Pops the repository from GitPolicyFetcher.repos
  2. Calls repo.free() to immediately release native C resources
  3. Handles errors gracefully during free()
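
The three steps above could look roughly like the sketch below. It is an illustration under stated assumptions, not the code in this PR: the class name `GitPolicyFetcherSketch`, the use of the repository path as the cache key, the boolean return value, and the log message are all hypothetical. The cached object only needs a `free()` method, which `pygit2.Repository` provides.

```python
import logging
from typing import Dict

logger = logging.getLogger(__name__)


class GitPolicyFetcherSketch:
    # Class-level cache standing in for GitPolicyFetcher.repos.
    # Assumption: entries are keyed by the local repository path.
    repos: Dict[str, object] = {}

    @classmethod
    def _cleanup_repo_from_cache(cls, path: str) -> bool:
        """Evict the cached repository for `path` and release its native handles.

        Returns True if an entry was removed, False if nothing was cached,
        so calling it twice for the same path is harmless.
        """
        # Step 1: pop the repository so no later caller can reuse it.
        repo = cls.repos.pop(path, None)
        if repo is None:
            return False
        try:
            # Step 2: release C-level resources now instead of waiting for GC.
            repo.free()
        except Exception as exc:
            # Step 3: a failure while freeing must not mask the original error.
            logger.warning("Failed to free repository at %s: %s", path, exc)
        return True
```

Popping before freeing means that even if `free()` raises, the stale object is already out of the cache and cannot be handed to another fetch attempt.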

Applied cleanup at 4 failure points:

Why this PR is different

Tests

7 new tests covering cache cleanup, idempotency, error handling, and all failure paths.

/claim #634

…res (permitio#634)

When the git remote is unavailable, pygit2 Repository objects cached in
GitPolicyFetcher.repos keep file descriptors open.  Repeated failed
fetches/clones accumulate these descriptors, which show up as symbolic
links in /proc and look like zombie processes.

Changes:
- Add _cleanup_repo_from_cache() that pops the Repository from the
  class-level cache and calls repo.free() to release native resources.
- Call cleanup on fetch failure (the main scenario: repo already cloned
  but remote goes down).
- Call cleanup on clone failure and remove partial clone directory.
- Call cleanup when an existing repo is detected as invalid.
- Add unit tests covering all cleanup paths.
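
The fetch-failure path from the list above might be wired up along these lines. This is a sketch under assumptions, not the PR's implementation: `fetch_with_cleanup` and its parameters are hypothetical names, the cache is passed in as a plain dict, and re-raising after cleanup is an assumed design choice. In real code `error_type` would be `pygit2.GitError`.

```python
from typing import Dict, Type


def fetch_with_cleanup(
    repos: Dict[str, object],
    path: str,
    error_type: Type[BaseException] = Exception,
) -> None:
    """Fetch `origin` for the cached repo at `path`; evict and free it on failure.

    Hypothetical helper. `repos` stands in for GitPolicyFetcher.repos.
    """
    repo = repos[path]
    try:
        repo.remotes["origin"].fetch()
    except error_type:
        # Evict first so no later caller reuses a handle that is being freed.
        stale = repos.pop(path, None)
        if stale is not None:
            try:
                stale.free()
            except Exception:
                pass  # freeing is best-effort; keep the original error
        raise  # assumption: the caller still sees the fetch failure
```

With this shape, the next attempt after an outage starts from an empty cache entry instead of stacking another open repository on top of the old one.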

netlify Bot commented Feb 16, 2026

Deploy Preview for opal-docs canceled.

| Name | Link |
|---|---|
| 🔨 Latest commit | 3139f3b |
| 🔍 Latest deploy log | https://app.netlify.com/projects/opal-docs/deploys/69928d6933bbde00087c1b76 |
