Skip to content

Conversation

@anurag2787
Copy link
Contributor

@anurag2787 anurag2787 commented Nov 2, 2025

Resolves #1988

Description

This PR improves how we generate snapshots by making the handling of active entities more consistent. Earlier, different parts of the code used different ways to filter chapters by is_active, and some didn’t properly handle cases where repositories were empty
Now, we’ve standardized the process using the manager patterns active_chapters active_projects active_releases

Checklist

  • I've read and followed the contributing guidelines.
  • I've run make check-test locally; all checks and tests passed.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 2, 2025

Summary by CodeRabbit

  • Refactor
    • Improved backend data retrieval efficiency for release filtering and processing.

✏️ Tip: You can customize this high-level summary in your review settings.

Walkthrough

Adds an ActiveReleaseManager that filters out drafts, pre-releases, and empty repositories and attaches it to the Release model; updates the owasp snapshot command to use active managers for chapters, projects, and releases with explicit created_at time bounds.

Changes

Cohort / File(s) Summary
Release manager & model
backend/apps/github/models/managers/release.py, backend/apps/github/models/release.py
Adds ActiveReleaseManager (overrides get_queryset to select_related("repository") and filter is_draft=False, is_pre_release=False, repository__is_empty=False) and exposes it as Release.active_releases alongside default Release.objects.
Snapshot processing command
backend/apps/owasp/management/commands/owasp_process_snapshots.py
Replaces get_new_items(...).filter(is_active=True) for Chapters, Projects, Releases with direct use of Chapter.active_chapters, Project.active_projects, Release.active_releases and applies explicit created_at__gte=start_at / created_at__lte=end_at filters.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Verify ActiveReleaseManager.get_queryset logic and select_related usage.
  • Confirm Release.active_releases is used consistently and that time bounds in the snapshot command match expectations.
  • Check for any tests or callsites relying on previous is_active filtering semantics.

Suggested reviewers

  • arkid15r
  • kasya

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Improved Snapshot Generation' accurately summarizes the main change—standardizing snapshot generation by using manager patterns for active entities.
Linked Issues check ✅ Passed The PR addresses issue #1988 requirements by creating ActiveReleaseManager for releases and updating owasp_process_snapshots.py to use active_chapters, active_projects, and active_releases managers instead of custom filters.
Out of Scope Changes check ✅ Passed All changes are within scope: new ActiveReleaseManager aligns with issue requirements, Release model managers enable the described pattern, and snapshot processing updates use the new managers as intended.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description check ✅ Passed The PR description clearly relates to the changeset, describing the standardization of active entity filtering using manager patterns.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@sonarqubecloud
Copy link

sonarqubecloud bot commented Nov 2, 2025

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
backend/apps/github/models/managers/release.py (1)

6-23: LGTM! Clean implementation of active release filtering.

The manager correctly filters out drafts, pre-releases, and empty repositories. The use of select_related("repository") is a good performance optimization since we're filtering on repository fields.

Consider enhancing the docstring to be more explicit about the filtering criteria:

 class ActiveReleaseManager(models.Manager):
-    """Active releases manager."""
+    """Manager for active releases.
+    
+    Returns releases that are published (not drafts), stable (not pre-releases),
+    and have a non-empty repository.
+    """
 
     def get_queryset(self):
-        """Get queryset of active releases.
-        
-        Filters out draft and pre-releases, and ensures repository exists.
-        """
+        """Get queryset of active releases.
+        
+        Returns:
+            QuerySet: Releases filtered by:
+                - is_draft=False
+                - is_pre_release=False  
+                - repository__is_empty=False (also excludes null repositories)
+        """

Note: The repository__is_empty=False filter will also exclude releases where repository is NULL. If releases without repositories should be included, you'll need to adjust the filter logic.

backend/apps/owasp/management/commands/owasp_process_snapshots.py (1)

69-85: LGTM! Excellent consistency improvement using active entity managers.

The changes successfully achieve the PR objectives by:

  • Replacing generic is_active filtering with specialized active managers
  • Using consistent patterns across all three entity types (chapters, projects, releases)
  • Maintaining proper time boundaries with created_at__gte and created_at__lte

The approach ensures that:

  1. Chapter.active_chapters filters active chapters
  2. Project.active_projects filters active projects
  3. Release.active_releases filters non-draft, non-prerelease releases from non-empty repositories

All three now apply time boundaries uniformly, eliminating the previous inconsistencies mentioned in issue #1988.

Optional consideration: Releases are filtered by created_at, which is consistent with chapters and projects. However, releases semantically have a published_at timestamp that might be more meaningful for snapshot purposes. The current approach (using created_at) is defensible and maintains consistency, but if you want to capture releases by their publication date instead, you'd need to adjust the filter and handle potential null published_at values.

Given that the goal is consistency and created_at is guaranteed to be non-null, the current implementation is solid.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f86ed08 and f8d2d62.

📒 Files selected for processing (3)
  • backend/apps/github/models/managers/release.py (1 hunks)
  • backend/apps/github/models/release.py (1 hunks)
  • backend/apps/owasp/management/commands/owasp_process_snapshots.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: ahmedxgouda
Repo: OWASP/Nest PR: 1967
File: backend/apps/owasp/api/internal/filters/project_health_metrics.py:24-27
Timestamp: 2025-08-04T15:55:08.017Z
Learning: In the OWASP Nest project, the `is_active` filter in `ProjectHealthMetricsFilter` is intentionally designed as a workaround to eliminate inactive projects from the dashboard. It only filters FOR active projects when `value=True`, and returns an empty Q() when `value=False` to avoid showing inactive projects. This is not meant to be a general boolean filter but a specific solution to exclude inactive projects from the project health metrics dashboard.
🧬 Code graph analysis (2)
backend/apps/github/models/release.py (4)
backend/apps/github/models/managers/release.py (1)
  • ActiveReleaseManager (6-23)
backend/apps/github/models/mixins/release.py (1)
  • ReleaseIndexMixin (8-70)
backend/apps/common/models.py (2)
  • BulkSaveModel (10-34)
  • TimestampedModel (37-46)
backend/apps/github/models/common.py (1)
  • NodeModel (76-90)
backend/apps/owasp/management/commands/owasp_process_snapshots.py (5)
backend/apps/owasp/api/internal/nodes/snapshot.py (4)
  • new_chapters (35-37)
  • new_issues (40-42)
  • new_projects (45-47)
  • new_releases (50-52)
backend/apps/owasp/models/chapter.py (1)
  • Chapter (21-217)
backend/apps/owasp/api/internal/queries/snapshot.py (1)
  • snapshot (14-22)
backend/apps/owasp/models/project.py (1)
  • Project (37-384)
backend/apps/github/models/release.py (1)
  • Release (11-139)
🔇 Additional comments (3)
backend/apps/github/models/release.py (2)

7-7: LGTM! Proper import for the new manager.

The import is correctly placed with other model-related imports.


14-16: LGTM! Manager declarations follow Django best practices.

Explicitly defining both the default objects manager and the specialized active_releases manager is the correct approach. This matches the pattern used by Chapter and Project models, ensuring consistency across the codebase.

The order matters in Django—the first manager declared becomes the default manager, so keeping objects first maintains expected ORM behavior.

backend/apps/owasp/management/commands/owasp_process_snapshots.py (1)

73-90: LGTM! Correct to keep Issues and Users using the generic helper.

Issue and User models don't have specialized active managers (and likely don't need them based on their domain semantics), so continuing to use get_new_items for these entities is appropriate. This selective application of active managers demonstrates good judgment—only entities with meaningful "active" states get the specialized filtering.

@anurag2787
Copy link
Contributor Author

@arkid15r for new_issue if we use OpenIssueManager that is predefined but it only returns issues from the last 90 days so So instead, can we directly use this ?

new_issues = Issue.objects.filter(
                created_at__gte=snapshot.start_at,
                created_at__lte=snapshot.end_at,
                state="open",
                repository__project__isnull=False,
            ).select_related("repository")

@anurag2787 anurag2787 marked this pull request as ready for review November 2, 2025 09:35
@ahmedxgouda ahmedxgouda self-assigned this Nov 17, 2025
@ahmedxgouda ahmedxgouda self-requested a review November 17, 2025 13:39
Copy link
Collaborator

@ahmedxgouda ahmedxgouda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please run make check-test first

Image

@anurag2787 anurag2787 marked this pull request as draft November 21, 2025 16:51
@anurag2787 anurag2787 marked this pull request as ready for review December 6, 2025 12:47
@sonarqubecloud
Copy link

sonarqubecloud bot commented Dec 6, 2025

@anurag2787 anurag2787 marked this pull request as draft December 6, 2025 12:56
@anurag2787 anurag2787 marked this pull request as ready for review December 6, 2025 13:18
@arkid15r
Copy link
Collaborator

@arkid15r for new_issue if we use OpenIssueManager that is predefined but it only returns issues from the last 90 days so So instead, can we directly use this ?

new_issues = Issue.objects.filter(
                created_at__gte=snapshot.start_at,
                created_at__lte=snapshot.end_at,
                state="open",
                repository__project__isnull=False,
            ).select_related("repository")

The current naming/implementation is bad. Open issues manager should filter by state only. The dates should be handled separately by .recent_issues() or similar manager.
We can add it as you suggested but the issue managers need to be improved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve snapshot generation

3 participants