Bug Report: Incorrect Repository Filtering and Fork Data Handling
Description
The current code skips any forked repository whose forks_count is below 5, which wrongly excludes valid open-source projects.
This issue occurs because the script uses the child fork’s GitHub details (stars, forks, issues) instead of fetching data from the upstream parent repository.
As a result, certain repositories appear to have incomplete or incorrect metadata, leading to false negatives in open-source contribution evaluations.
Files Affected
github.py
Impact
This bug directly affects the open-source project evaluation metrics, as the Jinja prompt templates rely on:
- stars
- forks_count
- open_issues
Projects that are forks of active upstream repositories get undervalued or skipped entirely.
Comparison: Current vs Expected Behavior
Expected Output
{
"name": "Indiekart",
"description": null,
"github_url": "https://github.com/trishanu-init/Indiekart",
"live_url": "https://indiekart.vercel.app/",
"technologies": ["TypeScript"],
"project_type": "open_source",
"contributor_count": 27,
"author_commit_count": 97,
"total_commit_count": 174,
"github_details": {
"forked_from": "https://github.com/Indie-Kart/ecommerce-store",
"parent_full_name": "Indie-Kart/ecommerce-store",
"stars": 34,
"forks": 63,
"language": "TypeScript",
"description": "Repo Owner: Trishanu Nayak",
"topics": ["gssoc", "gssoc24"],
"open_issues": 88,
"created_at": "2024-02-22T08:48:17Z",
"updated_at": "2025-09-02T18:38:35Z",
"size": 3029,
"fork": true,
"archived": false,
"default_branch": "main"
}
}
Current Output
{
"name": "Indiekart",
"description": null,
"github_url": "https://github.com/trishanu-init/Indiekart",
"live_url": "https://indiekart.vercel.app/",
"technologies": ["TypeScript"],
"project_type": "open_source",
"contributor_count": 27,
"author_commit_count": 97,
"total_commit_count": 174,
"github_details": {
"stars": 0,
"forks": 0,
"language": "TypeScript",
"description": null,
"created_at": "2024-06-22T08:00:36Z",
"updated_at": "2024-07-29T11:07:24Z",
"topics": [],
"open_issues": 0,
"size": 3029,
"fork": true,
"archived": false,
"default_branch": "main",
"contributors": 27
}
}
Root Cause
The logic currently skips any repository where:
    if repo.get("fork") and repo.get("forks_count", 0) < 5:
        continue
This leads to premature exclusion of legitimate projects.
Proposed Solution
- Update the forks_count logic
  Avoid skipping forks based solely on count.
- Fetch details from the parent repository
  If the project is a fork, use the parent repository's data instead of the fork's.
- Add a fallback mechanism
  Use parent data if available.
  If the parent is not found (API error or missing data), fall back to the child repository details.
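A minimal sketch of the three steps above, not the actual github.py code. The function name `resolve_github_details` and the injected `fetch_repo` callable are hypothetical. It assumes repository dicts shaped like GitHub REST API responses (`stargazers_count`, `forks_count`, `open_issues_count`), and that the single-repository response for a fork carries a `parent` object, which the list endpoints may omit.

```python
from typing import Any, Callable, Dict, Optional

Repo = Dict[str, Any]


def resolve_github_details(
    repo: Repo,
    fetch_repo: Callable[[str], Optional[Repo]],  # hypothetical: returns full repo JSON by "owner/name"
) -> Repo:
    """Build github_details, preferring the upstream parent when repo is a fork."""
    parent: Optional[Repo] = None
    if repo.get("fork"):
        full: Optional[Repo] = repo if "parent" in repo else None
        if full is None and repo.get("full_name"):
            try:
                # Assumed to return the single-repository payload, which includes "parent".
                full = fetch_repo(repo["full_name"])
            except Exception:
                full = None  # API error: fall back to the child repository below
        parent = (full or {}).get("parent") or None

    # Fallback: use the child repository's own numbers when no parent was found.
    source = parent if parent else repo
    details: Repo = {
        "stars": source.get("stargazers_count", 0),
        "forks": source.get("forks_count", 0),
        "open_issues": source.get("open_issues_count", 0),
        "description": source.get("description"),
        "topics": source.get("topics", []),
        "fork": bool(repo.get("fork")),
    }
    if parent:
        details["forked_from"] = parent.get("html_url")
        details["parent_full_name"] = parent.get("full_name")
    return details
```

With this shape, the caller no longer needs the `forks_count < 5` skip for forks. Any threshold can be applied to the resolved `details` instead, so a fork of an active upstream repository is judged by the upstream's numbers.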