Description
Overview
Explore ways to identify existing related issues when creating new ones because duplicate/unrelated issues lead to PRs closing multiple issues and make tracking unmanageable.
Details
A common problem in growing projects is that older issues stay open or unresolved while new issues are created about the same or closely related problems. This leads to duplicate or overlapping issues, unclear scopes, and contributors unknowingly creating redundant work. Downstream, this often results in a single PR closing multiple issues (sometimes unintentionally), which makes acceptance criteria and project metrics harder to interpret and increases PM review burden.
In many large open-source and enterprise projects, teams adopt explicit strategies to reduce duplication and improve issue quality:
- Duplicate detection features in GitHub and GitLab: GitHub recently added the ability to close an issue as a duplicate of another, which helps manage duplicated reports and clarify history. ([The GitHub Blog]1) GitLab provides “related issue” and “blocks/blocked by” linking, which helps make dependencies and overlaps explicit. ([The GitLab Handbook]2)
- Issue automation and bots: Some projects tag all new issues with triage labels and use automation to warn about inactivity or potential duplicates, reducing noise and helping maintainers focus. ([FAST]3)
- Smart search and templates: Encouraging contributors to search before creating a new issue, and using templates that steer them to reference existing issues, can reduce duplicates. Robust search helps surface existing similar issues. ([GeeksforGeeks]4)
- Research on clustering and similarity: Academic work shows that machine learning could cluster similar issues or suggest existing related bugs, which points toward future tooling possibilities. ([ScienceDirect]5)
In system design terms, issue tracking and PR pairing can be treated as a workflow system with inputs (new issue creation), processing (search, tagging, similarity detection), and outputs (issues created, flagged, or linked). Good system design considers how to expose information early, reduce human error, and scale support as the number of contributors increases.
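The input, processing, and output stages described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a proposal for the actual implementation: the `Issue` class, the stop-word list, the `0.3` threshold, and the function names are all hypothetical, and plain word-overlap (Jaccard) similarity on titles stands in for whatever search or ML-based similarity a real tool would use.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Hypothetical values for illustration; a real project would tune these.
STOP_WORDS = {"a", "an", "the", "to", "of", "in", "and", "or",
              "is", "it", "when", "with", "for", "on"}
SIMILARITY_THRESHOLD = 0.3


@dataclass
class Issue:
    """Minimal stand-in for an open issue (hypothetical shape)."""
    number: int
    title: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of non-stop-word tokens."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOP_WORDS}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: shared tokens divided by all distinct tokens."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_related(new_title: str, open_issues: list[Issue],
                 threshold: float = SIMILARITY_THRESHOLD) -> list[tuple[Issue, float]]:
    """Processing step: score each open issue against the new title."""
    new_tokens = tokenize(new_title)
    scored = [(issue, jaccard(new_tokens, tokenize(issue.title)))
              for issue in open_issues]
    matches = [(issue, score) for issue, score in scored if score >= threshold]
    # Most similar issues first.
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


def triage(new_title: str, open_issues: list[Issue]) -> tuple[str, list[int]]:
    """Output step: 'flagged' with related issue numbers, or 'created'."""
    related = find_related(new_title, open_issues)
    if related:
        return "flagged", [issue.number for issue, _ in related]
    return "created", []
```

Called as `triage("Abbreviations in addresses do not resolve", open_issues)`, the sketch would flag any open issue whose title shares enough words with the new one, so the author can link or comment on it instead of opening a duplicate. Matching on titles alone is a deliberate simplification; issue bodies and labels would likely improve results at the cost of more noise.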
Action Items
[PLEASE ADD ACTION ITEMS (Dev lead/developer will add)]
Resources
- FOLA: 2026 PM/Dev Meeting Agenda #2612 (comment): the issue "Bug: Addresses with abbreviations like 601 S. San Pedro St don't resolve to an address when it should" (#2481) was so confusing
- GitHub’s close-as-duplicate feature and improved search/filters for issue management. ([The GitHub Blog]1)
- GitLab’s documented guidelines for related/duplicate issues and the use of templates/labels. ([The GitLab Handbook]2)
- FAST design community practices on automation for issue triage and duplicate closure. ([FAST]3)
- Academic research on clustering similar issues, showing ML-based approaches that could be part of a future issue clustering or suggestion system. ([ScienceDirect]5)
- Example workflow diagrams or pseudo-automation scripts to illustrate what a detection system might look like in practice.