Skip to content

[Bug][GitHub] Merged PRs incorrectly stored as closed with merged = false since March 2025 #8571

@jcg-juan

Description

@jcg-juan

Search before asking

  • I had searched in the issues and found no similar issues.

What happened

We’ve identified a data inconsistency with how DevLake ingests GitHub Pull Requests. Since around March 2025, all PRs that are merged in GitHub are being stored in DevLake as state = closed and merged = false. This results in:

  • Merged PRs not being reflected correctly in queries (state = 'MERGED' returns nothing after February).
  • Metrics such as additions/deletions for merged PRs show up as 0.
  • Dashboards and reports undercounting merged PR activity.

Impact:

  • Any downstream metrics that rely on merged PRs (e.g., code change size, lead time, DORA metrics) are currently inaccurate.
  • Dashboards show zero additions/deletions after Feb 2025.

What do you expect to happen

  • DevLake should correctly recognise merged PRs (state = 'MERGED') or, at a minimum, set the merged column to true when GitHub API reports it.
  • Additions/deletions should be populated for merged PRs the same way as open ones.

How to reproduce

Steps to Reproduce:

  • Ingest PR data from GitHub into DevLake.
  • Run the following query to check the latest PRs:
SELECT *
FROM public._tool_github_pull_requests
ORDER BY github_updated_at DESC
LIMIT 1000;
  • Observe that since March 2025, all PRs are either open or closed, never MERGED.
  • Additionally, merged column is always false, even when GitHub API reports merged = true.

Anything else

  1. Example GitHub API response for a merged PR (PR Welcome to Dev Lake Discussions! #1 before March working as expected):
curl -s -H "Accept: application/vnd.github+json" \
   -H "Authorization: Bearer <token>" \
   https://api.github.com/repos/<org>/<repo>/pulls/1 \
   | jq '{number, state, merged, additions, deletions}'

Response:

{
  "number": 1,
  "state": "closed",
  "merged": true,
  "additions": 203,
  "deletions": 0
}

From Devlake DB (table _tool_github_pull_requests): 
connection_id | github_id | repo_id | number | state | title | created_at | updated_at | closed_at | additions | deletions | changed_files | review_comments | comments | merged | merge_commit_sha | head_ref | base_ref | head_sha | base_sha | url | author | author_id | _created_at | _updated_at | raw_data_origin | raw_table | raw_data_id | org_id | installation_id | commits | is_draft -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- 1 | 2019299654 | 842380603 | 1 | MERGED | initial setup | 2024-08-14 13:22:49 | 2024-08-14 13:26:32 | 2024-08-14 13:26:31 | 203 | 0 | 0 | 0 | 0 | false | (redacted) | setup-action | main | (redacted) | (redacted) | (redacted) | dev-user | 132356641 | 2025-03-30 00:09:11.131044 | 2025-03-30 00:09:11.131044 | {"ConnectionId":1,"Name":"…"} | _raw_github_graphql_prs | 171 | (null) | 0 | 0 | false
  1. Example GitHub API response for a merged PR (PR Gitlab mr collection #73 after March NOT working as expected):
curl -s -H "Accept: application/vnd.github+json" \
   -H "Authorization: Bearer <token>" \
   https://api.github.com/repos/<org>/<repo>/pulls/73\
   | jq '{number, state, merged, additions, deletions}'

Response:

{
  "number": 73,
  "state": "closed",
  "merged": true,
  "additions": 4,
  "deletions": 1
}

From Devlake DB (table _tool_github_pull_requests):

connection_id | github_id | repo_id | number | state | title | created_at | updated_at | closed_at | additions | deletions | changed_files | review_comments | comments | merged | merge_commit_sha | head_ref | base_ref | head_sha | base_sha | url | author | author_id | _created_at | _updated_at | raw_data_origin | raw_table | raw_data_id | org_id | installation_id | commits | is_draft -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- 2 | 2815676617 | 842380603 | 73 | closed | Cicd usage script fix: check if action enabled | 2025-09-10 14:09:57 | 2025-09-10 16:05:53 | 2025-09-10 16:05:52 | 0 | 0 | 0 | 0 | 0 | false | (redacted) | feature-branch | main | (redacted) | (redacted) | (redacted) | dev-user | 131661649 | 2025-09-12 00:00:49.359316 | 2025-09-12 00:00:49.359316 | {"ConnectionId":2,"Name":"…"} | _raw_github_api_pull_requests | 30757 | (null) | 842380603 | 0 | false

Version

v1.0.2

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Metadata

Assignees

No one assigned

    Labels

    component/pluginsThis issue or PR relates to pluginsseverity/p0This bug blocks key user journey and functiontype/bugThis issue is a bug

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions