Skip to content

Conversation

@samobermiller
Copy link
Collaborator

@samobermiller samobermiller commented Dec 5, 2025

Closes microbiomedata/issues#1496

Github API limits the number of issues being pulled at a time to 100. This was causing the existing github issue search to fail to find matches for older submissions and create duplicate issues (instead of commenting on the existing ones). This PR updates the issue search function, so it pages through up to 2000 issues looking for matching submission ids.

Note for future:
We will soon reaching that 2000 issue limit in the issues repo. Since we don't want to limit the search to only open issues and since searching more than 2000 issues will slow down the Submit functionality, we may want to consider:

  • figuring out a way to tag the NMDC submission issues (separating them from just general issues people create) and only searching those
  • stopping the search as soon as we hit the first match (a single match should be the case anyway at this point) but this might not solve the problem for older submissions

Copy link
Collaborator

@JamesTessmer JamesTessmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a little surprised at how long it took to create the issue due to the multi page querying, have you tried other values for the issues per page or total number of pages?
Do you know if the number of issues per page is locked, I saw there's a variable for it in the params. It might be a little more efficient to do fewer queries in terms of page number, but ask for more issues per page.

That might not be an issue we can really solve in this PR.

@samobermiller
Copy link
Collaborator Author

I was a little surprised at how long it took to create the issue due to the multi page querying, have you tried other values for the issues per page or total number of pages? Do you know if the number of issues per page is locked, I saw there's a variable for it in the params. It might be a little more efficient to do fewer queries in terms of page number, but ask for more issues per page.

Github API only allows a max of 100 issues per page unfortunately. We can look into tags or one of the other options i mentioned above

@JamesTessmer
Copy link
Collaborator

I was a little surprised at how long it took to create the issue due to the multi page querying, have you tried other values for the issues per page or total number of pages? Do you know if the number of issues per page is locked, I saw there's a variable for it in the params. It might be a little more efficient to do fewer queries in terms of page number, but ask for more issues per page.

Github API only allows a max of 100 issues per page unfortunately. We can look into tags or one of the other options i mentioned above

Gotcha, I like the tag idea you mentioned. Somehow I completely missed those 2 bullets in the PR text oops. I saw you also commented on stopping search early if it finds a match, so I think we're in agreement on that point from my other comment :)

@JamesTessmer
Copy link
Collaborator

I think if you remove the extra loop I'm happy to call it good for now and approve, and we can revisit the tagging idea in the future since that seems like an issue that might need a little more discussion to solve.

Copy link
Collaborator

@JamesTessmer JamesTessmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Copy link
Collaborator

@pkalita-lbl pkalita-lbl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fine for now, but really we should not be searching for GitHub issues at all. See #1821 and the point 1 in #1821 (comment)

@JamesTessmer
Copy link
Collaborator

This seems fine for now, but really we should not be searching for GitHub issues at all. See #1821 and the point 1 in #1821 (comment)

I don't think I have seen that issue, but I do like the idea of storing the associated github issue as part of the metadata.

@pkalita-lbl pkalita-lbl changed the title 1496 issue creation comment on re submission isnt working Iterate over GitHub issues list when finding existing submission issue Dec 8, 2025
@samobermiller
Copy link
Collaborator Author

samobermiller commented Dec 8, 2025

I've seen that issue before but completely forgot about it, thank you Patrick! If it's alright with both of you, I will merge in this temporary fix and assign myself to that issue you referenced Patrick? That way something goes out with this release to catch the instances of older github submissions, but the task of switching away from an API search doesn't go off my radar? Or maybe I'm too late for this release @pkalita-lbl @JamesTessmer

@pkalita-lbl
Copy link
Collaborator

Yep that works for me

@JamesTessmer
Copy link
Collaborator

Sounds like a good plan

@samobermiller samobermiller merged commit 48b6d4b into main Dec 9, 2025
2 checks passed
@samobermiller samobermiller deleted the 1496-issue-creation-comment-on-re-submission-isnt-working branch December 9, 2025 02:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Issue creation & comment on re-submission isn't working

4 participants