Component version to GitHub tag matching.

I have been experiencing issues with capycli's GitHub interaction in `bom findsources`. Jobs take unexpectedly long, and memory consumption is correspondingly high (though that isn't an issue in itself). I use capycli to process relatively large BOMs and, according to capycli's findings, I frequently have 400-500 third-party components from GitHub.

I tracked these issues down to how findsources maps component versions to tags on GitHub. Currently, capycli first retrieves the full list of a project's tags (`get_github_info()` in `capycli.bom.findsources`) and then iterates over this list, hoping to find a match for the version passed as a parameter to `get_matching_tag()`.

There are projects like [the tencentcloud sdk](https://github.com/tencentcloud/tencentcloud-sdk-go) with tens of thousands of tags. Using the GitHub API, capycli has to retrieve these in chunks of 100 tags per call using Python's synchronous I/O.
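To get a feel for what that means in sequential API calls, here is a rough back-of-the-envelope sketch. `requests_needed` is a made-up helper, and the tag counts are illustrative round numbers, not measured values for any real project.

```python
import math

PAGE_SIZE = 100  # tags returned per API call, as stated above


def requests_needed(tag_count: int, page_size: int = PAGE_SIZE) -> int:
    """Number of sequential page requests needed to list `tag_count` tags.

    Hypothetical helper for illustration only; it is not part of capycli.
    """
    return math.ceil(tag_count / page_size)


if __name__ == "__main__":
    # Illustrative tag counts (assumptions, not measurements).
    for n in (250, 10_000, 30_000):
        print(f"{n} tags -> {requests_needed(n)} requests")
```

Since the calls are synchronous, each of these requests has to complete before the next one starts.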

On average, `get_matching_tag()` does 109 negative comparisons for each tag it matches. This means that, in my use cases, capycli has to fetch on average two pages' worth of tags to match a component. This amounts to retrieving tencentcloud sdk alone.
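The difference between fetching everything first and stopping at the first match can be modelled without touching the network. The following is only a sketch under stated assumptions: `make_fake_api`, `iter_tags`, `find_eager` and `find_lazy` are hypothetical names, the "API" is an in-memory list, and plain string equality stands in for whatever comparison `get_matching_tag()` really performs.

```python
from typing import Callable, Iterator, List, Optional, Tuple

PAGE_SIZE = 100  # tags per API call, as described above

FetchPage = Callable[[int], List[str]]


def make_fake_api(all_tags: List[str]) -> Tuple[FetchPage, List[int]]:
    """Hypothetical in-memory stand-in for a paginated tags endpoint.

    Returns the page-fetching function plus a list that records which
    pages were requested, so the number of calls can be inspected.
    """
    calls: List[int] = []

    def fetch_page(page: int) -> List[str]:
        calls.append(page)
        start = (page - 1) * PAGE_SIZE
        return all_tags[start:start + PAGE_SIZE]

    return fetch_page, calls


def iter_tags(fetch_page: FetchPage) -> Iterator[str]:
    """Yield tags page by page; a short (or empty) page marks the end."""
    page = 1
    while True:
        tags = fetch_page(page)
        yield from tags
        if len(tags) < PAGE_SIZE:
            return
        page += 1


def find_eager(fetch_page: FetchPage, version: str) -> Optional[str]:
    """Model of the flow described above: fetch every page, then search."""
    all_tags = list(iter_tags(fetch_page))
    return next((t for t in all_tags if t == version), None)


def find_lazy(fetch_page: FetchPage, version: str) -> Optional[str]:
    """Alternative: stop requesting pages as soon as a tag matches."""
    return next((t for t in iter_tags(fetch_page) if t == version), None)


if __name__ == "__main__":
    tags = [f"v1.0.{i}" for i in range(1050)]  # made-up tag names
    for finder in (find_eager, find_lazy):
        fetch_page, calls = make_fake_api(tags)
        finder(fetch_page, "v1.0.109")  # 109 non-matching tags come first
        print(finder.__name__, "requested", len(calls), "pages")
```

In this model the wanted tag sits behind 109 non-matching ones, so the lazy variant needs only the first two pages while the eager one walks through all of them. Whether the real matching logic can stop early like this depends on details not covered here.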

As far as I can tell, ...
* `get_github_info()` is only ever used twice, with both occurrences in `capycli.bom.findsources`. Both uses feed virtually directly into `get_matching_tag()`.
* `get_matching_tag()` is only ever used three times, with all occurrences in `capycli.bom.findsources`. All of these results are essentially immediately `return`-ed.

Are there any uses of these methods I missed?


