feat(scanner): Optimize creation of file archives #11483

Open
oheger-bosch wants to merge 1 commit into oss-review-toolkit:main from boschglobal:oheger-bosch/scanner_optimize_file_archives

@oheger-bosch

Member

When processing provenances with multiple packages, the provenance was downloaded once per package to create the file archive. Prevent this by grouping the packages by provenance and performing only a single download per provenance. This can have a significant effect for large repositories containing many submodules.
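
The grouping idea described above can be illustrated with a minimal sketch. It is not the code from this PR: `Provenance`, `Pkg`, `ArchiveCreator`, and both helper functions are hypothetical stand-ins, and the "download" is only simulated by a counter.

```kotlin
// Hypothetical stand-ins for illustration; these are not ORT's actual types.
data class Provenance(val url: String, val revision: String)
data class Pkg(val id: String, val provenance: Provenance)

// Counts simulated downloads so the effect of grouping is visible.
class ArchiveCreator {
    var downloads = 0
        private set

    fun createArchive(provenance: Provenance): String {
        downloads++ // stands in for the expensive download and archive step
        return "archive:${provenance.url}@${provenance.revision}"
    }
}

// Before: one download per package, even when provenances repeat.
fun archivePerPackage(packages: List<Pkg>, creator: ArchiveCreator): Map<String, String> =
    packages.associate { it.id to creator.createArchive(it.provenance) }

// After: group by provenance first, then create one archive per group.
fun archivePerProvenance(packages: List<Pkg>, creator: ArchiveCreator): Map<Provenance, String> =
    packages.groupBy { it.provenance }
        .mapValues { (provenance, _) -> creator.createArchive(provenance) }

fun main() {
    val repo = Provenance("https://example.com/repo.git", "abc123")
    val other = Provenance("https://example.com/other.git", "def456")
    val packages = listOf(Pkg("a", repo), Pkg("b", repo), Pkg("c", repo), Pkg("d", other))

    val naive = ArchiveCreator()
    archivePerPackage(packages, naive)

    val grouped = ArchiveCreator()
    archivePerProvenance(packages, grouped)

    println("per package: ${naive.downloads} downloads")
    println("per provenance: ${grouped.downloads} downloads")
}
```

With three packages sharing one provenance and a fourth using another, the per-package variant triggers four simulated downloads and the grouped variant two. The saving grows with the number of packages per provenance, which is why repositories with many submodules benefit most.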

@codecov

codecov Bot commented Feb 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 58.43%. Comparing base (b83a940) to head (ac9d4be).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##               main   #11483   +/-   ##
=========================================
  Coverage     58.43%   58.43%           
  Complexity     1807     1807           
=========================================
  Files           361      361           
  Lines         13499    13499           
  Branches       1383     1383           
=========================================
  Hits           7888     7888           
  Misses         5115     5115           
  Partials        496      496           
Flag                     Coverage Δ
funTest-external-tools   14.64% <ø> (ø)
test-windows-2025        41.76% <ø> (ø)

Flags with carried forward coverage won't be shown.

@sergej-koscejev

Contributor

I tested this locally because I'm having a similar problem, but ort download ... --archive-all produced exactly the same archive as before, with multiple redundant copies of the same repository inside. Is this something that this PR was supposed to address (which would mean I probably made a mistake in my test) or was this out of scope?

@oheger-bosch

Member Author

This PR does not change the downloader, only the Scanner. So, it is expected that the behavior does not change.

@oheger-bosch force-pushed the oheger-bosch/scanner_optimize_file_archives branch 2 times, most recently from 650b3c2 to 5b54591 on June 8, 2026 06:55
@oheger-bosch marked this pull request as ready for review on June 8, 2026 07:01
@oheger-bosch requested a review from a team as a code owner on June 8, 2026 07:01

@sschuberth sschuberth left a comment

Member

LGTM besides nits.

Comment thread scanner/src/main/kotlin/Scanner.kt Outdated
Comment thread scanner/src/main/kotlin/Scanner.kt Outdated

A file archive is associated with a provenance. When processing
provenances with multiple packages, the archive was created repeatedly
for each package.

Prevent this by grouping the packages by provenance and creating only a
single archive per provenance. For large repositories containing many
submodules, this can significantly reduce the processing time of the
scanner.

Resolves oss-review-toolkit#11484.

Signed-off-by: Oliver Heger <oliver.heger@bosch.com>

@oheger-bosch force-pushed the oheger-bosch/scanner_optimize_file_archives branch from 5b54591 to ac9d4be on June 8, 2026 10:35
@oheger-bosch requested a review from sschuberth on June 8, 2026 10:36
@sschuberth enabled auto-merge (rebase) on June 8, 2026 10:45
3 participants