Conversation

Rupikz (Contributor) commented Jul 1, 2025

Description

Hey, I have an idea for implementing recursive scanning of nested archives.
I would appreciate it if you could share your vision for this feature and advise me on the direction I
should take next. Would this approach even work in theory?

Algorithm:

  • Unarchive
    • After the root directory is indexed, every archive found is extracted into a temp directory
    • Each extracted archive is then indexed in turn
    • If additional archives are found, the steps above are repeated
  • Search
    • FiletreeResolver searches through the main index and traverses all archive indices
    • If a file is found by path/glob... in an archive, its realPath/accessPath are substituted as follows:
      • location.realPath - path to the archive within the given scan path (example: /rootPath/archive.zip)
      • location.accessPath - virtual path to the file with nested archives (example:
        /rootPath/archive.zip/nestedArchive.tar/test.txt)
      • location.reference.realPath - absolute physical path to the file in the temporary archive directory (example:
        /temp/test.txt)
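
The steps above can be sketched roughly as a worklist loop. This is only an illustration of the idea, not Syft's actual API: `Location`, `ScanNested`, the `list` and `extract` callbacks, the extension list, and the `maxDepth` guard are all hypothetical names and assumptions introduced here. The sketch also assumes, based on the example paths, that realPath always points at the outermost archive.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Location mirrors the three paths described above.
// Field names are illustrative, not Syft's real types.
type Location struct {
	RealPath          string // outermost archive inside the scanned root (or the file itself)
	AccessPath        string // virtual path through all nested archives
	ReferenceRealPath string // physical path of the file on disk (temp dir for extracted files)
}

// pending is one directory that still has to be indexed.
type pending struct {
	dir         string // physical directory to index
	virtualDir  string // virtual path under which its files are reported
	rootArchive string // outermost archive containing dir; "" for the root itself
	depth       int    // nesting level, 0 for the root
}

// archiveExts is an assumed list; a real implementation would likely be configurable.
var archiveExts = []string{".zip", ".tar", ".tar.gz", ".tgz"}

func isArchive(name string) bool {
	lower := strings.ToLower(name)
	for _, ext := range archiveExts {
		if strings.HasSuffix(lower, ext) {
			return true
		}
	}
	return false
}

// ScanNested indexes root, extracts every archive it finds, and repeats on the
// extracted directories until no new archives appear or maxDepth is reached.
// list returns file paths relative to dir; extract unpacks an archive and
// returns the temp directory it was unpacked into. Both are hypothetical hooks.
// The path package is used for all joins to keep the sketch platform-neutral.
func ScanNested(
	root string,
	list func(dir string) []string,
	extract func(archive string) (string, error),
	maxDepth int,
) ([]Location, error) {
	queue := []pending{{dir: root, virtualDir: root}}
	var found []Location
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		for _, rel := range list(cur.dir) {
			physical := path.Join(cur.dir, rel)
			virtual := path.Join(cur.virtualDir, rel)
			if isArchive(rel) && cur.depth < maxDepth {
				tmpDir, err := extract(physical)
				if err != nil {
					return nil, fmt.Errorf("extract %s: %w", virtual, err)
				}
				outer := cur.rootArchive
				if outer == "" {
					outer = virtual // a top-level archive is its own outermost archive
				}
				queue = append(queue, pending{tmpDir, virtual, outer, cur.depth + 1})
				continue
			}
			// Regular file, or an archive beyond maxDepth that is left unextracted.
			realPath := cur.rootArchive
			if realPath == "" {
				realPath = virtual
			}
			found = append(found, Location{realPath, virtual, physical})
		}
	}
	return found, nil
}
```

Passing the listing and extraction steps in as callbacks keeps the traversal independent of any particular archive format, and the depth limit is one possible guard against deeply nested or malicious archives.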

TODOs:

  • Update catalogers: deb-archive-cataloger, java-archive-cataloger, generic-cataloger...
  • Add recursive extraction to ImageSource?
  • Fix: location.RealPath currently does not work for archives in catalogers
  • Add a config parameter for excluded extensions, and others...
  • Add more tests

Related issues:

Type of change

  • Breaking change (please discuss with the team first; Syft is 1.0 software and we won't accept breaking changes without going to 2.0)

spiffcs (Contributor) commented Jul 1, 2025

👋 Hey @Rupikz, if you're seeing errors in testing on CI, we just had to bust and refresh our test fixture cache here:
#4042

If you just pull in the changes from main and rerun you should see the seemingly random test cases fixed with the latest data.

Signed-off-by: Kudryavcev Nikolay <[email protected]>