Scanning projects should not claim the same file for different projects (of the same type) #5365

@sschuberth

Currently, scanning in ORT is package-based, and project-packages are identified by "definition files" (like "pom.xml", "build.gradle", "package.json", etc.) in the directory tree. All files and directories below the directory that contains a definition file are therefore regarded as belonging to the project that the definition file defines. For example:

ROOTDIR
|
+-SUBPROJDIR_A
| |
| +-pom-a.xml
| |
| +-license-a.txt
|
+-SUBPROJDIR_B
| |
| +-pom-b.xml
| |
| +-license-b.txt
|
+-WEBDIR
| |
| +-package.json
| |
| +-license-w.txt
|
+-pom.xml
|
+-license.txt

So, the project spanned by pom.xml in the root directory is considered to "own" all files below the root directory, including SUBPROJDIR_A/license-a.txt, SUBPROJDIR_B/license-b.txt and WEBDIR/license-w.txt. This means that scanner findings in those files get associated with the root project.

However, when the scanner's view shifts to the projects in the subdirectories, the project spanned by SUBPROJDIR_A/pom-a.xml also gets the scan result for SUBPROJDIR_A/license-a.txt assigned (and similarly for the other subprojects).
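
To make the double-claiming concrete, here is a minimal sketch that models the example tree above. It is not ORT code: the `DEFINITION_FILES` mapping, the type labels, and the `claiming_projects` helper are hypothetical names chosen for illustration, and the sketch assumes a project claims every file below the directory of its definition file.

```python
from pathlib import PurePosixPath

# Hypothetical model of the example tree: definition file -> project type.
DEFINITION_FILES = {
    "pom.xml": "Maven",
    "SUBPROJDIR_A/pom-a.xml": "Maven",
    "SUBPROJDIR_B/pom-b.xml": "Maven",
    "WEBDIR/package.json": "NPM",
}


def claiming_projects(file_path, definition_files):
    """Return every definition file whose directory encloses file_path."""
    enclosing_dirs = PurePosixPath(file_path).parents
    return sorted(
        def_file
        for def_file in definition_files
        if PurePosixPath(def_file).parent in enclosing_dirs
    )


for path in ("license.txt", "SUBPROJDIR_A/license-a.txt", "WEBDIR/license-w.txt"):
    print(path, "is claimed by", claiming_projects(path, DEFINITION_FILES))
```

In this model every file in a subdirectory is claimed twice, once by the subproject and once by the root project, which is why a single finding can show up multiple times.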

This is historically so because ORT does not really understand the semantics of a project's directory tree. However, the result can be really confusing, as scan findings (and potential violations) might show up multiple times in the reports, although they all stem from the same single file.

As a solution to this, one idea is to always associate a file only with the nearest enclosing project found when walking up the directory tree to the root. Maybe this logic should be limited to projects of the same type; however, in the example above this would result in the scanner findings from WEBDIR/license-w.txt still being associated with the root project spanned by pom.xml.
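
The two variants of this idea could be sketched as follows. This is only an illustration under stated assumptions, not a proposal for the actual implementation: `DEFINITION_FILES`, the type labels, `nearest_project` and `nearest_project_per_type` are all hypothetical names, and the sketch ignores the case of several definition files of the same type in one directory (the first one found wins).

```python
from pathlib import PurePosixPath

# Hypothetical model of the example tree: definition file -> project type.
DEFINITION_FILES = {
    "pom.xml": "Maven",
    "SUBPROJDIR_A/pom-a.xml": "Maven",
    "SUBPROJDIR_B/pom-b.xml": "Maven",
    "WEBDIR/package.json": "NPM",
}


def nearest_project_per_type(file_path, definition_files):
    """Walk up from file_path and keep, per project type, the first
    (that is, nearest) enclosing definition file."""
    owners = {}
    # PurePosixPath.parents yields the closest directory first, the root last.
    for directory in PurePosixPath(file_path).parents:
        for def_file, project_type in definition_files.items():
            if PurePosixPath(def_file).parent == directory:
                owners.setdefault(project_type, def_file)
    return owners


def nearest_project(file_path, definition_files):
    """Return the single nearest enclosing definition file regardless of
    project type, or None if no project encloses the file."""
    for directory in PurePosixPath(file_path).parents:
        for def_file in definition_files:
            if PurePosixPath(def_file).parent == directory:
                return def_file
    return None


print(nearest_project("WEBDIR/license-w.txt", DEFINITION_FILES))
print(nearest_project_per_type("WEBDIR/license-w.txt", DEFINITION_FILES))
```

With the type-agnostic variant, WEBDIR/license-w.txt belongs only to WEBDIR/package.json. With the per-type variant it is owned by both WEBDIR/package.json and the root pom.xml, since no nearer Maven project exists, which is the caveat described above.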

I have some hopes that the required filtering logic would be easier to implement once #2668 is merged, as it implements some similar filtering to associate provenance-based scan results to individual packages IIUC.

Labels: scanner (about the scanner tool)