Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -296,6 +296,7 @@ private fun verifyBasicProject(result: ProjectAnalyzerResult) {
authors should beEmptyCollection()
isModified shouldBe false
isMetadataOnly shouldBe false
sourceCodeOrigins shouldBe emptyList()
}
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -81,7 +81,8 @@ private fun Dependency.toPackage(): Package =
authors = authors,
labels = labels,
isModified = isModified ?: false,
isMetadataOnly = isMetadataOnly ?: false
isMetadataOnly = isMetadataOnly ?: false,

@fviernau fviernau Jun 8, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this can easily hide mistakes, e.g. user forgets to specify the provenance.
I would prefer instead to add an attribute sourceCodeOrigins which the user can specify. What do you think?

side note: if only one of sourceArtifact or vcs is specified, sourceCodeOrigins still contains both (which is fine imo), but now slightly inconsistent.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer instead to add an attribute sourceCodeOrigins which the user can specify.

What should the default of this attribute be? If it's an emptyList() I'm fine with it, if it would be null this wouldn't solve the issue that I'm trying to solve with this PR.

Having dependencies without provenance is by design as described here in the minimal example. If the user is not specifying the provenance intentionally or accidentally can't be known by ORT.

Additionally, adding a sourceCodeOrigins attribute would also require us to handle something like:

  - purl: "pkg:maven/org.apache.commons/commons-lang3@3.14.0"
    sourceArtifact:
      url: "https://repo1.maven.org/maven2/org/apache/commons/commons-lang3/3.14.0/commons-lang3-3.14.0-sources.jar"
    sourceCodeOrigins: VCS

Which could only result in an error.

I'm not sure what benefit this attribute would provide.
If a user wants to prioritize VCS over sourceArtifact (or vice versa), they would only configure one of them.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What should the default of this attribute be? If it's an emptyList() I'm fine with it, if it would be null this wouldn't solve the issue that I'm trying to solve with this PR.

The default value imho should be null, analog to the default value of Package for consistency.
The semantics of null means, that the value configured here should be used.

Having dependencies without provenance is by design as described #10182 (comment). If the user is not specifying the provenance intentionally or accidentally can't be known by ORT.

An attempt to run the license scanner against a package without provenance fails by design, if not
explicitly curated in a way that the scanner skips the scanning. Why should it be different for OrtProjectFile?

Additionally, adding a sourceCodeOrigins attribute would also require us to handle something like:

I believe no handling would need to be added, and ORT's behavior as-is fits: The scanner would report an issue that it could not scan any provenance, which makes sense, because it is configured to try VCS only, for which it does not know the provenance.

I'm not sure what benefit this attribute would provide.

In my view, this would fit consistently into the existing design.

If in the downloader configuration you specify [VCS, SOURCE_ARTIFACT], then it would work for you also without specifying any sourceCodeOrigins for the package, or?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An attempt to run the license scanner against a package without provenance fails by design, if not
explicitly curated in a way that the scanner skips the scanning. Why should it be different for OrtProjectFile?

Because it was decided in #10182 and documented that this should be the minimal working example:

dependencies:
- purl: "pkg:maven/com.example/minimal@0.1.0"

It was not my understanding that this minimal example will only work with the ORT analyzer.

When used with the ORT Scanner, this example cannot be used because it will create a scanner issue, if the user is not specifying any of the attributes below:

  • VCS
  • sourceArtifact
  • isMetadataOnly

What would be your approach to make the minimum example work?

If in the downloader configuration you specify [VCS, SOURCE_ARTIFACT], then it would work for you also without specifying any sourceCodeOrigins for the package, or?

Not if I understand you correctly, because I don't want the scanner to try to download anything if the minimal example is being used.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it was decided in #10182 and documented

Would have have a more fine grained pointer. I've skimmed through this rather large ticket and could
not find that decision.

What would be your approach to make the minimum example work?

My proposal is to enhance the minimal example for running (analyzer + scanner) as follows:

dependencies:
- purl: "pkg:maven/com.example/minimal@0.1.0"
  source_code_origins: []

For just running analyzer, it could remain as-is.
I do not really understand why the scanner is run is no provenance is defined, and why the scanner should
not flag missing provenance, only for ORT project files but in all other cases.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a remark: The whole purpose of this pseudo package manager is to make it really simple for users to describe their dependencies. Although it is called OrtProjectFile, it may be therefore useful to come up with a DSL that better supports this goal. From a user's point of view, the proposed source_code_origins property and its consequences may not be easy to understand. Could there be a simpler way to indicate whether the package should be handled by the scanner? Just throwing in ideas:

  • There could be a source property with possible values like a VCS URL, an artifact URL, and some reserved value like IGNORE.
  • There could be a different indicator that the definition file uses a minimum syntax, maybe based on file naming conventions. In this case, different default values could be used for some properties.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would have have a more fine grained pointer. I've skimmed through this rather large ticket and could
not find that decision.

Yes, sorry.
The decision was made in a TSC meeting linking to a specific comment.

https://github.com/oss-review-toolkit/ort-governance/blob/main/minutes/2025-12-02.md#agenda

Overall format as outlined by #10182 (comment) is agreed (with above modifications). Only tiny stuff left for discussion in the PR.

My proposal is to enhance the minimal example for running (analyzer + scanner) as follows:

dependencies:
- purl: "pkg:maven/com.example/minimal@0.1.0"
  source_code_origins: []

Requiring that a user sets an empty list to specify that something is absent makes this more complex than it has to be. I would like to avoid this just to guard against "users that might forget to set the provenance".

I do not really understand why the scanner is run is no provenance is defined

Because the file format is designed to support both, dependencies with provenances and dependencies without it. My PR changes exactly that. The scanner is (implicitly) skipped if no provenance is set.

and why the scanner should not flag missing provenance, only for ORT project files but in all other cases.

Because when designing this format, we also had the use-case in mind that projects like this have to be supported:

dependencies:
  - purl: "pkg:maven/com.example/with-vcs@1.1.0"
    description: "Package with VCS defined, we expect a detected license to be present."
    vcs:
      type: "Git"
      url: "https ://git.example.com/with-vcs/"
      revision: "main"

  - purl: "pkg:maven/com.example/declared-only@1.1.0"
    description: |
      Package without any provenance, we only expect a declared license to be available.
    declaredLicenses:
      - "MIT"

  - purl: "pkg:maven/com.example/minimal@1.1.0"
    description: |
      Package without any provenance and license information.
      This might be used for self written packages that are scanned elsewhere,
      or for proprietary packages where no source code is available.
    declaredLicenses:
      - "LicenseRef-proprietary"

For other package managers a missing provenance should be flagged, because the metadata is not specified by ORT users but by maintainers of the (OSS) dependencies.

@fviernau fviernau Jun 10, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The decision was made in a TSC meeting linking to a specific comment.
https://github.com/oss-review-toolkit/ort-governance/blob/main/minutes/2025-12-02.md#agenda

I still cannot find an indication that silencing failing provenance resoltion was considered in that TSC discussion.

Requiring that a user sets an empty list to specify that something is absent

Well, we have the DownloaderConfiguration, and allows to configure sourceCodeOrigins. And semantics is, that the configured value is used by the downloader / scanner, unless an override is specified. So, as you phrased it above is imho misleading, because it sounds as if the user has to specify something explicitly, which is emtpy implicitly anyway, but I believe this is not the case.

Because the file format is designed to support both, dependencies with provenances and dependencies without it.

Just to make clear, with my above proposal I did not mean to add any limitation at all.
To my understanding, and my intention was, to still support all kinds of use cases.

In particular, also the use case to have the scanner flag issues in case there is no provenance, and not
to make this go unnoticed by default.

I'm not 100% sure if there is a misunderstanding still, or if we simply disagree. However, I would be interrested also in input from others.

@mnonnenmacher mnonnenmacher Jun 10, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my understanding the "minimal working example" that was discussed in the design phase was referring to the package manager itself, not that any subsequent tools like the scanner could not create issues.

Silently ignoring the package in the scanner if no source code location was provided also has the risk that users get the false impression that the license detection did not find any issues, so I would also prefer that users have to make this decision consciously. Otherwise, it could also be that they have just forgotten to provide the source code location and will never notice because no warning of error is shown.

If source_code_origins: [] is too unintuitive, what about adding a more speaking property like skip_scanner: true or source_code_available: false which is then internally mapped to source_code_origins: []?

Just for reference, neither of the proposed solutions follows ORT's philosophy that users should provide a reason why it is ok to ignore this package in the scanner like they would have when creating an issue resolution or a curation.

sourceCodeOrigins = if (sourceArtifact == null && vcs == null) emptyList() else null
Comment thread
sschuberth marked this conversation as resolved.
)

private fun Vcs.toVcsInfo(): VcsInfo =
Expand Down
Loading