Skip to content

Identifier type aliases #9900

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from
Closed

Identifier type aliases #9900

wants to merge 3 commits into from

Conversation

mnonnenmacher
Copy link
Member

@mnonnenmacher mnonnenmacher commented Feb 6, 2025

Add a mechanism to define aliases for renamed identifier types to ensure package curations and package configurations can still be matched after a renaming. Please see the commit messages for details.

@mnonnenmacher
Copy link
Member Author

@oss-review-toolkit/core-devs This is an attempt to solve the issue we currently have with the renaming of SpdxDocumentFile to SpdxDocument. This affects so many of our users, that we desperately need to revert that change. However, I wanted to try to find a better solution for the future, and this PR is an attempt to introduce backward compatibility. Unfortunately our architecture prevents implementing a clean solution.

I would like to get your feedback if you would prefer this approach or reverting the change, or if you have a better idea. But for the future we need to find a way to make such changes backward compatible. I'm out of office tomorrow, but we need to find a solution for this by Monday, so early feedback is appreciated.

* Return whether the [type][Identifier.type] of this [Identifier] matches the type of [other] by comparing them
* case-insensitively.
*/
fun Identifier.typeMatches(other: Identifier) = type.equals(other.type, ignoreCase = true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about naming this typeEquals() to align with the underlying implementation (and imo better fits)?

id.typeMatches(Identifier.EMPTY.copy(type = "TYPE")) shouldBe true
}

"return false if the type does not match" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better "is not equal ignoring case" instead of "not match", for symmetry.

Make the matching of the identifier type in package configurations case
insensitive to align with package curations.

Signed-off-by: Martin Nonnenmacher <[email protected]>
This prepares for extending the logic in a following commit.

Signed-off-by: Martin Nonnenmacher <[email protected]>
Add a map to define aliases for renamed identifier types and use that
when matching types.

Ideally, the aliases should come from the package manager
implementations where the renaming takes place, but that is not possible
because the matching logic is inside the `model` module which is a
dependency of the package manager plugins.

Signed-off-by: Martin Nonnenmacher <[email protected]>
@mnonnenmacher mnonnenmacher force-pushed the identifier-type-aliases branch from 6903b70 to aa94156 Compare February 6, 2025 16:19
@mnonnenmacher
Copy link
Member Author

@fviernau Before I address the comments, would you agree with the general approach?

Copy link

codecov bot commented Feb 6, 2025

Codecov Report

Attention: Patch coverage is 40.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 68.41%. Comparing base (c96f2e8) to head (aa94156).
Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
...del/src/main/kotlin/config/PackageConfiguration.kt 40.00% 0 Missing and 3 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #9900      +/-   ##
============================================
- Coverage     68.43%   68.41%   -0.02%     
  Complexity     1308     1308              
============================================
  Files           250      250              
  Lines          8880     8884       +4     
  Branches        921      924       +3     
============================================
+ Hits           6077     6078       +1     
  Misses         2413     2413              
- Partials        390      393       +3     
Flag Coverage Δ
funTest-non-docker 33.35% <0.00%> (-0.03%) ⬇️
test-ubuntu-24.04 36.13% <40.00%> (-0.01%) ⬇️
test-windows-2022 36.11% <40.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@fviernau
Copy link
Member

fviernau commented Feb 6, 2025

@fviernau Before I address the comments, would you agree with the general approach?

double check: The change only switch from equals to equalsIgnoreCase right ?

@MarcelBochtler
Copy link
Member

@fviernau Before I address the comments, would you agree with the general approach?

double check: The change only switch from equals to equalsIgnoreCase right ?

Not sure if I'm getting the question right, but this change is not only a change to ignore the case, it also introduces a mapping for backwards compatibility.
See the 3rd commit of this PR, especially this:

private val typeAliases = mapOf(
    "spdxdocument" to "spdxdocumentfile"
)

Would you agree with this approach, @fviernau?

@fviernau
Copy link
Member

fviernau commented Feb 7, 2025

See the 3rd commit of this PR

Indeed, I missed the third commit. (Which is the most important one I guess)

Would you agree with this approach, @fviernau?

Hm, so can you explain why it is not possible to change the type in your customers curations?
With the proposal, I do not see why exactly this would be needed in similar cases in the future, and I also do not see the plan how to get rid of that code again.

I guess, the problem is that the curations are not in a cenral repository, but in ort.yml files.
Users are not aware of the curation not matchin anymore, so a large amount of issues would pop up
leading to an inflow of support requests. If the users were aware of the issue, they could (I guess) fix
it real quick patch up their ort.yml and the problem would be solved. How to make users aware?

  1. you have centrally controlled polcy rules -> make a violation which flags use of renamed type in ort.yml
  2. otherwise -> make ort create an analyzer error for such issues in ort.yml

What do you think?

@MarcelBochtler
Copy link
Member

MarcelBochtler commented Feb 7, 2025

I guess, the problem is that the curations are not in a cenral repository, but in ort.yml files.

ORT offers both, curations in a central repository and curations in .ort.yml files.
We make use of both possibilities.

If the users were aware of the issue, they could (I guess) fix it real quick patch up their ort.yml and the problem would be solved.

Yes, the individual user can fix the issue quickly.
But requiring hundreds of users fixing their .ort.yml file for a rename for style reasons, is not feasible from our point of view.

  1. you have centrally controlled polcy rules -> make a violation which flags use of renamed type in ort.yml
  2. otherwise -> make ort create an analyzer error for such issues in ort.yml

Yes, that would all work, but as said, we don't think that it was worth breaking the users .ort.yml files in the first place, and we underestimated the impact what this change would have in the first place.

Do I read your message correctly, that you are against both laid out proposals by @mnonnenmacher?

  1. This proposal -> Make the change backwards compatible
  2. Reverting the change regarding the renaming

@fviernau
Copy link
Member

fviernau commented Feb 7, 2025

Do I read your message correctly, that you are against both laid out proposals by @mnonnenmacher?

I'm not against them. I just wanted to figure out whether there is an alternative.

My preference would be to revert the change, but if you prefer to merge this PR as a solution
that'd be ok for me too, as I don't have any better idea.

edit: Would a revert actually still guarantee that Project and Package use a different type for the SpdxDocumentFile analyzer?

@fviernau fviernau dismissed their stale review February 7, 2025 08:53

see my comment

@fviernau
Copy link
Member

fviernau commented Feb 7, 2025

one more idea to share: alternatively the de-serialization could be adjusted to do the mapping.
This would ensure that no other place would have to deal with this type matches, and we could still compare Identifier.toCoordinates() with == e.g. when debugging.

/**
* A map of lowercase aliases for types that were renamed.
*/
private val typeAliases = mapOf(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure whether a map is the right data structure here. To me, aliases are a set of alternative names, not a one-to-one relationship. I'd propose more something like we have for VcsType aliases, i.e. a list of names that are all treated equally (when consuming), and by convention use the first entry in the list when producing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is not possible because the matching logic is inside the model module which is a dependency of the package manager plugins.

Well, the matching logic could stay in the model, but the definition of aliases could still be moved to the plugins, simply by making projectType a list of aliases instead of a single string, as described above.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How should the logic in model get access to the aliases if they are defined in Gradle modules that depend on model? It doesn't even know what projectTypes the plugins define.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How should the logic in model get access to the aliases if they are defined in Gradle modules that depend on model?

Well, basically by creating instances via all factories for the package manager entry point. That super-dump, but technically possible, I believe.

It doesn't even know what projectTypes the plugins define.

Yes, that the projectTypes are not known to the factories already has caused me trouble with #9780 as well...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, basically by creating instances via all factories for the package manager entry point. That super-dump, but technically possible, I believe.

The model doesn't know about package manager factories because the interface is part of the analyzer module. I think a proper implementation requires to move the matching logic to a separate module which can depend on the analyzer for that reason.

@sschuberth
Copy link
Member

This is an attempt to solve the issue we currently have with the renaming of SpdxDocumentFile to SpdxDocument. This affects so many of our users, that we desperately need to revert that change.

It could be that a bug slipped in when introducing the projectType for the SpdxDocumentFile package manager, namely

should probably not unconditionally use projectType. I need to look into this again.

Also, I'd like to better understand how exactly your users are affected. Does it mostly break package curations, or package configurations? And for the latter, configurations for packages or for projects?

I'm not 100% convinced yet that we need such an alias mechanism going forward, but it could become useful again when adding a "Project" suffix to the ID types of projects. That's a use-case to keep in mind.

@fviernau
Copy link
Member

It could be that a bug slipped in when introducing the projectType for the SpdxDocumentFile package manager, namely

Nice finding. Could this be the root cause of the observed issue?

@mnonnenmacher
Copy link
Member Author

My preference would be to revert the change, but if you prefer to merge this PR as a solution that'd be ok for me too, as I don't have any better idea.

I'm not very happy about this implementation either, so I have created #9910 to revert the change instead.

edit: Would a revert actually still guarantee that Project and Package use a different type for the SpdxDocumentFile analyzer?

No, but that is neither the case now nor was it the case before.

@mnonnenmacher
Copy link
Member Author

mnonnenmacher commented Feb 10, 2025

Also, I'd like to better understand how exactly your users are affected. Does it mostly break package curations, or package configurations? And for the latter, configurations for packages or for projects?

It's both, package configurations and package curations defined in .ort.yml files. We have more than 50 affected repositories so the effort to update all of them is huge, especially if the reason for the change is mainly cosmetic right now.

When I reviewed the PR that introduced the change I completely underestimated the consequences.

I'm not 100% convinced yet that we need such an alias mechanism going forward, but it could become useful again when adding a "Project" suffix to the ID types of projects. That's a use-case to keep in mind.

I believe we need such a mechanism for user facing configuration changes to allow for smooth migrations. If we add the "Project" suffix we cannot expect our users to update hundreds of repositories from one day to the other without having a way to at least give them some time for the migration.

@mnonnenmacher
Copy link
Member Author

Closed in favor of #9910.

@mnonnenmacher mnonnenmacher deleted the identifier-type-aliases branch February 10, 2025 13:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants