Description
Currently, the license choice is applied to the "effective license". This has severe performance issues depending on the
input, which this proposal tries to improve on. The current approach works as follows:
- Create one large SPDX expression which combines each declared license and each detected license with the AND operator.
This can result in quite large SPDX expressions.
- Compute the set of "valid choices" for that SPDX expression and then apply all choices which are valid in the
context of the respective package.
Problem: Step 2 can be very slow if the number of distinct licenses is large and the expression contains several OR operators,
because every OR operand that is joined with AND can multiply the number of valid choices.
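
To illustrate the growth, here is a minimal sketch in Python. It uses a hypothetical, simplified expression model (nested tuples) and a hypothetical `valid_choices` helper. It is not ORT's actual implementation and only shows how the number of valid choices behaves when expressions are joined with AND.

```python
from itertools import product

# Hypothetical, simplified model (not ORT's API): an expression is either a
# license id (str) or a tuple ("AND" | "OR", left, right).

def valid_choices(expr):
    """Return the set of license sets that each satisfy the expression."""
    if isinstance(expr, str):
        return {frozenset([expr])}
    op, left, right = expr
    lefts, rights = valid_choices(left), valid_choices(right)
    if op == "OR":
        # Either side alone satisfies an OR.
        return lefts | rights
    # AND: every choice on the left pairs with every choice on the right.
    return {l | r for l, r in product(lefts, rights)}

def and_all(exprs):
    """Join a non-empty list of expressions with AND into one expression."""
    combined = exprs[0]
    for expr in exprs[1:]:
        combined = ("AND", combined, expr)
    return combined

# Ten made-up dual-licensed findings: "A0 OR B0", "A1 OR B1", ...
findings = [("OR", f"A{i}", f"B{i}") for i in range(10)]

combined_count = len(valid_choices(and_all(findings)))
separate_count = sum(len(valid_choices(f)) for f in findings)
print(combined_count, separate_count)  # prints "1024 20"
```

With ten independent two-way alternatives, the combined expression has 2^10 = 1024 valid choices, while handling each expression on its own only ever looks at 2 choices at a time.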
Proposal:
- Apply the license choices (as in step 2 above) separately to each distinct declared license and to each distinct detected license.
- Then combine the results into a larger SPDX expression by joining them with the AND operator, to obtain the effective license.
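
The two proposed steps could be sketched as follows. This is only an illustration under stated assumptions: the tuple-based expression model, the `(given, chosen)` representation of a choice, and the helpers `apply_choice`, `apply_choices`, `effective_license` and `render` are all hypothetical and not ORT's API. The check whether a choice is valid is left out to keep the sketch short.

```python
# Hypothetical, simplified model (not ORT's API): an expression is a license
# id (str) or a tuple ("AND" | "OR", left, right). A choice is a pair
# (given, chosen) of such expressions.

def apply_choice(expr, given, chosen):
    """Replace every occurrence of `given` inside `expr` with `chosen`."""
    if expr == given:
        return chosen
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    return (op, apply_choice(left, given, chosen), apply_choice(right, given, chosen))

def apply_choices(expr, choices):
    """Apply all choices, in order, to a single (small) expression."""
    for given, chosen in choices:
        expr = apply_choice(expr, given, chosen)
    return expr

def effective_license(declared, detected, choices):
    """Apply choices per distinct expression, then join the results with AND."""
    # dict.fromkeys keeps the first occurrence of each item and preserves order.
    distinct = list(dict.fromkeys(declared + detected))
    resolved = list(dict.fromkeys(apply_choices(e, choices) for e in distinct))
    if not resolved:
        return None
    result = resolved[0]
    for expr in resolved[1:]:
        result = ("AND", result, expr)
    return result

def render(expr):
    """Render an expression as text, parenthesizing nested different operators."""
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    parts = []
    for child in (left, right):
        text = render(child)
        if not isinstance(child, str) and child[0] != op:
            text = f"({text})"
        parts.append(text)
    return f" {op} ".join(parts)

declared = [("OR", "MIT", "GPL-2.0-only")]
detected = [
    ("OR", "MIT", "GPL-2.0-only"),
    "Apache-2.0",
    ("OR", "BSD-3-Clause", "LGPL-2.1-only"),
]
choices = [
    (("OR", "MIT", "GPL-2.0-only"), "MIT"),
    (("OR", "BSD-3-Clause", "LGPL-2.1-only"), "BSD-3-Clause"),
]

print(render(effective_license(declared, detected, choices)))
# prints "MIT AND Apache-2.0 AND BSD-3-Clause"
```

In this sketch each choice is only ever matched against one small expression, and the large AND expression is built from the already resolved parts.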
Idea: With the new approach, the license choices are computed for much smaller SPDX expressions. This should
drastically improve the performance, since the cost of solving NP-complete problems grows very quickly with the
input size N, while N is kept small here.
Note: There is a separate inconsistency which may become relevant here. The effective license in the policy rules
is computed from both the excluded and the non-excluded detected licenses. The reason for this was to keep +-isExcluded()
working in the rules. However, it leads to different results compared to the reporters, which first filter out the excluded
licenses and then apply the choice. E.g. the given part of a choice may match in one case but not in the other.