Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add exclude and include check in ini config #254

Merged
merged 17 commits into from
Feb 21, 2024

Conversation

tromai
Copy link
Member

@tromai tromai commented May 29, 2023

Close #133.

Feature description

Configuring exclude/include in Ini config.

We want to allow the users to:

  • exclude checks that they don't want to run.
  • include the checks that they want to run.

Because checks in Macaron depend on each other (using the depends_on attribute). Therefore, when we want to disable a check (from users' exclude/include configuration), we also need to handle its transitive parents and children.

  • EXPLICIT_INCLUDE = checks included from the user configuration
  • EXPLICIT_EXCLUDE = checks included from the user configuration
  • EFFECTIVE_INCLUDED = EXPLICIT_INCLUDE set plus transitive parents
  • EFFECTIVE_EXCLUDED = EXPLICIT_EXCLUDE set plus transitive children
  • FINAL_INCLUDED = EFFECTIVE_INCLUDED minus EFFECTIVE_EXCLUDED

The user should define the list of str with each element as an ID of a check or a glob pattern (similar to patterns used in glob) which will be used to match multiple check IDs. This configuration is located in the defaults.ini file.

[analysis.checks]
# By default, we don't exclude any checks.
exclude =
# By default, we run all checks available.
include = *

Example on glob patterns:

  • mcn_* would match all checks with IDs start with mcn_.
  • * would match all checks registered in Macaron.
  • mcn_boo_1 would match a single check with ID mcn_boo_1

Displaying the run checks

  • In the log messages, we want to display the list of checks that are going to run and excluded before the analysis.
  • In the HTML reports, we want to have a section just before the Check report table
    • Show all the checks in a "tree-like" view
    • The excluded checks are faded-out or crossed out indicating its disabled status.
    • This section is in a drop down section so that we could hide it when the user first open the report.

For example, this could be what we display in the HTML reports given that mcn_build_as_code_1 is excluded.

image

Implementation

Background

When a check is registered in Macaron, we update the 3 following attribute of the Registry class:

  • _all_checks_mapping : this is the dictionary that maps between the check id (type str) and the instance of a check (inherit from BaseCheck).
  • _check_relationships_mapping:
    • Has type dict[str, dict[str, CheckResultType]]:
      • Key (str): the ID of the parent check
      • Value (dict[str, CheckResultType]): the mapping between the children that depend on the parent check to the check status that each of them depend on (e.g CheckResultType.PASSED):
        • Key (str): the id of a child check
        • Val (CheckResultType): the status that this child check depends on.
    • This dictionary is used to make sure that the relationships are defined correctly (e.g. one check cannot depend on multiple parents).
  • _graph: this is the final graph where we traverse to get the next available check in the correct topological order when we run the analysis.
    • This graph is updated each time a new check is registered.
    • This graph will add the necessary edges between a newly registered check and its parent.

To describe those attributes conceptually:

  • _all_checks_mapping: includes the nodes in the check relationship tree.
  • _check_relationships_mapping: it is sort of an adjacency list that described the edges in the check relationship tree.
  • _graph: is built from the 2 above attributes and is used for getting the topological order of the checks.

Solution

  • The list of run checks are computed when the method registry.prepare() is run. This list will be stored in this attribute.
  • Even though we have known the list of run checks and excluded checks, _all_checks_mapping, _check_relationships_mapping and _graph still contain all of the registered checks.
  • When the checks are run within the registry.scan() method (which still goes through the check tree will all registered checks), checks which are not included in this list will not be run. Therefore, the result for those checks will not be included in the reports and the database.
  • Please see commit bb3178f for more details.

@tromai tromai added the feature A new feature request label May 29, 2023
@tromai tromai self-assigned this May 29, 2023
@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label May 29, 2023
@tromai tromai force-pushed the 133-allow-disabling-a-check branch from eddf950 to 1fbdcab Compare June 5, 2023 01:34
@tromai tromai marked this pull request as ready for review June 13, 2023 04:08
@tromai tromai requested a review from behnazh-w as a code owner June 13, 2023 04:08
@tromai tromai force-pushed the 133-allow-disabling-a-check branch from 19f6679 to 952db85 Compare June 14, 2023 05:01
@tromai tromai marked this pull request as draft June 14, 2023 05:16
@tromai tromai marked this pull request as ready for review June 14, 2023 05:16
@tromai tromai force-pushed the 133-allow-disabling-a-check branch from 952db85 to c06dbd9 Compare June 14, 2023 05:53
@tromai tromai changed the title feat: add exclude and include check feat: add exclude and include check in ini config Jun 15, 2023
@tromai tromai force-pushed the 133-allow-disabling-a-check branch from 7556972 to d8758d0 Compare June 15, 2023 03:08
@tromai tromai marked this pull request as draft July 25, 2023 01:36
@tromai tromai added the checks The issues related to Macaron checks label Aug 22, 2023
@tromai tromai linked an issue Aug 22, 2023 that may be closed by this pull request
2 tasks
@tromai tromai marked this pull request as ready for review February 11, 2024 08:07
@nathanwn
Copy link
Member

I'm not sure if this is in-scope of this PR, but I am seeing some issues with the attributes _graph and _is_graph_ready in the Registry class:

https://github.com/oracle/macaron/blob/73962d8af12b0fcca08afe54323de266ac6c826b/src/macaron/slsa_analyzer/registry.py#L50C1-L60C28

These attributes serve the purpose of working out the correct execution order of checks. I think the class scope is too wide of a scope for them: they are only needed once we start executing the checks, and not needed when the Registry object is created and we are still population checks into the registry. The two attributes _all_checks_mapping and _check_relationships_mapping already serve the purpose of storing the check graph, so the _graph attribute is not needed for this purpose. One evidence of the _graph attribute being scoped too wide is the fact that we need to create a deep copy of it whenever we start executing the checks:

https://github.com/oracle/macaron/blob/73962d8af12b0fcca08afe54323de266ac6c826b/src/macaron/slsa_analyzer/registry.py#L471C1-L472C42

If we need to deepcopy like this, then it is a sign that this _graph: TopologicalSorter object is not supposed to be a long-living object, but rather something that should be used once and discarded. Hence, it is probably better to recreate a new topological sorter from the graph representation (i.e. _all_checks_mapping and _check_relationships_mapping) every time we need to work out check order.

I realized this issue when I looked at the unit tests that involve Registry objects: the creation of a Registry object also forces us to create a TopologicalSorter, which initially looks weird to me.

Furthermore, now that we have incorporated the include and exclude checks feature, the topological sorter can be created after the set of checks to be run has been worked out, which means it can just take these checks instead of all registered checks. However, I consider this to be not as relevant as an improvement compared to removing _graph and _is_graph_ready from the class scope of Registry.

@nicallen
Copy link
Member

nicallen commented Feb 16, 2024

I'm not sure if this is in-scope of this PR, but I am seeing some issues with the attributes _graph and _is_graph_ready in the Registry class...

I agree that the Registry class is doing too much and has overly convoluted statefulness going on. As I see it, there are 3 main phases of the Registry class state, and different operations are incorrect to invoke during each, making it confusing to use since it is not obvious when something is safe to call or what invariants hold at what times.

The 3 phases are: when checks are being registered (the check graph is mutable, checks referenced may not yet exist, not safe to invoke operations that need a stable, complete graph), after check registration is finished, but topological sorting has not been done, and finally after topological sorting is done (not safe to repeat topological sorting).

An improvement would be to refactor these phases into 3 separate classes, maybe something like CheckRegistry which only allows adding checks, then an operation that validates and finalises the graph, returning a CheckGraph, which has the operations for traversing the complete graph (and doesn't have the ability to add further checks), and when you actually need to run the checks, a CheckSchedule can be constructed which takes the CheckGraph and computes the topologically-sorted run order.

This is a general improvement suggestion, this doesn't have to be done as part of this PR.

…ifier to scorecard and exclude the mcn_provenance_level_three_1 check for e2e output of slsa-verifier integration test
…lsa-verifier test case in integration_test_docker
@tromai
Copy link
Member Author

tromai commented Feb 18, 2024

Thanks @nicallen @nathanwn for the feedback regarding the current state of our Registry class. I agree with both comments about the limitation of it. The issues everyone described there are what I encountered (somewhat) during the design and implementation of this feature. I will create a issue for this improvements of registering and scheduling the checks.
Update: the GH issue has been created #645

Copy link
Member

@nathanwn nathanwn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for working on this PR.

src/macaron/slsa_analyzer/registry.py Outdated Show resolved Hide resolved
scripts/dev_scripts/integration_tests.sh Outdated Show resolved Hide resolved
src/macaron/config/defaults.ini Outdated Show resolved Hide resolved
src/macaron/graph/graph.py Outdated Show resolved Hide resolved
src/macaron/config/defaults.ini Show resolved Hide resolved
src/macaron/output_reporter/templates/base_template.html Outdated Show resolved Hide resolved
src/macaron/slsa_analyzer/analyzer.py Outdated Show resolved Hide resolved
src/macaron/slsa_analyzer/registry.py Outdated Show resolved Hide resolved
@tromai tromai merged commit 744335d into staging Feb 21, 2024
10 checks passed
@tromai tromai deleted the 133-allow-disabling-a-check branch February 21, 2024 08:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
checks The issues related to Macaron checks feature A new feature request OCA Verified All contributors have signed the Oracle Contributor Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow disabling a check
5 participants