
Thoughts about / proposal for incorporating an extent into the score #43

@AlexanderWert

Description


Observation

While implementing a POC of the Instrumentation Score and the corresponding rule evaluations, I observed that the instrumentation score (the actual number) can feel misleading when the actual extent of rule violations is very low.

This is mainly because the current formula for the instrumentation score treats each rule evaluation result as binary (i.e. the rule has either passed or failed).

Example Scenario

Let's assume the following scenario.

  • I have a lot of data and nearly all of the rules are violated.
  • This would result in a very low instrumentation score (let's say 10), indicating that the quality of my data is really poor
  • But in reality, for each of the rules only a negligible amount of data is actually violating the rules.
    (e.g. for rule SPA-002 there are only a couple of traces out of THOUSANDS that have orphan spans)
  • Such "negligible" violations may not be an issue in practice, but they degrade the instrumentation score significantly.

So, while the calculated score is 10, a more natural score for such a scenario would be 80 or so.
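To make the gap concrete, assume for illustration a simplified, unweighted score of $100 \cdot P / R$ (with $P$ passed rules out of $R$ total; the actual spec weights rules by impact level). With 10 rules of which 9 are violated, the score is $100 \cdot 1/10 = 10$, regardless of whether each violated rule affects 100% or 0.1% of the underlying data.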

Proposal

What if the instrumentation score formula included something like a per-rule extent that takes the aspect described above into account?

What could that look like?

  • For each target type (i.e. SPA, LOG, MET) / rule we define the "extent unit". For SPA-type rules we would count the number of traces or spans, for metric rules the number of metrics, for log rules the number of log entries, etc.

  • Based on that, each rule evaluation $j$ (for impact level $i$) would return the extent $E_{ij}$ (a number between 0 and 1). If the accurate extent cannot be calculated, an approximation might be good enough (for example through sampling of the data). The value per rule would have the following meaning:

    • 0: no occurrences of violations found → equivalent to the rule having passed / being successful
    • 0.4: 40% of the relevant data entities violate the rule (e.g. for SPA-002 40% of traces have orphan spans)
    • 1: All of the relevant data is violating the rule
  • In the following spec part:

    Let $P_i$ be the number of rules passed, or succeeded, for impact level $L_i$.

    $P_i$ could be replaced with something like $$P_i = \sum_{j=1}^{R_i} (1 - E_{ij})$$ (with $R_i$ being the number of rules for impact level $i$ and $E_{ij}$ being the extent result of rule $j$ within that impact level)

The above equals the current definition for $E_{ij}$ values of 0 or 1, but for any value in between the instrumentation score would take into account that only a part of the data is affected.
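A minimal sketch of the proposed extent-weighted scoring, assuming for simplicity a single impact level and an unweighted score of $100 \cdot P / R$ (the actual spec aggregates across weighted impact levels; the rule counts and extents below are illustrative, not from the spec):

```python
def extent(violating: int, total: int) -> float:
    """Fraction of relevant data entities violating a rule (0..1).
    E.g. for a SPA-002-style rule: traces with orphan spans / all traces."""
    return violating / total if total else 0.0

def binary_score(extents: list[float]) -> float:
    """Current behaviour: a rule counts as failed for any extent > 0."""
    r = len(extents)
    p = sum(1.0 for e in extents if e == 0.0)  # rules passed (binary)
    return 100.0 * p / r if r else 100.0

def extent_weighted_score(extents: list[float]) -> float:
    """Proposed behaviour: replace the binary pass count P
    with sum(1 - E_j) over all rules j."""
    r = len(extents)
    p = sum(1.0 - e for e in extents)  # extent-weighted "rules passed"
    return 100.0 * p / r if r else 100.0

# Scenario from above: 10 rules, 9 of them violated, but each only by
# a couple of entities out of thousands (extent of 2 in 1000 per rule).
extents = [extent(2, 1000)] * 9 + [0.0]
print(binary_score(extents))            # 10.0 -- reads as very poor quality
print(extent_weighted_score(extents))   # ~99.8 -- reflects the tiny extent
```

For extents of exactly 0 or 1, both functions agree, which matches the observation that the proposal is backwards-compatible with the current binary definition.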

Just dropping this here as an idea for discussion, but I feel this could improve the comparability / objectivity of the instrumentation score.
