Observation
When I was doing a POC of implementing the Instrumentation Score and the corresponding rule evaluations, I observed that the instrumentation score (the actual number) can feel misleading when the actual extent of rule violations is very low.
This is mainly because the current formula for the instrumentation score treats each rule evaluation result as binary (i.e. the rule has either passed or not).
Example Scenario
Let's assume the following scenario:
- I have a lot of data and nearly all of the rules are violated.
- This results in a very low instrumentation score (let's say 10), indicating that the quality of my data is really poor.
- But in reality, for each of the rules only a negligible amount of data actually violates it (e.g. for rule SPA-002 there are only a couple of traces out of thousands that have orphan spans).
- Such "negligible" violations may not be an issue in practice but degrade the instrumentation score significantly.

So, while the calculated score is 10, a more natural score for such a scenario would be 80 or so.
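To make the scenario concrete, here is a minimal sketch comparing the current binary scoring with an extent-weighted variant. The function names and the flat average over rules are assumptions for illustration; the actual spec additionally weights rules by impact level.

```python
def binary_score(extents):
    # Current behaviour (assumed simplified form): any violation at all
    # (extent > 0) fails the rule outright.
    passed = sum(1 for e in extents if e == 0)
    return 100 * passed / len(extents)

def extent_score(extents):
    # Proposed behaviour: each rule contributes its non-violating fraction.
    return 100 * sum(1 - e for e in extents) / len(extents)

# Ten rules, nine of them "violated", but each only by ~2% of the data.
extents = [0.0] + [0.02] * 9

print(binary_score(extents))  # 10.0
print(round(extent_score(extents), 1))  # ~98
```

With these made-up numbers the extent-weighted score lands even higher than the "80 or so" in the scenario text, but the direction of the effect is the same: negligible violations no longer collapse the score.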
Proposal
What if the instrumentation score formula included something like an extent per rule that takes the above-described aspect into account?
What could that look like?
- For each target type (i.e. SPA, LOG, MET) / rule we define an "extent unit". For SPA-type rules we would count the number of traces or spans; for metrics, the number of metrics; for logs, the number of log entries; etc.
- Based on that, each rule evaluation $r$ (for impact level $i$) would return the extent $E_{ir}$, which is a number between 0 and 1. If the accurate extent cannot be calculated, an approximation might be good enough (for example through sampling of the data). The value per rule would have the following meaning:
  - 0: no occurrences of violations found, i.e. the same as the rule having passed
  - 0.4: 40% of the relevant data entities violate the rule (e.g. for SPA-002, 40% of traces have orphan spans)
  - 1: all of the relevant data violates the rule
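As a sketch of how such an extent could be computed for a SPA-002-style rule ("no orphan spans"), including the sampling approximation mentioned above. The trace representation and the function name are hypothetical, not part of the spec:

```python
import random

def orphan_span_extent(traces, sample_size=None):
    """Fraction of traces containing at least one orphan span (0..1).

    If sample_size is given and smaller than the data set, approximate
    the extent on a random sample instead of scanning every trace.
    """
    if sample_size is not None and sample_size < len(traces):
        traces = random.sample(traces, sample_size)
    violating = sum(1 for t in traces if t.get("has_orphan_span"))
    return violating / len(traces)

# 2 violating traces out of 1000 -> extent 0.002 instead of a hard "failed".
traces = [{"has_orphan_span": i < 2} for i in range(1000)]
print(orphan_span_extent(traces))  # 0.002
```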
- In the following spec part: let $P_i$ be the number of rules passed, or succeeded, for impact level $L_i$. $P_i$ could be replaced with something like

  $$P_i = \sum_{j=1}^{R_i} (1 - E_{ij})$$

  (with $R_i$ being the number of rules for impact level $i$ and $E_{ij}$ being the extent result of rule $j$ within that impact level)
The above equals the current definition when every extent is either 0 or 1, i.e. for strictly binary rule results.
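A quick sketch of that backward compatibility, assuming the summation form proposed above (the function name is made up for illustration):

```python
def passed_rules(extents):
    # P_i = sum over rules j of (1 - E_ij) for one impact level i
    return sum(1 - e for e in extents)

# Binary extents (current world): two rules with extent 0 have passed,
# so the sum equals the plain count of passed rules.
print(passed_rules([0, 1, 0, 1, 1]))  # 2

# Fractional extents (proposed world): partially violated rules still
# contribute their non-violating share.
print(passed_rules([0.0, 0.25, 0.5]))  # 2.25
```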
Just dropping this here as an idea for discussion, but I feel this could improve the comparability / objectiveness of the instrumentation score.