Weighted average formula doesn't match stated "grade capping" behavior #44

@kuklyy

Description

The spec says the formula caps scores when Critical rules fail (like SSL Labs), but the math doesn't actually do that.

What the spec claims:

"This structure ensures that major deficiencies act as a significant deterrent, potentially capping the achievable score, aligning with lessons from prior art like SSL Labs."

"Instrumentation Score's formula using Critical/Important rules directly mirrors [SSL Labs'] effective approach."

What it actually does:

Score = (Σ(P_i × W_i)) / (Σ(T_i × W_i)) × 100

This is a weighted average. It doesn't cap anything: Critical rules just carry more weight (40, versus 10 for Low).

Example:

A service that is missing service.name (a Critical rule), evaluated against 40 total rules:

  • 10 Critical: 9 pass, 1 fail (weight = 40)
  • 10 Important: 6 pass, 4 fail (weight = 30)
  • 10 Normal: 8 pass, 2 fail (weight = 20)
  • 10 Low: 5 pass, 5 fail (weight = 10)
Score = ((9×40) + (6×30) + (8×20) + (5×10)) / ((10×40) + (10×30) + (10×20) + (10×10)) × 100
      = (360 + 180 + 160 + 50) / (400 + 300 + 200 + 100) × 100
      = 750 / 1000 × 100
      = 75.0 ("Good")
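The arithmetic above can be reproduced with a minimal sketch. It assumes P_i is the number of passing rules at impact level i, T_i the total number of rules at that level, and W_i the level's weight, as the worked example implies; the names `WEIGHTS` and `weighted_score` are hypothetical and not taken from the spec.

```python
# Weights per impact level, as used in the worked example above.
WEIGHTS = {"Critical": 40, "Important": 30, "Normal": 20, "Low": 10}


def weighted_score(passed, total):
    """Score = sum(P_i * W_i) / sum(T_i * W_i) * 100 (plain weighted average)."""
    numerator = sum(passed[level] * w for level, w in WEIGHTS.items())
    denominator = sum(total[level] * w for level, w in WEIGHTS.items())
    return 100 * numerator / denominator


# 10 rules at each of the four impact levels (40 rules in total).
total = {level: 10 for level in WEIGHTS}

# The mixed case from the example: one Critical failure plus other failures.
mixed = {"Critical": 9, "Important": 6, "Normal": 8, "Low": 5}

# Everything passes except a single Critical rule.
near_perfect = {"Critical": 9, "Important": 10, "Normal": 10, "Low": 10}

print(weighted_score(mixed, total))         # 75.0
print(weighted_score(near_perfect, total))  # 96.0
```

Nothing in this function inspects whether a Critical rule failed; the failure only lowers the numerator by its weight.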

Should a service with a Critical failure score "Good"?

Comparison:

| Approach | 1 Critical fail + mixed | Perfect except 1 Critical |
|----------|-------------------------|---------------------------|
| Current  | 75.0 ("Good")           | 96.0 ("Excellent")        |
| SSL Labs | ≤ 74 ("Needs Improvement") | ≤ 74 ("Needs Improvement") |

Weighted average: Critical failures hurt more, but you can compensate with other passing rules.

SSL Labs capping: Missing TLS 1.2 = max grade C, period.
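For contrast, here is a rough sketch of what an SSL Labs-style cap could look like on top of the same weighted average. It is only one possible design, not something the spec defines: the cap value of 74 is taken from the "≤ 74" row in the comparison above, and `CRITICAL_CAP` and `capped_score` are hypothetical names.

```python
WEIGHTS = {"Critical": 40, "Important": 30, "Normal": 20, "Low": 10}

# Assumed cap: the highest score still below "Good" in the comparison above.
CRITICAL_CAP = 74


def weighted_score(passed, total):
    """Plain weighted average, as in the current formula."""
    numerator = sum(passed[level] * w for level, w in WEIGHTS.items())
    denominator = sum(total[level] * w for level, w in WEIGHTS.items())
    return 100 * numerator / denominator


def capped_score(passed, total, cap=CRITICAL_CAP):
    """Weighted average, limited to `cap` when any Critical rule fails."""
    score = weighted_score(passed, total)
    if passed["Critical"] < total["Critical"]:
        return min(score, cap)
    return score


total = {level: 10 for level in WEIGHTS}
mixed = {"Critical": 9, "Important": 6, "Normal": 8, "Low": 5}
near_perfect = {"Critical": 9, "Important": 10, "Normal": 10, "Low": 10}

print(capped_score(mixed, total))         # 74
print(capped_score(near_perfect, total))  # 74
print(capped_score(total, total))         # 100.0
```

With a cap like this, passing more non-Critical rules can never lift a service with a Critical failure above the cap, which is the behavior the spec text describes.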

The dilution problem:

As we add more rules to improve coverage, the impact of any single Critical failure gets diluted in the weighted average.
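The dilution can be made concrete with a small sketch. It assumes, purely for illustration, that every impact level has the same number of rules and that exactly one Critical rule fails while everything else passes; `score_with_one_critical_failure` is a hypothetical helper.

```python
WEIGHTS = {"Critical": 40, "Important": 30, "Normal": 20, "Low": 10}


def score_with_one_critical_failure(rules_per_level):
    """Weighted-average score when one Critical rule fails and all others pass."""
    total_weight = rules_per_level * sum(WEIGHTS.values())
    return 100 * (total_weight - WEIGHTS["Critical"]) / total_weight


for rules_per_level in (10, 20, 40):
    total_rules = rules_per_level * len(WEIGHTS)
    print(total_rules, score_with_one_critical_failure(rules_per_level))
# 40 96.0
# 80 98.0
# 160 99.0
```

Under these assumptions the penalty for the single Critical failure is 40 / n points for n rules per level, so it halves each time the rule set doubles.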

Questions:

  1. Is this intentional? The spec text suggests capping was the goal.
  2. What should "Critical" mean if it doesn't block good scores?
  3. Should we fix the formula or update the spec text?

Users will expect Critical failures to matter. As the rule set grows, services will be able to achieve "Excellent" scores despite Critical failures.
