This repository was archived by the owner on Dec 5, 2019. It is now read-only.

Clean Up telemetry-alerts notification #613

@fbertsch

I think ATMO sends too many notifications. For example, failing-job should probably be removed, since it's just an expected failure. Lots of jobs fail every day, which makes it difficult to tell which failures are important and which aren't.

Maybe we can have some sort of tiered alerts, e.g.:

  1. We alert on the first failure after a success
  2. We alert on each Nth failure after the first failure (N=7 would mean once a week)
  3. We ensure follow-up on failure, and require job removal/deactivation after some Mth failure.
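
To make the tiers concrete, here is a rough sketch of the decision logic in Python. It is not taken from the ATMO codebase: the names `Action` and `alert_action`, the idea of tracking a per-job count of consecutive failures, and the default `m=28` are all assumptions for illustration. Only `n=7` comes from the example above.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical outcomes of the tiered policy (names are illustrative)."""
    NONE = "none"
    ALERT = "alert"
    REQUIRE_DEACTIVATION = "require_deactivation"


def alert_action(consecutive_failures: int, n: int = 7, m: int = 28) -> Action:
    """Pick an action from the number of failures since the last success.

    consecutive_failures: 0 means the latest run succeeded.
    n: alert on every nth failure after the first one (7 = roughly weekly
       for a daily job).
    m: assumed threshold at which the job must be removed or deactivated.
    """
    if n < 1 or m < 1:
        raise ValueError("n and m must be positive")
    if consecutive_failures <= 0:
        return Action.NONE
    if consecutive_failures >= m:
        # Tier 3: too many failures in a row, escalate.
        return Action.REQUIRE_DEACTIVATION
    if (consecutive_failures - 1) % n == 0:
        # Tier 1 (first failure) and tier 2 (every nth failure after it).
        return Action.ALERT
    return Action.NONE
```

With the defaults, a daily job that keeps failing would alert on failures 1, 8, 15 and 22, stay quiet in between, and be flagged for deactivation at failure 28. The counter would need to be reset to 0 on any successful run.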
