Webhooks: support flow process lifecycle events and connect those to Slack alerts #319

@zaychenko-sergei

Currently the flow system generates the following important process state lifecycle events via the outbox:

#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum FlowProcessLifecycleMessage {
    FailureRegistered(FlowProcessFailureRegisteredMessage),
    EffectiveStateChanged(FlowProcessEffectiveStateChangedMessage),
    TriggerAutoStopped(FlowProcessTriggerAutoStoppedMessage),
}

In addition to e-mail notifications, we should support these events as webhook subscriptions.
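To make the intended correspondence concrete, here is a minimal sketch of how the lifecycle variants could map onto the webhook event type names proposed in the requirements of this issue. The pairing of variants to names is an assumption inferred from the naming, and `FlowProcessLifecycleKind` and `webhook_event_type` are hypothetical helpers, not existing code; the real enum variants carry message payloads that are omitted here.

```rust
/// Hypothetical payload-free stand-in for `FlowProcessLifecycleMessage`.
/// The real variants wrap message structs, which are left out of this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FlowProcessLifecycleKind {
    FailureRegistered,
    EffectiveStateChanged,
    TriggerAutoStopped,
}

/// Returns the webhook event type name for a lifecycle event kind.
/// Assumption: each variant corresponds to exactly one of the three
/// event types proposed in this issue.
pub fn webhook_event_type(kind: FlowProcessLifecycleKind) -> &'static str {
    match kind {
        FlowProcessLifecycleKind::FailureRegistered => "FLOW.PROCESS.FAILED",
        FlowProcessLifecycleKind::EffectiveStateChanged => "FLOW.PROCESS.STATECHANGE",
        FlowProcessLifecycleKind::TriggerAutoStopped => "FLOW.PROCESS.AUTOSTOP",
    }
}
```

Because the `match` is exhaustive, adding a new lifecycle variant later would fail to compile until a webhook event type is chosen for it.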

Requirements:

  • creating a subscription at:
    - dataset level
    - account level (applies to all accessible datasets)
    - system level (admins only; applies to all datasets)
  • allow watching dataset flows only for now (nothing exotic like a webhook for a webhook delivery failure):
    - ingest / transform
    - compact / reset
  • new webhook event types: FLOW.PROCESS.FAILED, FLOW.PROCESS.STATECHANGE, FLOW.PROCESS.AUTOSTOP
    (we may start with just FLOW.PROCESS.STATECHANGE; it should be enough to build alerting)
  • UI to manage the subscriptions:
    - extend current dataset settings for webhooks
    - account-level
    - system-level for admins
  • add support for displaying these flows in the UI
  • propagate the related initiating flow information:
    - either extend the logic of flow sensors to dispatch failures, not just successes as they do now
    - or enrich and utilize flow_id from the outbox events, so that a proper activation cause can be built
  • prototype a Slack application that handles these new webhooks
  • connect the Slack application to the engineering-alerts channel in the Kamu workspace
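
The subscription levels and the flow-type restriction above could be sketched roughly as follows. This is only an illustration under stated assumptions: `SubscriptionScope`, `DatasetFlowType`, `FlowEventTarget` and `Subscription` are hypothetical names, identifiers are plain strings, and the set of accounts that can access a dataset is assumed to be resolved by the caller.

```rust
/// Hypothetical subscription scope mirroring the three requested levels.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SubscriptionScope {
    /// Watches a single dataset.
    Dataset { dataset_id: String },
    /// Watches every dataset accessible to the account.
    Account { account_id: String },
    /// Admin-only: watches all datasets.
    System,
}

/// Dataset flow types that may be watched for now.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DatasetFlowType {
    Ingest,
    Transform,
    Compact,
    Reset,
}

/// Hypothetical description of the flow a lifecycle event refers to.
#[derive(Debug, Clone)]
pub struct FlowEventTarget {
    pub dataset_id: String,
    /// Accounts with access to the dataset (assumed to be pre-resolved).
    pub accessible_by: Vec<String>,
    pub flow_type: DatasetFlowType,
}

/// Hypothetical subscription record.
#[derive(Debug, Clone)]
pub struct Subscription {
    pub scope: SubscriptionScope,
    /// Flow types to watch; an empty list means all of them (an assumption).
    pub flow_types: Vec<DatasetFlowType>,
}

impl Subscription {
    /// Returns true if an event for `target` should be delivered.
    pub fn matches(&self, target: &FlowEventTarget) -> bool {
        let scope_ok = match &self.scope {
            SubscriptionScope::Dataset { dataset_id } => *dataset_id == target.dataset_id,
            SubscriptionScope::Account { account_id } => {
                target.accessible_by.iter().any(|a| a == account_id)
            }
            SubscriptionScope::System => true,
        };
        let type_ok =
            self.flow_types.is_empty() || self.flow_types.contains(&target.flow_type);
        scope_ok && type_ok
    }
}
```

Keeping the scope as an enum, rather than as optional dataset and account fields, means a subscription can never be at two levels at once.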
