Flows: additional emails for platform operator's UX #318

@zaychenko-sergei

Currently, whenever a flow fails, the API server issues a simple email about it.

This is better than nothing, but repeated emails for the same failing flow can cause alert fatigue.

Some of the better ideas are:

  • report only the first failure or a critical failure (when the flow process status changes)
  • email when processing is auto-stopped
  • send a daily summary email (statistics of executed flows, with useful hyperlinks).

The outbox events that drive the emails have the following structure, which is rich enough to implement smarter behaviors. In particular, new_consecutive_failures counts consecutive failures including the current one, so it can be used to filter out secondary failures.

#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum FlowProcessLifecycleMessage {
    FailureRegistered(FlowProcessFailureRegisteredMessage),
    EffectiveStateChanged(FlowProcessEffectiveStateChangedMessage),
    TriggerAutoStopped(FlowProcessTriggerAutoStoppedMessage),
}


#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct FlowProcessFailureRegisteredMessage {
    /// The time at which the event was recorded
    pub event_time: DateTime<Utc>,

    /// The binding of the flow process to which the flow belongs
    pub flow_binding: FlowBinding,

    /// The unique identifier of the flow
    pub flow_id: FlowID,

    /// The associated error outcome
    pub error: ts::TaskError,

    /// Number of consecutive failures for the flow process, including this
    /// failure
    pub new_consecutive_failures: u32,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct FlowProcessEffectiveStateChangedMessage {
    /// The time at which the event was recorded
    pub event_time: DateTime<Utc>,

    /// The binding of the flow process to which the flow belongs
    pub flow_binding: FlowBinding,

    /// The previous effective state
    pub old_effective_state: FlowProcessEffectiveState,

    /// The new effective state
    pub new_effective_state: FlowProcessEffectiveState,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct FlowProcessTriggerAutoStoppedMessage {
    /// The time at which the event was recorded
    pub event_time: DateTime<Utc>,

    /// The binding of the flow process to which the flow belongs
    pub flow_binding: FlowBinding,

    /// The reason for auto-stopping
    pub reason: FlowProcessAutoStopReason,
}
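
As an illustration of the "first failure only" and "auto-stopped" ideas, here is a minimal sketch of a notification policy over these messages. It is not the actual implementation: LifecycleMessage, EmailDecision, and decide_email are hypothetical stand-ins that keep only the fields the decision needs, and state-change handling is left out because the variants of FlowProcessEffectiveState are not shown here.

```rust
// Simplified stand-in for FlowProcessLifecycleMessage (hypothetical).
// The real messages also carry event_time, flow_binding, flow_id, error, etc.
#[derive(Debug, Clone)]
enum LifecycleMessage {
    FailureRegistered { new_consecutive_failures: u32 },
    EffectiveStateChanged,
    TriggerAutoStopped,
}

// Hypothetical outcome of the policy: which email to send, if any.
#[derive(Debug, PartialEq, Eq)]
enum EmailDecision {
    SendFailureEmail,
    SendAutoStopEmail,
    Skip,
}

// Decide whether a lifecycle message should produce an immediate email.
fn decide_email(message: &LifecycleMessage) -> EmailDecision {
    match message {
        // The counter includes the current failure, so 1 means this is the
        // first failure in a row; later ones are treated as secondary.
        LifecycleMessage::FailureRegistered {
            new_consecutive_failures,
        } => {
            if *new_consecutive_failures == 1 {
                EmailDecision::SendFailureEmail
            } else {
                EmailDecision::Skip
            }
        }
        // Auto-stop is always worth an email: processing has halted.
        LifecycleMessage::TriggerAutoStopped => EmailDecision::SendAutoStopEmail,
        // Assumption: state changes are not emailed in this sketch; a real
        // policy might report transitions into a critical state.
        LifecycleMessage::EffectiveStateChanged => EmailDecision::Skip,
    }
}
```

Under this policy a flow that fails five times in a row produces one failure email instead of five, plus one more if its trigger gets auto-stopped.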


Labels: enhancement (New feature or request)
