This is useful for alerting and knowing that failures occurred, but:
It is aggregated
It does not indicate which Flow or Job failed
2. Jobs API
Example:
GET /api/w/{workspace}/jobs/list?script_path_start=...
This allows us to identify failed jobs and see flow/script paths and error messages.
However:
It is pull-based: we have to query the API continuously to detect failures.
Ideally, we would prefer a push-based mechanism, where Windmill actively notifies us when a Flow or Job fails (e.g. via webhook or event), instead of us polling the Jobs API to find out what errored.
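For context, the polling approach described above might look roughly like the sketch below. It is only an illustration under assumptions: the base URL, bearer-token authentication, and the `id`, `success`, and `script_path` fields on each returned job are assumed here and not confirmed against the Windmill API; the helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def fetch_jobs(base_url, workspace, token, script_path_start):
    """Query the jobs list endpoint once (base_url and token are assumptions)."""
    query = urllib.parse.urlencode({"script_path_start": script_path_start})
    url = f"{base_url}/api/w/{workspace}/jobs/list?{query}"
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def new_failures(jobs, seen_ids):
    """Return failed jobs not reported before, and remember their ids.

    Assumes each job is a dict with "id" and a boolean "success" field.
    """
    fresh = []
    for job in jobs:
        job_id = job.get("id")
        if job.get("success") is False and job_id not in seen_ids:
            seen_ids.add(job_id)
            fresh.append(job)
    return fresh


def describe(job):
    """Build a short alert message naming the failed Flow or script path."""
    path = job.get("script_path", "<unknown path>")
    return f"{path} failed (job {job.get('id')})"
```

A scheduler would have to call `fetch_jobs` on an interval and pass the result through `new_failures` with a persistent `seen_ids` set, which is exactly the continuous querying we would like to avoid.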
3. Error handling inside Flows
Handling errors inside Flows (try/catch, error branches, notifications) is helpful for expected errors, but:
It does not cover crashes or worker-level failures
It requires per-flow implementation
Questions
What approach would you recommend for detecting failed Flows in a way that:
Works well with alerting
Identifies the exact Flow
Scales well for future Flows with minimal changes?
Hi
We’re looking for a recommended way to detect whether a specific Flow has failed, in a scalable and future-proof way.
What we’ve checked so far
1. Metrics
Example:
Thanks in advance 🙏