Endpoint to get latest step instance views across runs for a given workflow instance #204
Merged
anjujha merged 1 commit into Netflix:main on Apr 9, 2026
…workflow instance
Adds GET /{workflowId}/instances/{workflowInstanceId}/steps which returns the most recent step attempt per step across all runs, useful for workflows restarted from failure where steps ran in different runs.
akashdw
reviewed
Apr 9, 2026
INNER_RANK_QUERY_ALL_FIELD_WITH
    + ", ROW_NUMBER() OVER (PARTITION BY step_id ORDER BY workflow_run_id DESC, step_attempt_id DESC) AS rank"
    + GET_STEP_FIELD_QUERY_FROM
    + ") SELECT * FROM inner_ranked WHERE rank=1";
Collaborator
If you have any benchmarks or EXPLAIN / query plan results, could you share those as well?
Collaborator
Author
I have pasted the query plan below:
QUERY PLAN
Subquery Scan on inner_ranked (cost=56.59..57.05 rows=1 width=1707) (actual time=1.327..1.486 rows=211 loops=1)
Filter: (inner_ranked.rank = 1)
-> WindowAgg (cost=56.59..56.88 rows=13 width=1707) (actual time=1.326..1.471 rows=211 loops=1)
Run Condition: (row_number() OVER (?) <= 1)
-> Sort (cost=56.59..56.62 rows=13 width=1699) (actual time=1.320..1.329 rows=211 loops=1)
Sort Key: maestro_step_instance.step_id COLLATE "C", maestro_step_instance.workflow_run_id DESC, maestro_step_instance.step_attempt_id DESC
Sort Method: quicksort Memory: 410kB
-> Index Scan using maestro_step_instance_pkey on maestro_step_instance (cost=0.42..56.35 rows=13 width=1699) (actual time=0.028..0.254 rows=211 loops=1)
Index Cond: ((workflow_id = '<redacted>_large_demo'::text) AND (workflow_instance_id = 1))
Planning Time: 0.346 ms
Execution Time: 1.567 ms
(11 rows)
The query always hits the primary key index, with (workflow_id, workflow_instance_id) as the index condition.
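For readers following the thread, here is a minimal in-memory sketch of what the ranking query computes: within each step_id, order by workflow_run_id DESC and then step_attempt_id DESC, and keep only the first row. The StepRow record, the class name, and the sample data are hypothetical and only mirror the columns named in the query; this is not the DAO code from the PR, and it assumes Java 16+ for records.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LatestStepAttempts {
  // Hypothetical row type; fields mirror the columns used by the ranking query.
  record StepRow(String stepId, long workflowRunId, long stepAttemptId) {}

  // Same ordering as the window clause:
  // PARTITION BY step_id ORDER BY workflow_run_id DESC, step_attempt_id DESC.
  static final Comparator<StepRow> RANK_ORDER =
      Comparator.comparing(StepRow::stepId)
          .thenComparing(Comparator.comparingLong(StepRow::workflowRunId).reversed())
          .thenComparing(Comparator.comparingLong(StepRow::stepAttemptId).reversed());

  // Keeps the first (rank = 1) row of each step_id partition.
  static List<StepRow> latestPerStep(List<StepRow> rows) {
    List<StepRow> sorted = new ArrayList<>(rows);
    sorted.sort(RANK_ORDER);
    Map<String, StepRow> firstPerStep = new LinkedHashMap<>();
    for (StepRow row : sorted) {
      // Only the first row seen for a step survives, i.e. the latest run and attempt.
      firstPerStep.putIfAbsent(row.stepId(), row);
    }
    return new ArrayList<>(firstPerStep.values());
  }

  public static void main(String[] args) {
    // Illustrative data: step-c failed in run 1 (two attempts) and was re-run in run 2.
    List<StepRow> rows = List.of(
        new StepRow("step-a", 1, 1),
        new StepRow("step-b", 1, 1),
        new StepRow("step-c", 1, 1),
        new StepRow("step-c", 1, 2),
        new StepRow("step-c", 2, 1));
    latestPerStep(rows).forEach(System.out::println);
  }
}
```

With this sample data, step-a and step-b are taken from run 1 and step-c from run 2. The database version does the same work with one sort over the rows matched by the index condition, which is what the Sort and WindowAgg nodes in the plan reflect.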
akashdw
reviewed
Apr 9, 2026
value = "/api/v3/workflows",
produces = MediaType.APPLICATION_JSON_VALUE,
consumes = MediaType.APPLICATION_JSON_VALUE)
@SuppressWarnings("PMD.AvoidDuplicateLiterals")
Collaborator
Can you clarify why this is needed?
Collaborator
Author
I added this because, without it, we would have to introduce constants for 'workflowId' and 'workflowInstanceId' on lines 98 and 99 below:
@Valid @NotNull @PathVariable("workflowId") String workflowId
This file already had a few such strings, but my new endpoint pushed it over the PMD threshold.
A similar pattern is used in other controllers.
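For comparison, the constant-based alternative that the suppression avoids could look roughly like the sketch below. PathVar is a hypothetical stand-in for Spring's @PathVariable so that the snippet compiles without Spring on the classpath; the class, method, and constant names are also illustrative and are not taken from the PR.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

final class PathVariableConstantsSketch {
  // Hypothetical stand-in for Spring's @PathVariable (assumption, not the real annotation).
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.PARAMETER)
  @interface PathVar {
    String value();
  }

  // Constants that would replace the repeated string literals flagged by PMD.
  static final String WORKFLOW_ID = "workflowId";
  static final String WORKFLOW_INSTANCE_ID = "workflowInstanceId";

  // Illustrative handler: annotation values must be compile-time constants,
  // which static final String fields satisfy.
  static String getStepViews(
      @PathVar(WORKFLOW_ID) String workflowId,
      @PathVar(WORKFLOW_INSTANCE_ID) long workflowInstanceId) {
    return workflowId + "/instances/" + workflowInstanceId + "/steps";
  }

  // Reads the annotation value back via reflection to show the constant is what gets bound.
  static String pathVariableName(int parameterIndex) {
    try {
      Method method =
          PathVariableConstantsSketch.class.getDeclaredMethod(
              "getStepViews", String.class, long.class);
      return method.getParameters()[parameterIndex].getAnnotation(PathVar.class).value();
    } catch (NoSuchMethodException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(pathVariableName(0));
    System.out.println(pathVariableName(1));
  }
}
```

The trade-off is mostly readability: constants satisfy the PMD rule, but the literal form keeps each path variable name visible at the parameter it binds, which matches the existing controllers.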
akashdw
approved these changes
Apr 9, 2026
Pull Request type
./gradlew build --write-locks to refresh dependencies)
NOTE: Please remember to run ./gradlew spotlessApply to fix any format violations.
Changes in this PR
Add endpoint to get latest step instance views across all runs for a workflow instance
Adds GET /{workflowId}/instances/{workflowInstanceId}/steps, which returns the most recent step attempt per step across all runs, useful for workflows restarted from failure where steps ran in different runs.
Why is this needed
This endpoint makes it easy to get a snapshot of the current state of all steps in a workflow instance, regardless of how many times it has been restarted.
Currently, to understand the final state of each step you need to know the latest run ID and call the run-specific /runs/{runId}/steps endpoint. But for restarted workflows, different steps may have completed in different runs: some steps succeed early and are skipped in subsequent runs.
The GET /{workflowId}/instances/{workflowInstanceId}/steps endpoint added in this PR solves this by querying across all runs and returning the most recent attempt per step, giving a complete and accurate view of the workflow instance's step states in a single call.
Example
Say you have a workflow with 3 steps: step-a, step-b, step-c.
Run 1 (all 3 steps ran, but step-c failed):
Run 2 (restarted from failure, only step-c ran again):
GET /instances/1/runs/2/steps (existing; incomplete, only shows steps from run 2):
GET /instances/1/steps (new; complete picture, latest attempt per step across all runs):
Testing
DAO (MaestroStepInstanceDaoTest): testGetAllStepInstanceViews simulates a restart-from-failure scenario: run 1 has two steps (job1, job2), run 2 only re-ran job1. Verifies that job1 is returned from run 2 and job2 from run 1, i.e. the most recent attempt per step is correctly selected across runs.
Controller (StepInstanceControllerTest): testGetAllStepInstanceViews verifies the DAO is called with correct arguments and the result is sorted by stepInstanceId.
Locally: