-
Notifications
You must be signed in to change notification settings - Fork 1.8k
[CRE-406] (fix): WF Registry Syncer update #17608
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -444,12 +445,6 @@ func (h *eventHandler) createWorkflowSpec(ctx context.Context, payload WorkflowR | |||
return nil, err | |||
} | |||
|
|||
// Calculate the hash of the binary and config files |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to tryEngineCreate
(line 702)
@@ -459,11 +454,6 @@ func (h *eventHandler) createWorkflowSpec(ctx context.Context, payload WorkflowR | |||
} | |||
} | |||
|
|||
// Pre-check: verify that the workflowID matches; if it doesn't abort and log an error via Beholder. | |||
if !types.WorkflowID(hash).Equal(payload.WorkflowID) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to tryEngineCreate
(line 710)
@@ -234,13 +234,26 @@ func (h *Store) GetSecrets(ctx context.Context, secretsURL string, workflowID [3 | |||
return nil, fmt.Errorf("failed to fetch secrets from %s : %w", secretsURL, fetchErr) | |||
} | |||
|
|||
// sanity check by decoding the secrets |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved check to separate method below ValidateSecrets
. This is called in handler.go
func tryEngineCreate
(line 715)
Flakeguard SummaryRan new or updated tests between View Flaky Detector Details | Compare Changes Found Flaky Tests ❌1 Results
ArtifactsFor detailed logs of the failed tests, please refer to the artifact failed-test-results-with-logs.json. |
…reating DB entry to avoid refetching
…ion test framework
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handler code LGTM!
if err := h.tryEngineCleanup(payload.WorkflowOwner, payload.WorkflowName); err != nil { | ||
return err | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i guess this is the fix to the bug. we were not cleaning up first, but updating the workflow store, hence cleaning up the old engine instance later is impossible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
@@ -147,12 +169,16 @@ func (n ComputeFetcherFactory) NewFetcher(log commonlogger.Logger, emitter custm | |||
} | |||
} | |||
|
|||
func setupDons(ctx context.Context, t *testing.T, lggr logger.SugaredLogger, workflowURL string, cronSchedule string, | |||
triggerFactory framework.TriggerFactory) *data_feeds_cache.DataFeedsCache { | |||
func setupDons(ctx context.Context, t *testing.T, lggr logger.SugaredLogger, workflowURL string, cronSchedule string, cronSchedule2 string, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit this signature is a bit opaque. consider returning a named interface of hooks that happen at during execution points (or name the return values)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair. I'm going to punt on a refactor like that for now, as this is a hot fix that needs to go out.
@@ -68,7 +68,7 @@ func Test_FetchTrueUSDWorkflow(t *testing.T) { | |||
framework.CreateWasmBinary(t, mainFile, wasmFile) | |||
|
|||
triggerSink := framework.NewTriggerSink(t, "cron-trigger", "1.0.0") | |||
dataFeedsCacheContract := setupDons(ctx, t, lggr, wasmFile, "*/5 * * * * *", triggerSink) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so if I understand the existing POR test has been 'hijacked' to test workflow update, fine for the hot fix, but could this be followed up with a test to explicity test workflow update that is seperate from the POR test? (reusing where necessary)
Description
Fixes an issue with WF Registry Syncer updates: during updating a workflow, the old engine could never be shut down.
This happens because the logic was changed to no longer first clean up the previous engine before upserting the new DB record. During an engine shutting down, the config for the engine is retrieved. Because the DB entry is modified to the new workflow before shutting down the old one, the old engine's config couldn't be found anymore, leaving it stuck.
This fix restores removing the old engine before upserting the new DB record.
IMPACT:
This means that there will be a small downtime toward updating workflows (which is the previous behavior before the refactor). Zero downtime will be handled in another feature.
Also covers:
CAPPL-804
This issue didn't show in unit testing. Adds integration testing to protect against further regressions. The integration test now includes updating the workflow.
CAPPL-811
Moves validations to after the DB entry is saved. Currently, failure of these validations would lead to us repeatedly refetching the secrets, even though these errors would normally be unrecoverable.