CHASM: Non-Workflow Mutable State P1 #7595
if ms.IsWorkflowExecutionRunning() {
// Do NOT use ms.IsWorkflowExecutionRunning() for the check.
// Zombie workflow is not considered running but also not closed.
if ms.executionState.State != enumsspb.WORKFLOW_EXECUTION_STATE_COMPLETED {
I need to double-check all the callers. This currently breaks ndc workflow, which may call it while the workflow is in the zombie state.
Guaranteeing that callers never call this when the workflow is a zombie would require a big refactoring. For now, I special-cased the zombie state and return LastWriteVersion instead.
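A minimal sketch of the special case described above, written against a simplified model rather than the real mutable state type. All names here (`state`, `mutableState`, `versionForClose`) are hypothetical and chosen for illustration only; the actual method touched by this PR and its error handling may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// state is a hypothetical stand-in for the workflow execution state enum.
type state int

const (
	stateCreated state = iota
	stateRunning
	stateCompleted
	stateZombie
)

// mutableState is a simplified, hypothetical model, not the real type.
type mutableState struct {
	state            state
	closeVersion     int64
	lastWriteVersion int64
}

// isRunning mirrors the point made in the diff comment: a zombie workflow
// is not considered running, but it is not closed either.
func (ms mutableState) isRunning() bool {
	return ms.state == stateCreated || ms.state == stateRunning
}

// versionForClose is a hypothetical helper. It checks the state directly
// instead of relying on isRunning, and falls back to the last write
// version for zombies, as described in the review reply.
func (ms mutableState) versionForClose() (int64, error) {
	switch ms.state {
	case stateCompleted:
		return ms.closeVersion, nil
	case stateZombie:
		return ms.lastWriteVersion, nil
	default:
		return 0, errors.New("workflow is still running")
	}
}

func main() {
	zombie := mutableState{state: stateZombie, lastWriteVersion: 7}
	version, err := zombie.versionForClose()
	fmt.Println(zombie.isRunning(), version, err)
}
```

The point of the sketch is that a check based only on `isRunning` would treat a zombie as closed, while a check based only on the completed state would reject it; the explicit zombie branch avoids both.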
transactionPolicy historyi.TransactionPolicy,
workflowEventsSeq []*persistence.WorkflowEvents,
) {
if transactionPolicy != historyi.TransactionPolicyActive {
Are we replicating the vector clock from active to passive?
Yes, the LastRunningClock (formerly LastEventTaskID) is part of executionInfo and won't be sanitized during replication. We only check the clock when the failover versions are the same, so we know the clocks come from the same cluster and are comparable.
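The comparison rule in this reply can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: `versionedClock` and `compareClocks` are hypothetical names, and the clock is modeled as a plain integer.

```go
package main

import "fmt"

// versionedClock is a hypothetical pairing of a failover version with a
// clock value taken from the cluster that owned that version.
type versionedClock struct {
	failoverVersion int64
	clock           int64
}

// compareClocks returns -1, 0, or 1 when the two clocks are comparable.
// Clocks are only comparable when the failover versions match, because
// only then are both values known to come from the same cluster.
func compareClocks(a, b versionedClock) (result int, comparable bool) {
	if a.failoverVersion != b.failoverVersion {
		return 0, false
	}
	switch {
	case a.clock < b.clock:
		return -1, true
	case a.clock > b.clock:
		return 1, true
	default:
		return 0, true
	}
}

func main() {
	local := versionedClock{failoverVersion: 2, clock: 100}
	incoming := versionedClock{failoverVersion: 2, clock: 150}
	result, ok := compareClocks(local, incoming)
	fmt.Println(result, ok)
}
```

When the failover versions differ, the caller has to decide based on the versions alone; the sketch deliberately reports "not comparable" rather than guessing an order.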
if ms.executionInfo.VersionHistories != nil {
	return ms.currentVersion
}

if ms.transitionHistoryEnabled && len(ms.executionInfo.TransitionHistory) != 0 {
Should we check this first, before checking VersionHistories?
I think it's not necessary. I guess those checks exist only because we want to support very old workflows in the DB that were created before VersionHistory and xdc replication existed.
So for almost all workflows, VersionHistories will not be nil and the logic will just return ms.currentVersion.
I added the new logic just to be consistent with what we already have, but TBH I don't really expect execution to ever reach it...
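To make the ordering argument concrete, here is a rough sketch of the check order from the diff, using a simplified model. The types, the `emptyVersion` sentinel, and in particular the value returned by the transition history branch are assumptions for illustration; the diff shown here does not include the body of that branch.

```go
package main

import "fmt"

// transition is a hypothetical stand-in for one TransitionHistory entry.
type transition struct {
	failoverVersion int64
}

// mutableState is a simplified, hypothetical model, not the real type.
type mutableState struct {
	versionHistories         []int64 // nil only for very old workflows; element type is illustrative
	transitionHistory        []transition
	transitionHistoryEnabled bool
	currentVersion           int64
}

// emptyVersion is a hypothetical sentinel for "no version known".
const emptyVersion int64 = 0

// version follows the order in the diff: VersionHistories first, then
// TransitionHistory.
func (ms mutableState) version() int64 {
	if ms.versionHistories != nil {
		return ms.currentVersion
	}
	if ms.transitionHistoryEnabled && len(ms.transitionHistory) != 0 {
		// Assumption: use the failover version of the last transition.
		last := ms.transitionHistory[len(ms.transitionHistory)-1]
		return last.failoverVersion
	}
	return emptyVersion
}

func main() {
	typical := mutableState{
		versionHistories:         []int64{1},
		transitionHistory:        []transition{{failoverVersion: 5}},
		transitionHistoryEnabled: true,
		currentVersion:           3,
	}
	fmt.Println(typical.version())
}
```

In this model, any workflow with non-nil version histories returns at the first check, which is the reply's reason for expecting the later branches to be reached rarely, if ever.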
What changed?
Changes are mainly on replication side:
Why?
How did you test it?
Potential risks
Documentation
Is hotfix candidate?