Special Handling for new workflow replication #7561

Merged
merged 9 commits into from
Apr 11, 2025

@xwduan xwduan commented Apr 3, 2025

What changed?

Add special handling for new workflow replication

Why?

To reduce the number of mutable state load attempts on the passive side that fail with a not-found error.

How did you test it?

unit test

Potential risks

no risk

Documentation

n/a

Is hotfix candidate?

no

}
}

wfCtx, releaseFn, err := r.workflowCache.GetOrCreateWorkflowExecution(
Contributor Author

Could use current workflow lock.

Contributor Author

GetOrCreateWorkflowExecution does not return the workflow context, which is a necessary parameter for NDC createWorkflow.

Member

I see. Looks like the normal start-workflow path (which manually creates the workflowContext object) and the replication start-workflow path follow different approaches.


if mutation != nil && mutation.ExclusiveStartVersionedTransition.TransitionCount == 0 {
	// this is the first replication task for this workflow
	// TODO: Handle reset case to reduce the amount of history events write
Member

cc @meiliang86 FYI
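
A minimal sketch of the first-task check quoted above, using simplified stand-in types rather than the real persistencespb and replicationspb messages. The type names, the helper isFirstReplicationTask, and the extra nil guard on ExclusiveStartVersionedTransition are assumptions for illustration and are not taken from the PR.

```go
package main

import "fmt"

// VersionedTransition is a simplified, hypothetical stand-in for
// persistencespb.VersionedTransition.
type VersionedTransition struct {
	NamespaceFailoverVersion int64
	TransitionCount          int64
}

// StateMutation is a simplified, hypothetical stand-in for the mutation
// carried by a sync-workflow-state replication task.
type StateMutation struct {
	ExclusiveStartVersionedTransition *VersionedTransition
}

// isFirstReplicationTask reports whether the mutation starts from transition
// count zero, which the quoted code treats as the first replication task for
// the workflow. The nil guard on the start transition is an added assumption.
func isFirstReplicationTask(m *StateMutation) bool {
	return m != nil &&
		m.ExclusiveStartVersionedTransition != nil &&
		m.ExclusiveStartVersionedTransition.TransitionCount == 0
}

func main() {
	first := &StateMutation{
		ExclusiveStartVersionedTransition: &VersionedTransition{NamespaceFailoverVersion: 1, TransitionCount: 0},
	}
	later := &StateMutation{
		ExclusiveStartVersionedTransition: &VersionedTransition{NamespaceFailoverVersion: 1, TransitionCount: 7},
	}
	fmt.Println(isFirstReplicationTask(first)) // true
	fmt.Println(isFirstReplicationTask(later)) // false
	fmt.Println(isFirstReplicationTask(nil))   // false
}
```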

err = r.taskRefresher.Refresh(ctx, localMutableState)

if err != nil {
	println(err.Error())
Member

nit: remove

//nolint:revive // cognitive complexity 35 (> max enabled 25)
func (r *WorkflowStateReplicatorImpl) handleFirstReplicationTask(
	ctx context.Context,
	versionedTransition *replicationspb.VersionedTransitionArtifact,
Member

nit: artifact maybe?

return err
}

localMutableState.SetHistoryBuilder(historybuilder.NewImmutable(historyEventBatchs...))
Member

will this update the HistorySize in ExecutionStats?

Contributor Author

Yes, it is handled here:

newSnapshot.ExecutionInfo.ExecutionStats.HistorySize += int64(newHistoryDiff.SizeDiff)
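
The accumulation in that line can be sketched in isolation as follows. ExecutionStats and HistoryDiff here are simplified stand-ins, applyHistoryDiff is a hypothetical helper, and the int type of SizeDiff is an assumption inferred from the int64 conversion in the quoted line.

```go
package main

import "fmt"

// ExecutionStats is a simplified, hypothetical stand-in for the stats kept
// on the workflow's execution info.
type ExecutionStats struct {
	HistorySize int64
}

// HistoryDiff is a simplified, hypothetical stand-in for the size change
// produced by one batch of newly written history events.
type HistoryDiff struct {
	SizeDiff int // assumed to be int, hence the int64 conversion below
}

// applyHistoryDiff adds the size of the new history batch to the running
// total, mirroring the += in the quoted line.
func applyHistoryDiff(stats *ExecutionStats, diff HistoryDiff) {
	stats.HistorySize += int64(diff.SizeDiff)
}

func main() {
	stats := &ExecutionStats{}
	applyHistoryDiff(stats, HistoryDiff{SizeDiff: 128})
	applyHistoryDiff(stats, HistoryDiff{SizeDiff: 64})
	fmt.Println(stats.HistorySize) // 192
}
```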

SyncWorkflowStateMutationAttributes: &replicationspb.SyncWorkflowStateMutationAttributes{
	StateMutation: mutation,
	ExclusiveStartVersionedTransition: &persistencespb.VersionedTransition{
		NamespaceFailoverVersion: taskVersionedTransition.NamespaceFailoverVersion,
Member

So this is to make sure the workflow.TransitionHistoryStalenessCheck(localTransitionHistory, mutation.ExclusiveStartVersionedTransition) != nil check passes when applying the mutation?

Contributor Author

Yes, so we don't need special handling for that.

@xwduan xwduan marked this pull request as ready for review April 11, 2025 18:05
@xwduan xwduan requested a review from a team as a code owner April 11, 2025 18:05
@xwduan xwduan enabled auto-merge (squash) April 11, 2025 18:05
@xwduan xwduan merged commit 722e086 into main Apr 11, 2025
50 checks passed
@xwduan xwduan deleted the will/fix_state_based_new_workflow_replication branch April 11, 2025 18:25