Skip to content

Log duplicated activity events #6813

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

fimanishi
Copy link
Member

What changed?
Added logs to capture duplicated activity events

Why?
There were some cases where started and completed activity events were duplicated, causing the workflows to fail. Couldn't reproduce it locally, so adding logs to try to identify it earlier and debug it.

How did you test it?
Tested locally

Potential risks

Release notes

Documentation Changes

@@ -2253,6 +2261,65 @@ func (e *mutableStateBuilder) logDataInconsistency() {
tag.WorkflowRunID(runID),
)
}
func (e *mutableStateBuilder) logDuplicatedActivityEvents() {
type activityTaskUniqueEventParams struct {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't compare events directly. Using this struct to dereference attributes that can guarantee the activity uniqueness

Copy link
Member

@davidporter-id-au davidporter-id-au Apr 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this will work, equality in maps is being compared by pointer, not by value ( I think, let's confirm I'm not lying). Let's start with a unit test to validate this is working as expected. I think this is will probably not catch duplicates in its current state, unless you squash it into something that's more comparable like a string.

Ensuring that log is called is a bit annoying, so changing the check to just return a bool is one easy solution. eg:


  isDuplicated := e.checkDuplicatedActivityEvents()
  if isDuplicated{
    e.logger.Error("Duplicate activity task event found",
				tag.WorkflowDomainName(e.GetDomainEntry().GetInfo().Name),
				tag.WorkflowID(e.GetExecutionInfo().WorkflowID),
				tag.WorkflowRunID(e.GetExecutionInfo().RunID),
				tag.WorkflowScheduleID(scheduledEventID),
				tag.WorkflowEventType(event.GetEventType().String())
    )

  }
  })

func (e *mutableStateBuilder) checkDuplicatedActivityEvents() bool {
   ... // check if the fields are duplicated 
}

For the duplicated check, my vague memory was that you were looking to check if the scheduledEventID was duplicated? That's just a primitive type, I think that's possible to just do a map check on? I would start there to begin with, though if you want to check other fields, there's a few solutions, but i'm not entirely sure what it means for the attempt to be duplicated, for example.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually seems I am a bit wrong and clearly don't understand map keys well enough

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, you're correct and I'm wrong (https://go.dev/ref/spec#Comparison_operators), (demonstrated here)

Struct types are comparable if all their field types are comparable. Two struct values are equal if their corresponding non-blank field values are equal. The fields are compared in source order, and comparison stops as soon as two field values differ (or all fields have been compared).

So I take back what I said, but would agree with Tim's point about a unit test, both to check and to also catch NPE problems when dereferencing

@@ -2253,6 +2261,65 @@ func (e *mutableStateBuilder) logDataInconsistency() {
tag.WorkflowRunID(runID),
)
}
func (e *mutableStateBuilder) logDuplicatedActivityEvents() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could we have some unit test on those cases?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'll add that to show how it works

@fimanishi fimanishi merged commit a155e42 into cadence-workflow:master Apr 14, 2025
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants