Skip to content

Add application err category #1925

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open

Conversation

THardy98
Copy link

What was changed

Upgrade api-go to 1.48.0
Added support for ApplicationErrorCategory, allowing users to specify benign application errors.

Why?

Part of the benign exceptions work.

  1. Closes Apply application failure logging and metrics behaviour according to ApplicationErrorCategory #1908

  2. How was this tested:
    Modify existing failure conversion tests.

  3. Any docs updates needed?
    Not sure

@THardy98 THardy98 requested a review from a team as a code owner April 22, 2025 00:47
@THardy98 THardy98 force-pushed the add_application_err_category branch from dd61c43 to 8700e2a Compare April 22, 2025 00:50
@THardy98 THardy98 force-pushed the add_application_err_category branch 6 times, most recently from 9522536 to 49536fa Compare April 22, 2025 03:12
@THardy98 THardy98 force-pushed the add_application_err_category branch from 49536fa to 388c9ba Compare April 22, 2025 03:58
@@ -137,6 +140,7 @@ type (
//
// NOTE: This option is supported by Temporal Server >= v1.24.2 older version will ignore this value.
NextRetryDelay time.Duration
Category ApplicationErrorCategory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs docs

"Benign failure",
"",
temporal.ApplicationErrorOptions{
Category: internal.ErrorCategoryBenign,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test should not refer to internal types, you need to export ErrorCategoryBenign

@@ -137,6 +140,7 @@ type (
//
// NOTE: This option is supported by Temporal Server >= v1.24.2 older version will ignore this value.
NextRetryDelay time.Duration
Category ApplicationErrorCategory
Copy link
Contributor

@Quinn-With-Two-Ns Quinn-With-Two-Ns Apr 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the release status of temporalio/features#614? Is it GA?

incrementWorkflowTaskFailureCounter(metricsHandler, failureReason)
completedRequest = failWorkflowTask
wtp.logger.Warn("Failed to process workflow task.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the goal of this change?

@@ -119,6 +119,9 @@ Workflow consumers will get an instance of *WorkflowExecutionError. This error w
*/

type (
// Category of the error. Maps to logging/metrics behaviours.
ApplicationErrorCategory string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to be exposed outside of internal

@@ -119,6 +119,9 @@ Workflow consumers will get an instance of *WorkflowExecutionError. This error w
*/

type (
// Category of the error. Maps to logging/metrics behaviours.
ApplicationErrorCategory string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrmm, I wonder if we really want this to be a string vs just a number (based on the proto number) and can have a String() on it. A string may make users think it can accept any string literal. Also other enumerates we traditionally do the iota approach. And if/when you do change to integer here, make sure there is an unspecified form representing 0.

@@ -380,6 +385,11 @@ var (
ErrMissingWorkflowID = errors.New("workflow ID is unset for Nexus operation")
)

const (
// ErrorCategoryBenign indicates an error that is expected under normal operation and should not trigger alerts.
ErrorCategoryBenign ApplicationErrorCategory = "BENIGN"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ErrorCategoryBenign ApplicationErrorCategory = "BENIGN"
ApplicationErrorCategoryBenign ApplicationErrorCategory = "BENIGN"

Need to qualify the const IMO

case "":
// Zero value maps to unspecified
return enumspb.APPLICATION_ERROR_CATEGORY_UNSPECIFIED
default:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If/when we change to integer enums we can/should just use the enum given to us in both directions since both Go and proto allow "open-ended" enums.

}
}

func IsBenignApplicationError(err error) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func IsBenignApplicationError(err error) bool {
func isBenignApplicationError(err error) bool {

No need to expose I assume unless it's a helper we really need to expose, but I don't believe we do. A user can easily check for any value on application error without a helper.

ts.Error(err)
var appErr *temporal.ApplicationError
ts.True(errors.As(err, &appErr))
ts.False(internal.IsBenignApplicationError(err))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try to avoid using internal in integration tests. Users can't use it.

@@ -474,19 +474,21 @@ func (wtp *workflowTaskPoller) RespondTaskCompletedWithMetrics(
) (response *workflowservice.RespondWorkflowTaskCompletedResponse, err error) {
metricsHandler := wtp.metricsHandler.WithTags(metrics.WorkflowTags(task.WorkflowType.GetName()))
if taskErr != nil {
wtp.logger.Warn("Failed to process workflow task.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are any of the changes made inside of this method needed?

@@ -119,6 +119,9 @@ Workflow consumers will get an instance of *WorkflowExecutionError. This error w
*/

type (
// Category of the error. Maps to logging/metrics behaviours.
ApplicationErrorCategory string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is ApplicationErrorCategory intended to be an enum?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Apply application failure logging and metrics behaviour according to ApplicationErrorCategory
3 participants