-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Add retries to inngest #11495
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add retries to inngest #11495
Conversation
…etries-to-inngest
…etries-to-inngest
🦋 Changeset detectedLatest commit: ec174dc The changes in this PR will be included in the next version bump. This PR includes changesets to release 18 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
WalkthroughThis PR adds support for per-step retries and scorers across all Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
🚨 Redirect Validation FailedThe redirect validation found issues in Action Required: Review and fix the redirect configuration. |
…/mastra into add-retries-to-inngest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
workflows/inngest/src/index.ts (1)
115-132: Useparams.idinstead ofparams.namefor agent step identifier.Line 123 uses
params.namefor the stepid, while the evented workflow and other agent/tool step definitions useparams.id. Since the Agent class initializesidto default tonameif not provided,params.idis the correct and consistent identifier to use.
🧹 Nitpick comments (5)
.changeset/add-inngest-step-retries-scorers.md (2)
6-8: Clarify what "scorers" do for developers unfamiliar with the feature.The term "AI evaluation support" is vague. Since "scorers" may not be familiar to all developers, briefly explain their purpose. Consider replacing "AI evaluation support" with "scorer-based step evaluation" or a sentence like "Scorers enable you to evaluate and rate step outputs."
Additionally, the description could be more concise: "parameters" is clearer than "fields" in this context.
🔎 Suggested revision
- The `createStep` function now includes support for the `retries` and `scorers` fields across all step creation patterns, enabling step-level retry configuration and AI evaluation support for regular steps, agent-based steps, and tool-based steps. + The `createStep` function now includes support for the `retries` and `scorers` parameters across all step creation patterns—regular, agent-based, and tool-based steps. Use `retries` to specify per-step retry attempts on failure, and `scorers` to attach evaluators that rate step outputs.
45-49: Consider emphasizing the practical benefit for developers.The closing paragraph confirms API consistency and backwards compatibility, which is good. You could strengthen the value proposition by briefly highlighting why this matters to developers (e.g., "enables fine-grained control over retries without changing workflow-level defaults").
🔎 Suggested revision
- This change ensures API consistency across all `createStep` overloads. All step types now support retry and evaluation configurations. + This change ensures API consistency across all `createStep` overloads. You can now override workflow-level retries and attach step-specific scorers without affecting other steps in your workflow.packages/core/src/workflows/workflow.test.ts (1)
10674-10676: Remove inline instructional comments.The inline comments on lines 10674 ("// Change to toMatchObject"), 10676 ("// REMOVE THIS LINE"), and 10681-10682 ("// ADD THIS SEPARATE ASSERTION...") appear to be development or review notes. Please remove these before merging to keep the codebase clean.
🔎 Proposed cleanup
expect(result.steps.step2).toMatchObject({ - // Change to toMatchObject status: 'failed', - // error: err?.stack ?? err, // REMOVE THIS LINE payload: { result: 'success' }, startedAt: expect.any(Number), endedAt: expect.any(Number), }); - // ADD THIS SEPARATE ASSERTION - error is now an Error instance expect((result.steps.step2 as any)?.error).toBeInstanceOf(Error); expect((result.steps.step2 as any)?.error.message).toMatch(/Step failed/);Also applies to: 10681-10682
workflows/inngest/src/execution-engine.ts (2)
87-132: Consider using Inngest's durable sleep for retry delays.The current implementation uses
setTimeoutfor retry delays (line 89), which is an in-memory delay. If the Inngest function is replayed or re-executed, this delay won't be durable and could result in immediate retries on function replay.Consider using
this.inngestStep.sleep()for durable delays that survive function restarts:for (let i = 0; i < params.retries + 1; i++) { if (i > 0 && params.delay) { - await new Promise(resolve => setTimeout(resolve, params.delay)); + await this.inngestStep.sleep(`${stepId}.retry.${i}`, params.delay); }This would make retry delays durable across Inngest function replays.
149-162: Update outdated comment.The comment on lines 151-152 still references
retryConfigandRetryAfterError, but these are no longer used since retry handling moved toexecuteStepWithRetry.🔎 Suggested fix
/** * Wrap durable operations in Inngest step.run() for durability. - * If retryConfig is provided, throws RetryAfterError INSIDE step.run() to trigger - * Inngest's step-level retry mechanism (not function-level retry). + * Retries are now handled externally by executeStepWithRetry. */ async wrapDurableOperation<T>(operationId: string, operationFn: () => Promise<T>): Promise<T> {
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
.changeset/add-inngest-step-retries-scorers.mdpackages/core/src/workflows/evented/evented-workflow.test.tspackages/core/src/workflows/evented/workflow-event-processor/index.tspackages/core/src/workflows/evented/workflow.tspackages/core/src/workflows/workflow.test.tspackages/core/src/workflows/workflow.tsworkflows/inngest/src/execution-engine.tsworkflows/inngest/src/index.test.tsworkflows/inngest/src/index.tsworkflows/inngest/src/workflow.ts
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Run
pnpm typecheckto validate TypeScript types across all packages
Files:
workflows/inngest/src/workflow.tspackages/core/src/workflows/evented/workflow-event-processor/index.tspackages/core/src/workflows/evented/workflow.tspackages/core/src/workflows/evented/evented-workflow.test.tsworkflows/inngest/src/index.test.tsworkflows/inngest/src/execution-engine.tsworkflows/inngest/src/index.tspackages/core/src/workflows/workflow.test.tspackages/core/src/workflows/workflow.ts
**/*.{ts,tsx,js,jsx,json,md}
📄 CodeRabbit inference engine (CLAUDE.md)
Run
pnpm prettier:formatto format code andpnpm formatto run linting with auto-fix across all packages
Files:
workflows/inngest/src/workflow.tspackages/core/src/workflows/evented/workflow-event-processor/index.tspackages/core/src/workflows/evented/workflow.tspackages/core/src/workflows/evented/evented-workflow.test.tsworkflows/inngest/src/index.test.tsworkflows/inngest/src/execution-engine.tsworkflows/inngest/src/index.tspackages/core/src/workflows/workflow.test.tspackages/core/src/workflows/workflow.ts
packages/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
packages/**/*.{ts,tsx}: All packages must use TypeScript with strict type checking enabled
Use telemetry decorators for observability across components
Files:
packages/core/src/workflows/evented/workflow-event-processor/index.tspackages/core/src/workflows/evented/workflow.tspackages/core/src/workflows/evented/evented-workflow.test.tspackages/core/src/workflows/workflow.test.tspackages/core/src/workflows/workflow.ts
.changeset/*.md
⚙️ CodeRabbit configuration file
.changeset/*.md: Changeset files are really important for keeping track of changes in the project. They'll be used to generate release notes and inform users about updates.Review the changeset file according to these guidelines:
- The target audience are developers
- Write short, direct sentences that anyone can understand. Avoid commit messages, technical jargon, and acronyms. Use action-oriented verbs (Added, Fixed, Improved, Deprecated, Removed)
- Avoid generic phrases like "Update code", "Miscellaneous improvements", or "Bug fixes"
- Highlight outcomes! What does change for the end user? Do not focus on internal implementation details
- Add context like links to issues or PRs when relevant
- If the change is a breaking change or is adding a new feature, ensure that a code example is provided. This code example should show the public API usage (the before and after). Do not show code examples of internal implementation details.
- Keep the formatting easy-to-read and scannable. If necessary, use bullet points or multiple paragraphs (Use bold text as the heading for these sections, do not use markdown headings).
- For larger, more substantial changes, also answer the "Why" behind the changes
- Each changeset file contains a YAML frontmatter at the top. It will be one or more package names followed by a colon and the type of change (patch, minor, major). Do not modify this frontmatter. Check that the description inside the changeset file only applies to the packages listed in the frontmatter. Do not allow descriptions that mention changes to packages not listed in the frontmatter. In these cases, the user must create a separate changeset file for those packages.
Files:
.changeset/add-inngest-step-retries-scorers.md
**/*.{test,spec}.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{test,spec}.{ts,tsx}: Use Vitest for testing framework in test files
Co-locate test files with source code using naming patterns like *.test.ts or *.spec.ts
Files:
packages/core/src/workflows/evented/evented-workflow.test.tsworkflows/inngest/src/index.test.tspackages/core/src/workflows/workflow.test.ts
🧠 Learnings (5)
📚 Learning: 2025-12-30T15:02:58.132Z
Learnt from: taofeeq-deru
Repo: mastra-ai/mastra PR: 11486
File: packages/core/src/loop/workflows/agentic-execution/llm-execution-step.ts:643-678
Timestamp: 2025-12-30T15:02:58.132Z
Learning: In `packages/core/src/loop/workflows/agentic-execution/llm-execution-step.ts`, the system guarantees that `metadata.suspendedTools` and `metadata.pendingToolApprovals` never contain resumed tools, so no filtering is needed when using metadata directly.
Applied to files:
packages/core/src/workflows/evented/workflow.tspackages/core/src/workflows/workflow.ts
📚 Learning: 2025-11-24T16:42:04.244Z
Learnt from: CR
Repo: mastra-ai/mastra PR: 0
File: packages/codemod/AGENTS.md:0-0
Timestamp: 2025-11-24T16:42:04.244Z
Learning: Applies to packages/codemod/src/test/**/*.test.ts : Include test cases for multiple occurrences of the transformation pattern, aliased imports, type imports, and mixed imports to verify the codemod works consistently
Applied to files:
packages/core/src/workflows/evented/evented-workflow.test.tsworkflows/inngest/src/index.test.tspackages/core/src/workflows/workflow.test.ts
📚 Learning: 2025-11-24T16:42:04.244Z
Learnt from: CR
Repo: mastra-ai/mastra PR: 0
File: packages/codemod/AGENTS.md:0-0
Timestamp: 2025-11-24T16:42:04.244Z
Learning: Applies to packages/codemod/src/test/__fixtures__/**/*.ts : In output fixtures, ensure all NEGATIVE test cases remain EXACTLY IDENTICAL to their input fixture counterparts to verify the codemod only transforms intended patterns
Applied to files:
packages/core/src/workflows/evented/evented-workflow.test.tsworkflows/inngest/src/index.test.ts
📚 Learning: 2025-12-18T21:52:52.667Z
Learnt from: taofeeq-deru
Repo: mastra-ai/mastra PR: 11276
File: packages/core/src/workflows/evented/evented-workflow.test.ts:940-1055
Timestamp: 2025-12-18T21:52:52.667Z
Learning: In packages/core Workflows tests, Run.stream returns a synchronous object (with fullStream and result), so awaiting run.stream() is not required in tests.
Applied to files:
packages/core/src/workflows/evented/evented-workflow.test.ts
📚 Learning: 2025-11-24T16:42:04.244Z
Learnt from: CR
Repo: mastra-ai/mastra PR: 0
File: packages/codemod/AGENTS.md:0-0
Timestamp: 2025-11-24T16:42:04.244Z
Learning: Applies to packages/codemod/src/test/__fixtures__/**/*.ts : In input fixtures, include both POSITIVE test cases (patterns that should transform) and NEGATIVE test cases (unrelated code with similar names/patterns that should NOT transform)
Applied to files:
workflows/inngest/src/index.test.ts
🧬 Code graph analysis (6)
packages/core/src/workflows/evented/workflow.ts (3)
packages/core/src/workflows/types.ts (2)
StepParams(466-491)ToolStep(493-503)packages/core/src/workflows/step.ts (1)
Step(74-101)packages/core/src/evals/base.ts (1)
MastraScorers(684-684)
packages/core/src/workflows/evented/evented-workflow.test.ts (1)
packages/core/src/workflows/evented/workflow.ts (2)
createStep(119-275)createWorkflow(277-308)
workflows/inngest/src/index.test.ts (2)
packages/core/src/workflows/evented/workflow.ts (1)
createStep(119-275)packages/core/src/workflows/workflow.ts (1)
createStep(184-831)
workflows/inngest/src/execution-engine.ts (2)
packages/core/src/error/index.ts (1)
getErrorFromUnknown(4-4)packages/core/src/error/utils.ts (1)
getErrorFromUnknown(45-122)
workflows/inngest/src/index.ts (3)
packages/core/src/evals/base.ts (1)
MastraScorers(684-684)packages/core/src/workflows/step.ts (1)
Step(74-101)packages/core/src/tools/tool.ts (1)
Tool(60-287)
packages/core/src/workflows/workflow.ts (3)
packages/core/src/evals/base.ts (1)
MastraScorers(684-684)packages/core/src/workflows/step.ts (1)
Step(74-101)packages/core/src/workflows/types.ts (1)
DefaultEngineType(374-374)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Validate build outputs
- GitHub Check: Lint
- GitHub Check: test
- GitHub Check: test (express)
- GitHub Check: test (hono)
- GitHub Check: test
- GitHub Check: test
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (17)
workflows/inngest/src/workflow.ts (1)
190-198: No issues found - architecture correctly implements Inngest step-level retry pattern.The code properly handles all error scenarios. Setting function-level
retries: 0and implementing retry logic at the step level is the correct approach for Inngest v3.44.3, which applies function-level retries per-step anyway. When the workflow fails, the code explicitly throwsNonRetriableError(workflow.ts line 265) to signal Inngest to stop retrying—this is the correct pattern and prevents both orchestration errors and final failures from being silently retried. TheexecuteStepWithRetryimplementation uses Inngest's standardRetryAfterErrorthrown insidestep.run()for step-level retries, and error propagation is properly handled.Likely an incorrect or invalid review comment.
packages/core/src/workflows/evented/workflow-event-processor/index.ts (1)
881-882: Per-step retry logic is correct; no cross-engine verification needed.The retry calculation properly prioritizes step-level
retriesover workflow-levelretryConfig.attempts. TheretryCount >= retriescondition is correct: withretries=3, retry attempts progress throughretryCountvalues 0, 1, 2, then 3, giving 4 total attempts (initial + 3 retries), which is the expected behavior.The evented processor implements its own event-driven retry mechanism independent of
executeStepWithRetry—it publishesworkflow.step.runevents via pubsub when retries remain (line 913). Thestep.step.retriesproperty is properly typed asnumber | undefinedacross all step definitions. No additional verification needed.packages/core/src/workflows/evented/evented-workflow.test.ts (1)
5720-5785: Per-step retry override test looks correct and aligned with semanticsThis test accurately exercises the new behavior where
step.retriestakes precedence overworkflow.retryConfig.attempts, and the expectations on execution counts and hydrated error instances are consistent with the existing retry tests and intended API.workflows/inngest/src/index.test.ts (1)
4600-4673: Per-step retry override test is clear and aligned with the new behaviorThis test cleanly validates that
retrieson the step takes precedence overretryConfig.attemptsat the workflow level and also confirms that successful steps aren’t retried unnecessarily. The setup mirrors existing retry tests and should integrate smoothly with the rest of the suite. No changes needed..changeset/add-inngest-step-retries-scorers.md (1)
1-4: LGTM! The frontmatter scope is correct; the description applies only to the listed packages (@mastra/inngestand@mastra/core). Code examples are clear, and the non-breaking change guarantee is well-communicated.packages/core/src/workflows/workflow.test.ts (1)
10624-10686: Test correctly validates per-step retry override behavior.The test structure and logic are sound. It properly verifies that step-level retries (5) take precedence over workflow-level retry config (10 attempts) by asserting 6 total calls (5 retries + 1 initial) instead of the 11 calls that would occur if the workflow config were used. The use of
toMatchObjectfor the step2 assertion appropriately handles the Error instance comparison.packages/core/src/workflows/evented/workflow.ts (3)
89-91: LGTM!The
createStepoverload correctly usesStepParamstype and maintains type safety for the step creation path.
102-106: LGTM!The overload signatures properly extend support for
retriesandscorersoptions across Agent, Tool, and general parameter paths. The type definitions are consistent with theStepinterface that includes these optional fields.Also applies to: 115-117, 127-135
136-136: LGTM!The implementation correctly:
- Uses
instanceofchecks for Agent and Tool detection- Propagates
retriesandscorersfrom the options to the returned Step object in all code pathsAlso applies to: 150-151, 212-212, 224-225, 272-273
workflows/inngest/src/index.ts (3)
36-40: LGTM!The overload signatures are well-structured and consistent with the evented workflow implementation. The
retriesandscorersoptions are properly typed usingDynamicArgument<MastraScorers>.Also applies to: 60-63, 74-77, 91-99
246-261: LGTM!The Tool path correctly extracts and propagates
retriesandscorersfromtoolOptsto the created step.
294-305: LGTM!The fallback path for
StepParamscorrectly propagatesretriesandscorersfrom the params object.packages/core/src/workflows/workflow.ts (5)
70-83: LGTM!Adding
'scorers'to the omitted properties inAgentStepOptionsis correct—it prevents conflicts between agent-level and step-level scorer configuration, ensuring step-levelscorersfromagentOptionstakes precedence.
133-137: LGTM!The overload signatures are consistent with the Inngest and evented workflow implementations, properly supporting
retriesandscorersoptions across all step creation paths.Also applies to: 157-157, 168-168, 197-205
208-227: LGTM!The Agent path correctly:
- Casts options with proper type annotation
- Extracts
retriesandscorersvia destructuring- Passes remaining options to agent methods
- Propagates
retriesandscorersto the returned step
363-379: LGTM!The Tool path correctly handles the options extraction and propagation of
retriesandscorers.
819-831: LGTM!The final step creation path correctly includes
retriesandscorersfrom theparamsobject, completing the coverage across allcreateStepcode paths.
## Description <!-- Provide a brief description of the changes in this PR --> Add support for `retries` and `scorers` parameters across all `createStep` overloads in all workflow engines ## Related Issue(s) <!-- Link to the issue(s) this PR addresses, using hashtag notation: mastra-ai#123 --> [mastra-ai#9351](mastra-ai#9351) ## Type of Change - [ ] Bug fix (non-breaking change that fixes an issue) - [x] New feature (non-breaking change that adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) - [ ] Documentation update - [ ] Code refactoring - [ ] Performance improvement - [x] Test update ## Checklist - [ ] I have made corresponding changes to the documentation (if applicable) - [x] I have added tests that prove my fix is effective or that my feature works <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * Added per-step retry configuration that overrides workflow-level retry settings, providing finer control over individual step failure handling. * Enabled scorer configuration across all step types (agent-based, tool-based, and standard steps) for AI-based evaluation. * Both enhancements are fully backward compatible; existing steps remain unchanged. <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: intojhanurag <aojharaj2004@gmail.com> Co-authored-by: dane-ai-mastra[bot] <199668360+dane-ai-mastra[bot]@users.noreply.github.com>
## Description Fixes 11 failing tests in the Inngest workflow package caused by recent changes to error handling, type improvements, and timing-sensitive test design. What Was Broken | Issue | Tests Affected | Root Cause | |-------|----------------|------------| | Custom error properties lost during serialization | 2 | Inngest serializer drops non-allowlisted properties | | Step execute spy assertions failing | 4 | `.bind()` breaks Vitest spy tracking | | Module mocks not intercepting calls | 3 | Import captured before mock applied | | Cron schedule tests flaky | 2 | Race condition between test and cron registration | --- Details 1. Custom Error Properties Lost During Serialization PR #11495 simplified error handling in `wrapDurableOperation` to rethrow errors directly. However, Inngest's internal serializer (`serialize-error-cjs`) only preserves a hardcoded allowlist of properties (`message`, `name`, `stack`, `code`, `cause`). Custom properties like `statusCode`, `responseHeaders`, and `isRetryable` were silently dropped. 2. Step Execute Spy Assertions Failing Commit 69136e7 added `.bind(params)` to execute functions in `createStepFromParams()`. This created a wrapper function that broke Vitest's spy tracking, causing `toHaveBeenCalledTimes()` assertions to fail with `TypeError: [Function bound Mock] is not a spy`. 3. Module Mocks Not Intercepting Calls The `serve.ts` module imports `inngestServeHono` at load time, capturing the real function before `vi.mock()` could replace it. Tests asserting on the mock were actually calling the real implementation. 4. Cron Schedule Tests Flaky Tests scheduled a cron for 1 minute in the future, but Inngest dev server sync delays sometimes caused the scheduled minute to pass before registration completed. --- Fixes Applied | Issue | Fix | |-------|-----| | Error serialization | Restored error wrapping pattern using `cause` object, which is in the serialization allowlist and preserves custom properties via `toJSON()` | | Spy references | Store reference to original mock before passing to `createStep()`, assert on that reference | | Module mocks | Use `vi.hoisted()` to create mocks before module imports execute | | Cron timing | Changed to every-minute schedule (`* * * * *`) with polling loop | ## Related Issue(s) <!-- Link to the issue(s) this PR addresses, using hashtag notation: Fixes #123 --> ## Type of Change - [x] Bug fix (non-breaking change that fixes an issue) - [ ] New feature (non-breaking change that adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) - [ ] Documentation update - [ ] Code refactoring - [ ] Performance improvement - [x] Test update ## Checklist - [ ] I have made corresponding changes to the documentation (if applicable) - [x] I have added tests that prove my fix is effective or that my feature works <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Bug Fixes** * Improved error handling so failures preserve custom error details and include serialized context and timestamps for more reliable diagnostics. * **Tests** * Reworked tests to use shared, reusable executor and server mocks. * Replaced brittle time-based waits in scheduling tests with polling-based verification and adjusted timeouts for reliability. * **Chores** * Added a patch changeset documenting the serialization fix. <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Ward Peeters <ward@coding-tech.com>
Description
Add support for
retriesandscorersparameters across allcreateStepoverloads in all workflow enginesRelated Issue(s)
#9351
Type of Change
Checklist
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.