Skip to content

feat: WAIT_FOR_WEBHOOK system task — full implementation (epic #888)#894

Draft
nthmost-orkes wants to merge 8 commits into
conductor-oss:mainfrom
nthmost-orkes:feat/wait-for-webhook-foundation
Draft

feat: WAIT_FOR_WEBHOOK system task — full implementation (epic #888)#894
nthmost-orkes wants to merge 8 commits into
conductor-oss:mainfrom
nthmost-orkes:feat/wait-for-webhook-foundation

Conversation

@nthmost-orkes
Copy link
Copy Markdown
Contributor

@nthmost-orkes nthmost-orkes commented Mar 19, 2026

Summary

Implements the complete `WAIT_FOR_WEBHOOK` system task for Conductor OSS (epic #888).

Originally planned as 6 separate PRs, consolidated into a single branch since the pieces are not independently functional. A precision pass and adversarial code review were done after the initial implementation, folded in here.

  • Task foundation — `TaskType.WAIT_FOR_WEBHOOK`, `WaitForWebhookTaskMapper`, `WaitForWebhookTask` (start/execute/cancel), `WebhookTaskDAO` interface + `InMemoryWebhookTaskDAO`
  • Webhook config CRUD — `WebhookConfig` model (ported from Orkes, no orgId), `WebhookConfigDAO` interface + `InMemoryWebhookConfigDAO`, `WebhookConfigService`, `WebhooksConfigResource` (`/api/metadata/webhook`)
  • Hash routing — `WebhookHashingService` with two-mode hash computation; hash format identical to Orkes Enterprise
  • Verification — `WebhookVerifier` interface, `HeaderBasedVerifier`; extensible via Spring beans
  • Inbound endpoint — `IncomingWebhookService` (synchronous, no queue), `IncomingWebhookResource` (`POST /webhook/{id}`, `GET /webhook/{id}`)
  • `workflowsToStart` — starts new workflow instances on inbound events
  • Redis persistence — `RedisWebhookTaskDAO` + `RedisWebhookConfigDAO` for multi-node deployments
  • Tests — 15 unit tests

Orkes Enterprise convergence

Designed as a drop-in replacement for the Orkes Enterprise `WAIT_FOR_WEBHOOK` implementation. The `WebhookConfig`, `WebhookTaskDAO`, and `WebhookConfigDAO` interfaces live in `conductor-common` so Orkes can depend on them without pulling in the full `webhook-task` module.

Verified identical to Orkes Enterprise:

  • Hash format `workflowName;version;taskRefName;sortedValues` and delimiter `;`
  • Sort order (TreeSet on JSONPath keys), JSONArray handling, wildcard detection (`$` prefix)
  • DO_WHILE `__N` suffix stripping at task-registration time
  • Matcher index key format — identical to `PostgresWebhookDAO.createMatchers()`
  • `urlVerified` and `secretValue` preservation on update
  • Secret masking (`"***"`) on `getWebhook(id)` and `getAllWebhooks()`
  • DAO interfaces have no orgId params — isolation is the implementation's concern
  • `orgContextProvider.applyContext(id)` called before any DAO access
  • Task output: payload keys spread directly into `outputData` via `putAll()`
  • DAO put happens before `setStatus(IN_PROGRESS)` so a DAO failure leaves task status unchanged
  • `isAsync()` returns `false`
  • Null matcher criteria guarded with explicit skip
  • `GET /webhook/{id}`: null ping response now dispatches inline (GET events delivered as query parameters are routed to waiting tasks and `workflowsToStart`)
  • `POST /webhook/{id}`: config-not-found returns HTTP 200 (empty body) to prevent provider retry storms

What the Orkes adapter layer needs to supply (see `orkes-io/orkes-conductor#3535`):

  1. `OrkesWebhookOrgContextProvider implements WebhookOrgContextProvider`
  2. `OrkesRedisWebhookTaskDAO implements WebhookTaskDAO`
  3. `OrkesPostgresWebhookConfigDAO implements WebhookConfigDAO`

Correctness fixes (adversarial review)

  1. GET ping verification gap — `handlePing` null branch now calls `verifier.verify()` before dispatching; previously unauthenticated GET requests could trigger dispatch.
  2. Transaction ordering — removed redundant `webhookTaskDAO.remove()` in `completeTask`; `popAll` already removed the task ID, the stale remove masked `updateTask` failures.
  3. Early verifier validation — `WebhooksConfigResource` rejects configs at creation time if no verifier implementation is registered for the type (e.g. STRIPE, TWITTER).
  4. Payload key collision warning — `parsePayload` logs WARN when a query parameter overwrites a body field of the same name.
  5. Redis `popAll` atomicity — `RedisWebhookTaskDAO` overrides `popAll` with SMEMBERS+DEL, narrowing the race window vs. the default SMEMBERS + N×SREM.

Known issues — deferred (documented in #888)

Yellow: Hash normalization asymmetry

Registration hash appends expected value (e.g. `"PENDING"`); inbound hash appends extracted value (e.g. `"pending"`). Tasks with case-differing match values silently never complete. Fix requires coordinated OSS + Orkes deploy with a migration window for in-flight tasks.

Red: Hash delimiter collision

Routing key uses unescaped `;` as delimiter. Workflow names or task ref names containing `;` produce ambiguous hashes. Fix requires data migration of all `WEBHOOK_TASKS.` keys + coordinated Orkes deploy. Low probability in practice.

Intentional differences from Orkes Enterprise

Synchronous vs async processing

OSS completes matching tasks inline within the HTTP request. Orkes uses `_webhook_queue` + `WebhookWorker`. OSS approach is correct for single-node deployments; use `RedisWebhookTaskDAO` for multi-node.

`POST /webhook/{id}` — body is optional

OSS defaults to `"{}"` when body is absent. Orkes requires a body. More permissive intentionally — some providers (Stripe, GitHub) send POST pings with no body.

Test plan

  • `./gradlew :conductor-webhook-task:test` — 15 unit tests pass
  • `./gradlew :conductor-redis-persistence:compileJava` — compiles cleanly
  • Create a webhook config via `POST /api/metadata/webhook`
  • Define a workflow with a `WAIT_FOR_WEBHOOK` task and `matches` input
  • Start workflow — verify task parks `IN_PROGRESS`
  • `POST /webhook/{id}` with matching payload — verify task completes and workflow advances
  • `POST /webhook/{id}` with non-matching payload — verify no tasks complete
  • Test `workflowsToStart` — verify new workflow instances start on event
  • Test `HEADER_BASED` verifier — verify bad secret returns 400

🤖 Generated with Claude Code

nthmost-orkes and others added 5 commits March 19, 2026 01:20
… executing loop tasks

Fixes conductor-oss#876. When a DO_WHILE task uses list iteration (via the `items` field or
`_items` inputParameter) and the evaluated list is empty, the task previously
scheduled and executed iteration 1 unconditionally. Now it short-circuits to
COMPLETED before scheduling any loop tasks.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…output key cleanup off-by-one

Two follow-up corrections to the empty-items DO_WHILE fix:

1. Add `addOutput("iteration", 0)` before completing with empty items list,
   so downstream expressions like `${doWhileTask.output.iteration}` get 0
   instead of null.

2. Fix `keepLastN` in-memory output key cleanup: `IntStream.range(0, N-k-1)`
   was generating keys "0".."N-k-2" (mostly non-existent), leaving old
   iteration output in memory. Corrected to `IntStream.rangeClosed(1, N-k)`
   which matches the DB cleanup in `removeIterations()`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… cleanup

- testExecute_NonEmptyItemsList_SchedulesFirstIteration: ensures the
  empty-items early-exit doesn't accidentally block non-empty lists

- testKeepLastN_OutputCleanup_RemovesMultipleKeys: exercises
  iteration=5, keepLastN=2 to confirm "1","2","3" are all removed and
  "4","5" are retained (the previous test only removed 1 key)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…, PR 1)

Introduces the module structure and core lifecycle classes for
WAIT_FOR_WEBHOOK — a system task that pauses a workflow until a
matching inbound webhook event arrives.

This PR establishes the foundation:
- TaskType.WAIT_FOR_WEBHOOK enum value + string constant
- WaitForWebhookTaskMapper (core): resolves input, creates task IN_PROGRESS
- webhook-task module: WaitForWebhookTask system task, WebhookTaskDAO
  interface, InMemoryWebhookTaskDAO default implementation
- Build wiring: settings.gradle, server/build.gradle
- 24 unit tests covering mapper, task lifecycle, and DAO operations
- Example workflow: order_payment_workflow.json

WebhookTaskDAO and WebhookVerifier are defined as Spring-pluggable
interfaces so Orkes Enterprise can substitute durable backends
(Postgres, Redis) without modifying OSS code.

Subsequent PRs add: config CRUD (conductor-oss#889), hash computation (conductor-oss#890),
inbound REST endpoint (conductor-oss#892), workflowsToStart (conductor-oss#891), docs (conductor-oss#893).

Closes part of conductor-oss#888.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@nthmost-orkes nthmost-orkes force-pushed the feat/wait-for-webhook-foundation branch from feac8b3 to a66d94f Compare March 19, 2026 20:20
@nthmost-orkes nthmost-orkes changed the title feat: WAIT_FOR_WEBHOOK foundation — task type, mapper, system task, DAO (epic #888 PR 1) feat: WAIT_FOR_WEBHOOK system task — full implementation (epic #888) Mar 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant