feat(webhooks-oss): WAIT_FOR_WEBHOOK end-to-end — port + persistence across 5 backings#1106
feat(webhooks-oss): WAIT_FOR_WEBHOOK end-to-end — port + persistence across 5 backings#1106nthmost-orkes wants to merge 80 commits into
Conversation
Lifts the new webhooks-oss/ module created by orkes-io/orkes-conductor#3612 (scheduler-style OSS/enterprise split) into conductor-oss. All io.orkes.conductor.* packages translated to org.conductoross.conductor.* per repo convention. Replaces the prior webhook-task/ module (deleted on main, only stale build/ remnants existed) with a structurally faithful copy of webhooks-oss from orkes-conductor. Ported - 17 main + 7 test files comprising the webhooks-oss module (verifiers, hashing, IncomingWebhookService, WebhookTaskMapper, WebhookWorkerProperties, etc.) - 10 supporting classes (EventMessage, ErrorList, Tag, WebhookConfig, IncomingWebhookEvent, WebhookExecutionHistory, WebhookDAO, WebhookTaskService, TargetWorkflowCollector, TimeBasedUUIDGenerator) into common/ and core/ Code rewrites beyond mechanical namespace translation 1. ApplicationException -> NonTransientException (2 sites in IncomingWebhookService, 3 sites in TimeBasedUUIDGenerator). OSS retired ApplicationException. 2. TimeBasedUUIDGenerator stripped of multi-tenant OrkesRequestContext lookups. generate() still produces real time-based UUIDs via log4j UuidUtil. generate(long) (test-only, unused in OSS) falls back to UUID.randomUUID() with TODO. getOrgId() returns "_". 3. EventMessage: removed orgId field and List<ExtendedEventExecution> eventExecutions field (transitive class not ported; orgId is tenant context). 4. IncomingWebhookService: replaced two executionDAOFacade.addEventMessage() calls with log.warn(). OSS's ExecutionDAOFacade has no addEventMessage (Orkes-only audit feature). Rejected events still observable via logs; audit-table persistence is a future enhancement. 5. Deleted 3 postgres DAO tests (AbstractTestDAO, PostgresDAOTestUtil, PostgresWebhookTaskServiceTest). They exercise a PostgresWebhookTaskService impl that has not been ported yet. 6. Inlined FeatureFlags.SECURITY as literal "conductor.security.enabled" in 5 verifier tests. Avoided porting a 20-line constants class for one string reference. 7. TargetWorkflowCollectorTest: org.apache.groovy.util.Maps.of(...) -> java.util.Map.of(...). Same semantics. 8. WebhookTaskMapper relocated from com.netflix.conductor.core.execution.mapper to org.conductoross.conductor.tasks.webhook (per repo namespace convention). Added explicit imports for TaskMapper and TaskMapperContext. 9. webhooks-oss/build.gradle adds commons-lang3, spring-web (test), and spring-boot-starter-web (compileOnly) for the at-compile-time imports. Wiring - settings.gradle: include 'webhooks-oss' - server/build.gradle: implementation project(':conductor-webhooks-oss') Verified - ./gradlew :conductor-webhooks-oss:compileJava + compileTestJava: clean - ./gradlew :conductor-webhooks-oss:test: all green - ./gradlew :conductor-server:compileJava: clean (full server tree) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…d webhook Adds the runtime pieces needed to receive a webhook over HTTP and store it in the queue for later dispatch. The dispatch worker and config-CRUD API land in the next iteration. Added - InMemoryWebhookDAO: lifted from the prior feat/webhook-foundation attempt, extended to satisfy the richer WebhookDAO interface ported from orkes. Annotated @component @ConditionalOnMissingBean(name= "webhookDAO") so a persistent impl can override. - InMemoryWebhookTaskService: lifted similarly, with @component @ConditionalOnMissingBean(name="webhookTaskService"). - IncomingWebhookResource: tenant-free OSS port of the webhooks-enterprise REST controller. POST /webhook/{id} and GET /webhook/{id}. ~60 lines. Code rewrite - IncomingWebhookService: swapped constructor-injected TimeBasedUUIDGenerator for IDGenerator (already a @bean in ConductorCoreConfiguration). Removes the @ConditionalOnProperty wiring trap and avoids needing conductor.id.generator=time_based. Webhook event IDs are now random UUIDs rather than time-based UUIDs, which is fine for the queue path that uses them. Matchers note InMemoryWebhookDAO.createMatchers/getMatchers/removeMatchers are no-ops. Orkes' impl computes matchers by reading WorkflowDefs and extracting inputParameters.matches from WAIT_FOR_WEBHOOK tasks, which requires MetadataDAO injection. Inert until a worker consumes them, so deferred with an inline comment. Verified - :conductor-webhooks-oss:compileJava + compileTestJava: clean - :conductor-webhooks-oss:test: all green - :conductor-server:compileJava: clean Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…CRUD Adds the management surface so webhook configs can be registered before they receive events. Combined with the previous commit's REST receive endpoint, this is the full register-then-receive arc. Added - WebhookConfigService: tenant-free OSS port of orkes' enterprise service. CRUD over WebhookDAO, calls TargetWorkflowCollector to populate matchers, sanitizes secrets on read. No audit logging (AuditUtils is enterprise-only); no FeatureFlags conditional (always enabled in OSS). - WebhookConfigResource: REST controller mounted at /api/metadata/webhook (mirrors orkes path). POST/PUT/GET/DELETE + GET-all. No security annotations (@PreAuthorize / @PostFilter are enterprise-only); no tag endpoints (TagsService is enterprise). Validation rules preserved. Code rewrites vs orkes - ApplicationException(CONFLICT, ...) -> ConflictException - ApplicationException(NOT_FOUND, ...) -> NotFoundException - ApplicationException(INVALID_INPUT, ...) -> NonTransientException - TimeBasedUUIDGenerator -> IDGenerator (consistent with prior commit) - Dropped AuditUtils, FeatureFlags.WEBHOOKS conditional, all security annotations, all tag endpoints - Dropped createdBy = OrkesAuthentication.getAuthenticatedUser().getId() since OSS has no authenticated-user concept here Verified - :conductor-webhooks-oss:compileJava + compileTestJava: clean - :conductor-webhooks-oss:test: all green - :conductor-server:compileJava: clean Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…atch
Completes the end-to-end webhook arc. An OSS server can now:
1. Register a webhook config (POST /api/metadata/webhook)
2. Receive an incoming webhook (POST /webhook/{id})
3. Verify the signature
4. Store the event + enqueue
5. The worker polls the queue, matches against waiting WAIT_FOR_WEBHOOK
tasks via the hash service, completes those tasks, and starts any
workflows declared in workflowsToStart.
Added
- WebhookWorker: OSS port of the Orkes Enterprise WebhookWorker. Plain
@component with @PostConstruct-started ScheduledExecutorService and
@PreDestroy shutdown (no LifecycleAwareComponent base, no
MetricsCollector / Monitors, no OrkesRequestContext orgId switching,
no ExtendedEventExecution audit bookkeeping). Failures log instead
of persist. ~250 lines.
Changes
- InMemoryWebhookDAO.createMatchers now actually computes matchers:
injects MetadataDAO, reads WorkflowDef by name+version, finds
WAIT_FOR_WEBHOOK / WAIT tasks, extracts inputParameters.matches and
keys them by workflowName;version;taskRef. Mirrors orkes'
PostgresWebhookDAO.createMatchers minus the orgId prefix and
workflowDefToUpdateMap cache.
Code rewrites vs orkes WebhookWorker
- LifecycleAwareComponent -> plain @component with @PostConstruct /
@PreDestroy
- MetricsCollector, Monitors.error -> log statements
- OrkesRequestContext.setOrgId per message -> dropped
- recordEventExecution(...) (writes ExtendedEventExecution audit
records via ExecutionDAOFacade.updateEventExecution) -> log
statements. ExtendedEventExecution / ExtendedEventHandler are not
ported to OSS.
- executionDAOFacade.addEventMessage(...) -> dropped (method doesn't
exist in OSS; same call previously stubbed in IncomingWebhookService)
- orkesRedisExecutionDAO.getTask(taskId) -> executionDAOFacade.getTaskModel(taskId)
- WorkflowConsistency.DURABLE -> dropped (enum doesn't exist in OSS)
- ApplicationException -> dropped (we already log + throw via log line)
Verified
- :conductor-webhooks-oss:compileJava + compileTestJava: clean
- :conductor-webhooks-oss:test: green
- :conductor-server:compileJava: clean
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds unit-test coverage for the four OSS-original classes added in prior commits on this branch. None of these had tests previously. Added - WebhookConfigServiceTest (8 tests): id generation, conflict on existing id, fresh-id passthrough, secret sanitization on read, redacted-secret preservation on update, real-secret overwrite, TargetWorkflowCollector delegation, removeWebhook ordering. - InMemoryWebhookDAOTest (8 tests): CRUD for configs and events, plus the matchers computation logic added in the previous commit: null override stores empty, missing WorkflowDef skipped, real WAIT_FOR_WEBHOOK task with matches stored under workflowName;version;taskRef, task without matches skipped, non-webhook task skipped, removeMatchers drops. - WebhookConfigResourceTest (8 tests): validation (missing all three targets, HEADER_BASED without headers), valid delegation, path-id override on update, 404 on missing get/delete, secret sanitization on get. - WebhookWorkerTest (6 tests): event-not-found and config-not-found early returns, workflowsToStart invokes WorkflowExecutor with the right StartWorkflowInput (name, version, input merge, event name, createdBy), non-integer version skipped, matcher hit completes the waiting WAIT_FOR_WEBHOOK task and removes it from the task service, terminal task is not re-completed. Change - WebhookWorker.handleMessage made package-private so the test can invoke it directly without driving the ScheduledExecutorService. Verified - :conductor-webhooks-oss:test passes 62 tests, 0 failures Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the last untested link in the integration chain. When a workflow reaches a WAIT_FOR_WEBHOOK task, the Webhook WorkflowSystemTask's start() must register the task with WebhookTaskService so the worker can find it via hash later. - WebhookTest (1 test): verifies start() calls webhookTaskService.put(taskModel, workflow.getWorkflowVersion()) and sets status to IN_PROGRESS. Verified: :conductor-webhooks-oss:test 63/63 pass.
Exercises the full webhooks-oss bean graph end-to-end with only the WorkflowExecutor, QueueDAO, ExecutionDAOFacade, MetadataDAO, and ParametersUtils mocked. Validates the integration story without needing a full @SpringBootTest (which would require a circular dep on test-harness). Added - WebhooksOssEndToEndTest (4 tests): - register_then_receive_storesEventAndPushesQueue: full POST path from REST controller through config DAO → IncomingWebhookService → verification → event store → queue push. - receive_unknownWebhookId_returnsNullAndNoEnqueue: idempotent-safe drop when no config is registered. - register_workflow_completion_path_end_to_end: the load-bearing one. Wires WorkflowDef → matchers computation → system-task put() → receive → worker.handleMessage → workflowExecutor.updateTask with status=COMPLETED. Validates the full task-completion chain. - receive_failedVerification_throwsAndNoEnqueue: verifier rejects, NonTransientException thrown, no queue push. Wires real instances of InMemoryWebhookDAO, InMemoryWebhookTaskService, TargetWorkflowCollector, WebhookConfigService, IncomingWebhookService, IncomingWebhookResource, WebhookHashingService, WebhookWorker, and Webhook. Uses a stub verifier that rejects bodies starting with "REJECT" to drive the failure path deterministically. Total tests on this branch now 67 across 9 test classes, 0 failures.
… server Started server-lite, registered a webhook config, delivered an HMAC-signed event, watched it dispatch into a workflow that completed. Found and fixed four issues along the way. Bug 1: REST routes missing /api prefix IncomingWebhookResource mounted at /webhook; OSS convention is /api/... (see RequestMappingConstants.API_PREFIX). Without the prefix, requests fell through to the static resource handler and returned 500. Fixed: /webhook -> /api/webhook. Bug 2: @PathVariable name inference Bare @PathVariable String id failed at runtime with "parameter name information not available via reflection. Ensure that the compiler uses the '-parameters' flag." OSS controllers explicitly name each binding (see e.g. MetadataResource). Fixed: all @PathVariable in IncomingWebhookResource and WebhookConfigResource now use explicit ("id") names. Bug 3: Mutation of stored WebhookConfig on read WebhookConfigService.getWebhooks() and WebhookConfigResource.getWebhook() mutated the returned config's secretValue in place to redact it. With the in-memory DAO that returns shared references, this corrupted the persisted secret. Next webhook delivery read secretValue=*** and HMAC verification failed with "Illegal base64 character 2a" trying to decode the *** placeholder. Fix: WebhookConfig gained toBuilder=true; both sites now clone-then-redact instead of mutating. Added regression assertions in WebhookConfigServiceTest and WebhookConfigResourceTest. Bug 4: server-lite did not depend on webhooks-oss Only server/build.gradle pulled in :conductor-webhooks-oss. server-lite is the lighter local-dev target; users running it via :conductor-server-lite:bootRun would not get the webhooks module loaded at all. Fixed: added the dep alongside the other system tasks (http-task, json-jq-task, kafka). Verified end-to-end against running server 1. POST /api/metadata/webhook -> 200, returns config with generated id 2. GET /api/metadata/webhook/{id} -> 200, secret redacted to *** 3. POST /api/metadata/workflow -> 200, register wf-smoke v1 4. POST /api/webhook/{id} with HMAC-SHA256 signed body -> 200 5. WebhookWorker dequeued the event, invoked WorkflowExecutor.startWorkflow, wf-smoke v1 ran to COMPLETED in 655ms Tests: 68 across 9 classes, 0 failures.
…ct bugs, dead code
Adversarial review surfaced three blockers before this PR can flip from
draft to reviewable. All addressed in this commit.
License headers — would have failed spotlessJavaCheck on first CI build
42 files reformatted by spotlessApply. Every "Copyright 2022 Orkes, Inc."
+ "Orkes Enterprise License" header now Apache 2.0 / Conductor Authors.
TimeBasedUUIDGenerator had no header at all — fixed. Import ordering
normalized across the affected modules per repo Spotless config.
WebhookConfigService.updateWebhook NPE on missing id
PUT /api/metadata/webhook/{id} with an id that doesn't exist would
dereference null and 500 instead of 404. Added an explicit
NotFoundException at the top of the method, matching the pattern in
WebhookConfigResource.getWebhook + deleteWebhook. Regression test added.
IncomingWebhookService.handleWebhook 200/null masquerade for unknown id
POST /api/webhook/{id} to an unregistered webhook id returned HTTP 200
with empty body — looked successful to the caller. Now throws
NotFoundException so the global exception mapper returns 404. Updated
WebhooksOssEndToEndTest accordingly.
Dead code
- IncomingWebhookService no longer builds EventMessage objects that
were discarded immediately (left over from the addEventMessage ->
log.warn rewrite). 28 lines removed.
- Dropped unused ExecutionDAOFacade dependency — constructor down from
5 to 4 args.
- Dropped unused EventMessage / DEAD_LETTER_QUEUE imports.
Verified
- :conductor-webhooks-oss:compileJava + compileTestJava: clean
- :conductor-webhooks-oss:spotlessCheck: clean
- :conductor-webhooks-oss:test: 69 tests / 0 failures
…dversarial review Addresses the should-fix items surfaced by the pre-review adversarial pass. WebhookTaskMapper relocated for OSS convention org.conductoross.conductor.tasks.webhook.WebhookTaskMapper -> org.conductoross.conductor.webhook.tasks.mapper.WebhookTaskMapper Matches the AI module's precedent (org.conductoross.conductor.ai.tasks.mapper.*) and collocates the mapper with the rest of the webhook code instead of splitting it across two top-level packages. Dead code in TimeBasedUUIDGenerator Dropped generate(long) and getOrgId(String) — both had TODO stubs with no OSS callers (verified by grep). Net: 16 lines removed, generate() gets an @OverRide annotation, three "OSS-OrkesRequestContext-removed" TODOs gone. The remaining methods (generate() and getDate(String)) are the only ones reachable in OSS. Build.gradle hygiene - compileOnly 'spring-boot-starter-web' -> implementation: the module's REST controllers use @RestController/@RequestMapping etc. at runtime, not just compile time. compileOnly only worked because :conductor-server happens to pull starter-web transitively; module would be non-functional if consumed standalone. - Same upgrade for spring-boot-autoconfigure (used by @ConditionalOnMissingBean). - stripe-java 29.4.0 and sendgrid-java 4.10.2 hardcoded -> bound to new revStripe and revSendgrid in dependencies.gradle. Added a PINNED comment explaining they're webhook-verifier-specific clients with no shared rev. - Dropped now-orphan test deps: spring-web (no longer needed after IncomingWebhookService dropped HttpHeaders test usage indirectly), postgres-persistence, testcontainers (postgresql + base), HikariCP, duplicate spring-retry. The DAO tests these supported moved to postgres-persistence in the parent split PR. WebhookWorker.recordHistory off-by-one if (hist.size() > lastRunWorkflowIdSize) -> >=. Previous form let the list grow to N+1 before trimming, contradicting the property name. Missing test coverage - IncomingWebhookResourceTest (2 tests): delegation + return value passthrough for both handleWebhook and handlePing endpoints. - InMemoryWebhookTaskServiceTest (6 tests): put/get hash semantics, missing matches throws, bucketing of multiple tasks at the same hash, remove preserves vs drops bucket, unknown hash returns empty. WebhookConfigServiceTest extended with the NotFoundException-on-missing-id regression (mirrors the contract fix in round 1). Verified - :conductor-webhooks-oss:compileJava + compileTestJava clean - :conductor-webhooks-oss:spotlessCheck clean - :conductor-webhooks-oss:test 76 tests / 0 failures - :conductor-server:compileJava clean
…utation Two production-impacting bugs from the adversarial review. WebhookWorker.pollAndExecute no longer acks on failure Previously a try/finally acked unconditionally, so any exception in handleMessage (DB blip, OOM, NPE) silently dropped the webhook event. The TODO comment "retries are not yet modelled" was not a fix. Now: ack only after handleMessage returns normally. On failure, log and let the queue's unack timeout redeliver. Poison messages still get retried, but they hit the underlying QueueDAO impl's retry policy (e.g. postgres queue's max-retry → dead-letter) rather than vanishing. InMemoryWebhookDAO.createMatchers no longer caches stale criteria Previously createMatchers walked the WorkflowDefs at registration time and stored the extracted `matches` criteria. If a workflow def was later updated to change its WAIT_FOR_WEBHOOK task's matches inputParameter, getMatchers kept returning the old criteria until someone re-PUT the webhook config. Now: createMatchers stores only the receiverWorkflowNamesToVersions *targets* (which DO need to be captured at registration time so the expression-based override is preserved). getMatchers looks up the WorkflowDefs from MetadataDAO and extracts matches on every call. Slightly more expensive (one MetadataDAO read per workflow per webhook event), but always correct. Tests - InMemoryWebhookDAOTest gains getMatchers_reflectsWorkflowDefUpdates_noStaleCache: register matchers, assert criteria, swap the mock to return an updated WorkflowDef, assert getMatchers returns the new criteria without re-running createMatchers. Pins the regression. - 77 tests / 0 failures.
… pollAndExecute Per AGENTS.md preference: use real implementations over mocks. The prior WebhookWorkerTest was 100% mocks. Rewrote to wire real instances of InMemoryWebhookDAO, InMemoryWebhookTaskService, WebhookHashingService, and TargetWorkflowCollector. Only the deep infra is still mocked: QueueDAO, WorkflowExecutor, ExecutionDAOFacade, MetadataDAO, ParametersUtils. The full register→receive→dispatch flow remains covered by WebhooksOssEndToEndTest. This class now focuses on worker-internal semantics that don't surface end-to-end. Added pollAndExecute coverage Made pollAndExecute() package-private (was private — same change pattern already applied to handleMessage for the same reason). Four new tests: - pollAndExecute_emptyBatch_noop: empty pop, no ack called. - pollAndExecute_success_acks: happy path acks the message after handleMessage returns cleanly. - pollAndExecute_handleMessageThrows_doesNotAck: REGRESSION TEST. Prior impl had a try/finally that acked even on failure, silently dropping webhook events on any exception. Test uses an anonymous InMemoryWebhookDAO subclass that throws from getWebhookEvent to simulate a DB blip. - pollAndExecute_mixedBatch_acksOnlySuccesses: batch with one good + one poison message. Only the good one gets acked; the poison one is left for the queue's unack timeout to redeliver. Existing handleMessage tests rewired with real beans - handleMessage_eventNotFound_returnsEarly: stores nothing, calls handleMessage, expects no executor invocation. - handleMessage_configNotFound_returnsEarly: stores event but not config; asserts the event is NOT removed (so retry can happen). - handleMessage_workflowsToStart_invokesExecutor: registers config with workflowsToStart, asserts StartWorkflowInput contents (name, version, input merge, event name) and that the event is removed after success. - handleMessage_workflowsToStart_nonIntegerVersion_skipped: version is a string, no executor call. - handleMessage_matcherHit_completesWaitingTask: registers config + uses real WebhookTaskService.put to register a waiting task; asserts that webhookTaskService.get returns empty after handleMessage completes. - handleMessage_matcherHit_terminalTask_skipped: pre-COMPLETED task is not re-completed. Total: 81 tests across 11 classes, 0 failures.
Per project CLAUDE.md, new user-facing surfaces need source-derived docs.
The webhooks-oss module added /api/metadata/webhook (config CRUD) and
/api/webhook/{id} (event receive) without any documentation; users had no
on-ramp.
Adds docs/.../systemtasks/wait-for-webhook-task.md following the existing
system task doc pattern (wait-task.md template). Contents:
- Task type + how the dispatch chain works (mapper, hash, worker)
- inputParameters table (matches)
- WebhookConfig registration: endpoint, body, verifier types table
(HMAC_BASED, SIGNATURE_BASED, HEADER_BASED, SLACK_BASED, STRIPE,
TWITTER, SENDGRID — pulled from the actual Verifier enum)
- Delivery endpoint: status codes, response semantics
- Verified end-to-end curl example: same flow exercised during the
smoke test (register config, register workflow def, deliver HMAC-signed
event)
- conductor.webhook.worker.* properties table (pulled from
WebhookWorkerProperties.java)
- Failure semantics: verification failure / worker dispatch failure /
rejected event logging — matches the round-3 ack-only-on-success
worker change.
Registered in mkdocs.yml nav alongside wait-task.md.
…ntion coverage All 5 OSS persistence backends now have a retention mechanism for incoming_webhook_event: Cassandra (CassandraWebhookDAO) - Adds default_time_to_live to incoming_webhook_event at CREATE TABLE time. Cassandra auto-expires rows after the configured window — no scheduled job needed, no tombstone churn beyond what the SSTable layer already manages. - Default TTL: 7 days (matches the SQL cleanup-job default). - Configurable via conductor.webhooks.cleanup.retention-duration on the CassandraConfiguration @bean wiring (Duration → toSeconds for default_time_to_live). - Caveat documented inline: CREATE TABLE IF NOT EXISTS won't update an existing table's TTL; operators changing retention on an existing deployment need to ALTER TABLE manually. - The other 3 tables (webhook, webhook_target_workflows, webhook_hash_to_taskid) don't get TTL — they're durable config / task-routing state, not auditable events. Redis (RedisWebhookCleanupJob) - @component @conditional(AnyRedisCondition.class) @ConditionalOnProperty(conductor.webhooks.cleanup.enabled, matchIfMissing=true). - @scheduled walker on the same cron as the SQL backings: HGETALL on the WEBHOOK_EVENT hash, parse each value as IncomingWebhookEvent, HDEL any whose timeStamp is older than retentionDuration. - Cheap at low cardinality; documented that high-volume deployments should migrate to per-key TTL storage (HEXPIRE landed in Redis 7.4 but isn't ubiquitous yet). - 4 tests with testcontainers Redis: deletes old, keeps recent, empty hash no-op, all-recent kept, timeStamp=0 fixtures preserved. Verified locally: full compile clean across cassandra/redis modules and server. Tests not run locally — both need Docker. CI will validate. Deferred items now all addressed: - ~~Worker DLQ semantics~~ (in #1106) - ~~Matcher cache staleness~~ (in #1106 + #1110) - ~~Postgres impls~~ (#1110) - ~~MySQL/SQLite/Redis/Cassandra adapters~~ (#1110) - ~~WebhookCleanupJob (postgres)~~ (#1110) - ~~MySQL + SQLite cleanup~~ (#1110) - ~~Cassandra TTL + Redis cleanup~~ (this commit) Remaining open: EventMessage audit-table persistence (the addEventMessage → log.warn debt) — separate, larger architectural decision.
…ment Random (workflow, version, taskRef, matches, body) tuples assert the two hash sites in webhooks-oss canonicalize identically: InMemoryWebhookTaskService (registration) and WebhookHashingService.computeJsonHash (inbound). Catches silent divergence that would cause inbound events to miss registered tasks. 100 reps per direction (match / no-match). Seed defaults to System.nanoTime(); on failure the TestWatcher prints -Dwebhook.hash.seed=<n> for reproducibility.
… all verifiers For each of HMAC_BASED, SIGNATURE_BASED, HEADER_BASED, SLACK_BASED, STRIPE, TWITTER: register a workflow def with a WAIT_FOR_WEBHOOK task, register a webhook config with a fresh secret, start the workflow, fire a signed event via the emitter, and poll until COMPLETED. Verified 6/6 PASS against conductor-server-lite running on loki with this PR's webhooks-oss module (conductor-oss/conductor#1106). Per-verifier config quirks (HEADER_BASED headers map, SLACK_BASED urlVerified, STRIPE api_version, TWITTER header name) handled per-case from source. Not in conductor's CI — too slow/flaky for the main test suite. This is the manual / on-demand counterpart to the in-process tests in webhooks-oss.
|
Re-validated end-to-end: built conductor at |
Stripe events whose body omits api_version (older API payloads, synthetic test payloads, and apparently some real Stripe deliveries) NPE on rawApiVersion.split(...). The signature has already been verified at this point, and both branches of the apiVersion check ultimately return ErrorList.empty(), so a missing api_version is treated as "old api / no deserializer support" — same as the dated pre-deserializer path. Surfaced by the smoke harness firing synthetic Stripe-shaped bodies; saw 500 NPE in the verifier path. Adds StripeVerifierTest (3 cases): null api_version accepted, bad signature rejected, missing header rejected. Orkes has identical code in io.orkes.conductor.webhook.verifier.StripeVerifier — mirror fix needed upstream to keep parity.
Stacks N workflow instances on the same WAIT_FOR_WEBHOOK matcher, fires a single signed event, asserts all N reach COMPLETED. Exercises the InMemoryWebhookTaskService bucket logic from the queue/worker level — the unit test multiple_tasks_sameHash_bucketed pins it in-process; this proves it through the real HTTP + dispatch path. Verified against conductor-oss/conductor#1106 @ 38c40c5c3: N=20 → 20/20 in 0.67s (median 0.26s, max 0.67s) N=100 → 100/100 in 3.01s (median 2.19s, max 3.01s) HMAC_BASED only — per-verifier cross-cut stays in smoke.py.
Three new conductor-smoke scripts and one new emitter endpoint:
conductor-smoke/multi_verifier_smoke.py
Three webhook configs (HMAC, signature, header) all bound to the same
workflow as receivers. Each fire spawns a fresh workflow instance that
completes. Exercises matcher recomputation over overlapping receivers.
conductor-smoke/replay_smoke.py
Fires the same signed event twice — documentation-mode for current
behavior. Confirms: no inbound dedup, terminal-task-skip prevents
double-dispatch errors, bucket dispatches to all matching tasks.
conductor-smoke/dual_emitter_smoke.py
Two emitters on different hosts fire identically-signed events at the
same conductor at the same instant (threading.Event barrier). Tests
cross-network body fidelity and worker behavior under concurrent ingress.
scenarios.py / POST /scenarios/wait-for-webhook
Server-side bundle of register-workflow → register-webhook → start →
sign → fire → poll. Returns ok/final-status/per-phase timings. Lets
end users drive a full smoke from one curl, no Python driver needed.
Mounted optionally via include_router — fails open if scenarios.py
can't import.
Verified against conductor-oss/conductor#1106 @ 38c40c5c3:
- multi_verifier: 3/3 PASS
- replay: behaves as expected (no dedup, terminal skip)
- dual_emitter: PASS (mac@localhost:8767 + loki:8766 → loki:7001)
- /scenarios call: 200 OK, COMPLETED in 2.45s including 2.02s poll
…ervice Production multi-node backing for the WebhookDAO + WebhookTaskService interfaces. Designed clean for OSS (no orgId anywhere in the schema or queries) and applies the matcher-cache fix from the parent PR's Round 3: target workflows are persisted, match criteria are recomputed from MetadataDAO on every getMatchers() call so WorkflowDef updates take effect without re-registering the webhook. Stacked PR — base is feat/webhooks-from-orkes-split. Don't merge until the parent (#1106) lands. After parent merges this rebases onto main. Added - V16__webhook.sql: four tables — webhook, incoming_webhook_event, webhook_target_workflows, webhook_hash_to_taskid. Renamed webhook_matchers -> webhook_target_workflows to reflect that we store target workflow ids (override snapshot), NOT pre-computed matchers. Index on incoming_webhook_event(created_on) for the cleanup follow-up. - org.conductoross.conductor.postgres.dao.PostgresWebhookDAO: 9 interface methods + private computeMatchers helper. Extends PostgresBaseDAO. ON CONFLICT DO UPDATE for upserts. - org.conductoross.conductor.postgres.dao.PostgresWebhookTaskService: 3 interface methods (put/get/remove). ON CONFLICT DO NOTHING on put for idempotency. - PostgresWebhookDAOTest (8 tests with testcontainers + @MockBean MetadataDAO), PostgresWebhookTaskServiceTest (6 tests with testcontainers). Critical regression test: getMatchers_recomputesFromMetadataDAO_onEachCall_noStaleCache. Shared hash extraction - WebhookTaskHashing helper added to core (org.conductoross.conductor.service.webhook). Static methods that both InMemory and Postgres impls use, so tasks registered via one backing are findable by hash via any other. Validates matches before composing the hash (regression fix in this commit: NPE in removeIterationFromTaskRefName would mask the missing-matches exception). - InMemoryWebhookTaskService refactored to delegate to the helper (-50 LoC). Test surface unchanged. PostgresConfiguration wiring - Two new @bean methods with @dependsOn({"flywayForPrimaryDb"}). Bean names "webhookDAO" and "webhookTaskService" match the @ConditionalOnMissingBean(name=...) on the InMemory impls, so the postgres beans take precedence when this module is on the classpath. Always-on for now (no @ConditionalOnProperty); the whole postgres-persistence module is gated by conductor.db.type=postgres at the upstream configuration. OSS-side design changes vs orkes' PostgresWebhookDAO - All org_id columns and WHERE clauses dropped. Tables are tenant-free. - ConductorLimitsDAO/LimitUtils quota check dropped (Orkes-only). - Preconditions.checkNotNull replaced with NotFoundException where appropriate; nulls otherwise propagate per OSS convention. - Scheduled refreshMatchers executor dropped — matcher staleness is fixed by design (recompute on read). - workflowDefToUpdateMap cache dropped — same reason. - The "matches" persistence (webhook_matchers table in orkes) replaced with "targets" persistence (webhook_target_workflows). Different shape, same external contract. Verified locally - :conductor-postgres-persistence:compileJava + compileTestJava clean - :conductor-postgres-persistence:spotlessCheck clean - :conductor-webhooks-oss:test green (81 tests, hashing refactor doesn't break anything) - :conductor-server:compileJava clean Note: postgres tests need Docker daemon for testcontainers; not run locally. CI will validate.
Closes #1142 V5 seeded webhook_cleanup_lease.expires_at with a text literal ('1970-01-01T00:00:00'). The xerial sqlite-jdbc driver stores java.sql.Timestamp as INTEGER (epoch millis), and SQLite's type- comparison rule says INTEGER < TEXT — so SqliteWebhookCleanupJob's "expires_at < ?" check (with ? bound as a Timestamp) was always false on a freshly migrated DB. The cleanup job would log "lease held elsewhere" forever and never delete an incoming_webhook_event row. V5 hasn't shipped to main yet, so fix the seed value in place rather than adding a patch migration. Use INTEGER 0 so the seed's storage class matches what the cleanup job writes via setTimestamp(). The test's @before lease reset (added earlier in this PR for between- test isolation, since flyway.clean() leaks rows on the shared sqlite file) remains necessary — it covers a different concern. Trim its comments to drop the now-stale "V5 seed is broken" wording. Postgres and MySQL aren't affected: their native TIMESTAMP types parse the text literal as a real timestamp. No Orkes parallel — sqlite- persistence and the cleanup-lease pattern are both OSS-side additions.
Tag.java was introduced by the webhooks-oss port but serves no purpose in OSS: tags in Orkes power TagsService/AutomaticTagService-based access control, neither of which exists here. The class had no callers other than WebhookConfig.tags.
In Orkes, WebhookConfig.tags is populated at query-time by TagsService from a separate enterprise tags table. OSS has no TagsService, so the field was always null. Dead API surface that would mislead users.
EventMessage was added to common to mirror Orkes' webhook event audit trail, but OSS has no ExecutionDAO.addEventMessage() and IncomingWebhookService never references it. Zero callers anywhere in OSS — pure dead code. The event audit feature (DLQ persistence for rejected/unmatched webhook events) requires a separate architectural decision before it belongs here.
…ODO Remove WebhookExecutionHistory embeds a rolling execution log inside WebhookConfig and is only implemented in orkes-conductor/webhooks-enterprise. Both the Orkes and OSS models carried a '// TODO Remove this' comment. The recordHistory() call also held the only webhookDAO.createWebhook() write-back after dispatch, but urlVerified is already persisted by IncomingWebhookService before the event is enqueued, so that write was redundant. Drops lastRunWorkflowIdSize property and its associated test.
…getter Method was @JsonIgnore and only extracted keys from receiverWorkflowNamesToVersions, which is already accessible directly. No call site exists anywhere in OSS.
NonTransientException previously fell through to the default 500, which
told webhook senders (Stripe, GitHub, Slack) that the server was broken
and to retry. Signature verification failures are client errors — the
sender's signature is wrong — and must return 4xx so senders stop
retrying.
NonTransientException by semantics ("this won't work no matter how many
times you retry") maps cleanly to 400 Bad Request.
Verified: bad_signature and replay cases in negative_smoke.py now return
400 instead of 500, 5/5 negative cases passing on enki.
There was a problem hiding this comment.
I can't provide meaningful review of the code quality. Looks great to me on a quick scan over 11k lines. Claude thinks it's pretty good too, noting a few risky things but mentioning that they're documented.
I ran through an example scenario with the emitter as well as some manual testing with it and a manually defined wf, stuff worked.
Thanks, I appreciate you just trying the stuff out, for sure! |
…growth Ports orkes-io/orkes-conductor#3663. Webhook.start() does a put() into the WAIT_FOR_WEBHOOK hash set; the only removal path was WebhookWorker.handleEvent firing SREM on a matching event. Workflows terminated before a match arrived left their taskId in the set indefinitely — Redis memory leak / Postgres row accumulation in webhook_hash_to_taskid. Fix: add Webhook.cancel() that calls webhookTaskService.remove(TaskModel, int) on all six backings (postgres, mysql, redis, cassandra, sqlite, in-memory). WebhookTaskHashing.computeHashIfPresent() handles tasks cancelled before IN_PROGRESS (null/missing matches) as a safe no-op instead of throwing. Cleanup failures are caught and logged so the terminate path is never blocked.
Summary
Ships
WAIT_FOR_WEBHOOKend-to-end on OSS conductor: a faithful port ofwebhooks-oss/from orkes-io/orkes-conductor#3612, plus the in-memory + 5 persistent backings needed to actually run it in any OSS deployment shape, plus retention ofincoming_webhook_eventrows. Closes #888.User impact: First-time users can configure
WAIT_FOR_WEBHOOKworkflows, receive verified inbound events (HMAC/Slack/Stripe/Twitter/SendGrid/SignatureBased/HeaderBased), and have them dispatch into running workflows — on any supported OSS persistence shape (in-memory single-node, postgres, mysql, sqlite, redis, cassandra), not just dev mode.Review-round changes
Six commits added after initial filing, addressing Viren's review:
893a990Tag.javafromcommonTagsService/AutomaticTagServicedon't exist in OSS; the class had no callers other thanWebhookConfig.tagsb986d65List<Tag> tagsfromWebhookConfig762e4c2EventMessage.javafromcommonIncomingWebhookServicehas noaddEventMessagecalls and OSSExecutionDAOhas no such method. Decision (per Viren): DLQ doesn't fit Conductor's design;timeoutSecondsonWAIT_FOR_WEBHOOKtasks is the correct negative-path mechanism92e122dWebhookExecutionHistory+recordHistory()+lastRunWorkflowIdSize// TODO Remove thison this field. Enterprise-only execution log embedded in config (anti-pattern); only implemented inwebhooks-enterprise, notwebhooks-oss237701eWebhookConfig.getWorkflowNames()getReceiverWorkflowNamesToVersions().keySet()with@JsonIgnorea05447cNonTransientException→400inApplicationExceptionMapperNonTransientExceptionsemantically means "this won't work no matter how many retries" — correct HTTP mapping is 400 Bad Request5cdd389Webhook.cancel()+remove(TaskModel, int)on all 6 backingsWebhook.start()puts a taskId into the correlation hash set but there was no symmetric removal when a workflow was terminated before a matching event arrived — set grew without bound.cancel()callswebhookTaskService.remove(TaskModel, int)(new overload on all 6 backings);WebhookTaskHashing.computeHashIfPresent()handles tasks cancelled before IN_PROGRESS as a safe no-op.Orkes parity
Mirrors the layout from orkes-io/orkes-conductor#3612, which performed the
webhooks/→webhooks-oss/+webhooks-enterprise/split per the precedent set by scheduler in orkes-io/orkes-conductor#3529 and #1064.Source files mirrored:
orkes-conductor/webhooks-oss/src/main/java/io/orkes/conductor/webhook/**orkes-conductor/common/andorkes-conductor/core/orkes-conductor/{postgres,mysql,redis,cassandra}-persistence/All
io.orkes.conductor.*packages translated toorg.conductoross.conductor.*per repo convention.OSS-side additions (no Orkes parallel)
Required because Orkes' equivalents are tenant-aware or rely on infrastructure that doesn't exist in OSS:
InMemoryWebhookDAO+InMemoryWebhookTaskService—@ConditionalOnMissingBeandefaults for the single-node case. No Orkes parallel because Orkes always runs against persistent storage.WebhookConfigService+WebhookConfigResource— tenant-free CRUD atPOST/PUT/GET/DELETE /api/metadata/webhook. OSS equivalent of the enterprise resource that usesAuditUtils+OrkesAuthentication.IncomingWebhookResource— tenant-free port atPOST/GET /api/webhook/{id}. Drops orgId switching.WebhookWorker— OSS port of the enterprise worker. Plain@Componentwith@PostConstruct/@PreDestroy, noLifecycleAwareComponent, noMetricsCollector/Monitors, noOrkesRequestContext, noExtendedEventExecutionaudit bookkeeping. Failures log instead of persist.WebhookTaskHashing(incore) — hash computation shared across all 6WebhookTaskServiceimpls. Orkes carries the body inline in every impl; OSS lifts it once. Behavioral parity across backings is enforced by call, not by 5 hand-maintained copies. Also fixes a latent NPE (matches validated before refname iteration removal).WebhookMatcherComputer(incore) — same pattern for theWorkflowDef → matcherstransformation. All 6 DAO impls call it; eachgetMatchers()is a one-liner. Net –166 LoC vs. each impl carrying its own copy.WebhookVerifier.dedupKey(...)(default no-op) +WebhookDAO.tryRecordSignature(webhookId, signature, ttl)(default no-op).IncomingWebhookServicerejects signature replays inside a 5-minute window. Backings that don't implement dedup retain prior behavior. No Orkes parallel; Orkes has no replay protection.Backings landed
InMemoryWebhookDAO+InMemoryWebhookTaskServiceConcurrentHashMap(default; single-node)PostgresWebhookDAO+PostgresWebhookTaskServiceV16__webhook.sql(4 tables)PostgresWebhookCleanupJob—WITH ... DELETE ... RETURNINGMySQLWebhookDAO+MySQLWebhookTaskServiceV10__webhook.sql(4 tables, InnoDB/utf8mb4)MySQLWebhookCleanupJob—DELETE ... LIMIT NSqliteWebhookDAO+SqliteWebhookTaskServiceV4__webhook.sql(4 tables)SqliteWebhookCleanupJob—DELETE WHERE rowid IN (...)RedisWebhookDAO+RedisWebhookTaskServiceJedisProxyRedisWebhookCleanupJob—HGETALL+ parse +HDELwalkerCassandraWebhookDAO+CassandraWebhookTaskServiceensureTables()on constructiondefault_time_to_live(no separate job)Shared design across backings
matchescriteria are recomputed fromMetadataDAOon everygetMatchers()call. WorkflowDef updates take effect without re-registering the webhook. Regression covered per backing bygetMatchers_recomputesFromMetadataDAO_onEachCall_noStaleCache.WebhookDAObeans register aswebhookDAOand allWebhookTaskServicebeans aswebhookTaskService. The in-memory impls use@ConditionalOnMissingBean(name="webhookDAO"/"webhookTaskService")— whichever persistence module is on the classpath wins; with none, in-memory provides the default.conductor.webhooks.cleanup.enabled(defaulttrue)conductor.webhooks.cleanup.cron(default hourly"0 0 * * * *")conductor.webhooks.cleanup.retention-duration(defaultPT168H/ 7 days)conductor.webhooks.cleanup.batch-size(default1000) — SQL backings onlyconductor.webhooks.cleanup.max-runtime(defaultPT60S) — SQL backings onlyconductor.webhook.worker.{threadCount,pollingInterval,pollBatchSize}viaWebhookWorkerProperties.Design departures from Orkes' source impls
org_idanywhere — tables, queries, code paths. Orkes' schema bleedsorg_idinto 4 tables and dozens of WHERE clauses; OSS is tenant-free.PostgresWebhookDAOpersists pre-computed matchers and runs a scheduledrefreshMatchersexecutor to handle WorkflowDef updates. OSS drops the cache, recomputes on every read.ConductorLimitsDAOquota enforcement — Orkes enterprise-only.NotFoundExceptioninstead ofPreconditions.checkNotNullfor missing-id errors.webhook_matchers(Orkes) →webhook_target_workflows(OSS), since OSS persists target workflows, not pre-computed matchers.WebhookCleanupJobimplements an Orkes-internalDataCleanerinterface. OSS uses Spring@Scheduleddirectly with the property surface above.Code rewrites beyond mechanical namespace translation
Port-side (webhooks-oss)
IncomingWebhookService,TimeBasedUUIDGeneratorApplicationException→NonTransientExceptionTimeBasedUUIDGeneratorgenerate(long),getOrgId(String),OrkesRequestContextlookupsIncomingWebhookService.handleWebhookNotFoundExceptionon unknown webhook id instead of returningnull. DroppedExecutionDAOFacadedep.WebhookConfigService.updateWebhookNotFoundExceptionif existing config not foundWebhookConfigService.getWebhooks+WebhookConfigResource.getWebhooktoBuilder().secretValue(SECRET).build()instead of in-placesetSecretValue("***")IncomingWebhookResource,WebhookConfigResource/webhook→/api/webhook,/metadata/webhook→/api/metadata/webhook. All@PathVariableuse explicit("id").-parameters. Found via smoke.WebhookTaskMapperorg.conductoross.conductor.webhook.tasks.mapperWebhookConfigResource,WebhookConfigService@PreAuthorize/@PostFilter,OrkesAuthentication.getAuthenticatedUser. Tags (Tag.java,WebhookConfig.tags) fully removed — enterprise RBAC with no OSS backing.webhooks-oss/build.gradlespring-boot-starter-web+spring-boot-autoconfigureupgradedcompileOnly→implementation. Stripe/SendGrid pinned viarevStripe/revSendgrid.StripeVerifier.verifyevent.getApiVersion()before.split(...)api_version. Orkes has identical code — needs mirror fix to keep parity.SlackVerifierX-Slack-Signature: v0=<hmac_sha256(timestamp:body)>+X-Slack-Request-Timestampwith 5-minute replay window and constant-time compare.ErrorList.empty()onceurlVerified=true— effectively unauthenticated post-handshake. OSS does real signing-secret verification. Orkes is genuinely vulnerable here; mirror-fix candidate.ApplicationExceptionMapperNonTransientException→400 Bad RequestPersistence-side
core/WebhookTaskHashing.javavalidates matches before computing the hash. Fixes a latent NPE whereremoveIterationFromTaskRefName(null)would mask the missing-matchesNonTransientException. Regression:InMemoryWebhookTaskServiceTest.put_missingMatchesField_throws.core/WebhookMatcherComputer.javacentralizes theWorkflowDef → matchersloop. All 6 DAO impls + InMemory call it. Behavior unchanged from inline; eliminates 5 hand-maintained copies.PostgresWebhookDAOTest+ siblings setspring.flyway.ignore-migration-patterns=*:missingto tolerate an orphanV10.1in CI's shared testcontainers postgres (another test class applies V10.1 viaexperimental-queue-notify).CassandraWebhookDAOkeeps a localobjectMapper+toJson/readValuebecauseCassandraBaseDAO's are package-private. Same workaroundCassandraSchedulerDAOuses.Tests
146 new test methods across 24 test classes + 1 property-style class (200 random reps per run), 0 failures.
Port (webhooks-oss): 78 tests
TargetWorkflowCollector, idempotency substitutionWebhookConfigService(9) — CRUD, conflict, secret sanitization (mutation regression), NotFound on updateWebhookConfigResource(8) — validation, 404s, secret sanitizationInMemoryWebhookDAO(9) — CRUD + matcher computation against mocked MetadataDAOInMemoryWebhookTaskService(6) — hash semantics, bucketing, missing-matches throwsIncomingWebhookResource(2) — delegationWebhooksystem task (1) —put()registrationWebhookWorker(9) — handleMessage paths, pollAndExecuteStripeVerifier(3) — null api_version accepted, bad signature rejected, missing header rejectedWebhooksOssEndToEndTest(4) — full bean graph: register → receive → enqueue → worker dispatch → workflow start /WAIT_FOR_WEBHOOKcompletionWebhookHashContractPropertyTest(2 × 100 reps) — property-style: hash-site agreement acrossInMemoryWebhookTaskServiceandWebhookHashingService.computeJsonHashPersistence: 68 tests (most testcontainer-based)
PostgresWebhookDAOTestPostgresWebhookTaskServiceTestPostgresWebhookCleanupJobTestMySQLWebhookDAOTestMySQLWebhookTaskServiceTestMySQLWebhookCleanupJobTestSqliteWebhookDAOTestSqliteWebhookTaskServiceTestSqliteWebhookCleanupJobTestRedisWebhookDAOTestRedisWebhookTaskServiceTestRedisWebhookCleanupJobTestCassandraWebhookDAOTestCassandraWebhookTaskServiceTestSmoke validation
Single-node (against
:conductor-server-lite:bootRun)The reproducible harness at nthmost-orkes/webhook-emitter/conductor-smoke registers a workflow, registers a webhook, starts the workflow, fires a signed event, polls until COMPLETED. Result:
Per-backend matrix (docker-compose + smoke)
webhook-emitter/conductor-smoke/matrix.shbrings up each backing's docker-compose stack against this branch'sconductor:serverimage, then runs the 6-verifier smoke against it.:conductor-server-lite:bootRunresolvesconductor.db.type=sqlitefromapplication.properties, so the local bootRun smoke is the sqlite functional pass. Also covered by:conductor-sqlite-persistence:test(16/16 green locally).flywaybean wiring issue at Spring startup — fails before webhook code loads; unrelated to this PR. Testcontainer tests will validate viaMySQLWebhookDAOTestetc.com.codahale.metrics.JmxReportertransitive dep in server classpath; fails on existingcassandraMetadataDAObean, not webhook code. Testcontainer tests will validate viaCassandraWebhookDAOTestetc.The mysql/cassandra docker failures are upstream of any
WebhookDAOcode path — both fail at Spring bootstrap of pre-existing beans (mySqlMetadataDAO/cassandraMetadataDAO). Worth filing separately againstconductor-oss/conductor's docker image setup, but they don't gate this PR.Full battery (conductor-server-lite SQLite, loki)
8/8 harnesses passed against this branch's JAR on a remote host (loki, Ubuntu 24.04, Java 21, SQLite persistence):
smoke.py— verifier matrixnegative_smoke.py— failure modesgap_smoke.py— coverage gapscomposite_smoke.py— all system tasksmulti_verifier_smoke.py— 3 configs, 1 wfbucket_smoke.py— N=20 fan-outreplay_smoke.py— observationload_smoke.py— 100 wf, concurrency=10Verified
./gradlew :conductor-webhooks-oss:test— 78/78 unit + 200/200 property reps green./gradlew :conductor-sqlite-persistence:test --tests '*Webhook*'— 16/16 green./gradlew :conductor-{core,webhooks-oss,postgres,mysql,sqlite,redis,cassandra}-persistence:compileJava— clean./gradlew :conductor-server:compileJava— clean (full server tree)./gradlew :conductor-*:spotlessCheck(all touched modules) — clean:conductor-server-lite— full webhook → workflow chain succeededWhat's NOT here (and why)
EventMessage/EventMessageServiceas an enterprise-only observability layer (Redis-backed, org-scoped). It doesn't belong in OSS, and the decision from review is thattimeoutSecondsonWAIT_FOR_WEBHOOKtasks is the correct negative-path mechanism — consistent with Conductor's design philosophy.EventMessage.javawas removed.Known follow-ups (out of scope, file as separate PRs)
mysql-persistencedocker image:flywaybean wiring at startup (existing issue, surfaces under matrix harness)cassandra-persistencedocker image: missingcom.codahale.metrics.dropwizard-metrics-jmxtransitive dep/api/metadata/webhookand/api/webhook/{id}Test plan
/api/webhook/{id}→ verifier runs →WebhookTaskMapperdispatches → workflow completesconductor.webhook.worker.*properties bind viaWebhookWorkerPropertiesconductor.webhooks.cleanup.*properties bind via Spring environmentStats
91 files changed, ~9,600 LoC added net, 23 commits. Originally split as #1106 (port, ~5.5K LoC, 54 files) + #1110 (persistence, ~4.3K LoC, 39 files); folded per leadership alignment. 6 additional cleanup commits from review round.