Skip to content

Commit 48ba5d0

Browse files
spongkibanamachine
andauthored
[Security Solution] Register Workflows with Inbox (elastic#266256)
## Summary Introduces **Inbox ↔ Workflows** as a first-class integration so that any `waitForInput` step in a Workflow surfaces as a schema-driven, schema-validated action in the shared **Inbox** plugin, with a responsive flyout for human resolution. Lays the platform rails (registry, common contracts, privilege model, space scoping, demo fixtures) so future providers can plug in with zero additional Inbox work. Plugin introduced in elastic#265634, and with this changeset data now comes from real workflows! Plugin is disabled by default, so enable via: ``` xpack.inbox.enabled: true ``` And populate sample workflows exercising each schema variation with: ``` KIBANA_URL=http://localhost:5601 \ KIBANA_USERNAME=elastic \ KIBANA_PASSWORD=changeme \ KIBANA_SPACE_ID=default \ node --import tsx x-pack/platform/plugins/shared/inbox/scripts/demo/seed_inbox_demo.ts ``` (or using [this branch](https://github.com/spong/example-mcp-app-security/tree/inbox) of the `example-mcp-app-security`. All setup details in [`/inbox/scripts/demo/README.md`](https://github.com/spong/kibana/blob/12e3c1b50762f4f7979327007ac652ee89378bc8/x-pack/platform/plugins/shared/inbox/scripts/demo/README.md)) <p align="center"> <img width="700" src="https://github.com/user-attachments/assets/14c76d17-4319-46d1-8b39-298418398f58" /> </p> Complete with support for dynamic response actions: <p align="center"> <img width="700" src="https://github.com/user-attachments/assets/67d9c867-dee6-454c-8a16-3dfcedc3285d" /> </p> ## What's in this PR ### Inbox plugin — provider platform - **`InboxActionProvider` contract** (`x-pack/platform/plugins/shared/inbox/server/services/inbox_action_provider.ts`) — the registration shape plugins implement to contribute items to the Inbox. - **`InboxActionRegistry`** — fan-out / merge-sort / pagination across providers, with a clamped `total` so a provider truncating its response can never desync the UI's pager. Logs a warning on truncation. - **Routes** - `GET /internal/inbox/actions` — list + filter (status, source_app, pagination) - `POST /internal/inbox/actions/{source_app}/{source_id}/respond` — schema-validated respond path - **Security** - Split API privileges `inbox_read` / `inbox_respond` so the `read` feature role cannot invoke the respond route (previously collapsed under a single `api: [PLUGIN_ID]`). - Dynamic space resolution via the `spaces` plugin — no more hardcoded `'default'` leaking rows across spaces. Falls back to `'default'` only when `spaces` is absent (single-space installs). - **Public app hardening** - Lazy detail-renderer loader catches chunk-load / module-init failures and falls back to the default form instead of crashing the tree. - `TimeoutChip` guards against malformed `timeout_at` so it can't render `NaNh NaNm`. - `SchemaForm` accessibility pass: label fallbacks to field name, visual `*` required marker, `hasChildLabel={false}` for `EuiSwitch` rows, localized select placeholder. - **React-Query tuning** (`public/application.tsx`): `refetchOnWindowFocus: 'always'` + `refetchOnMount: 'always'` with a 30s `staleTime` dedupe window so the list is always current when the tab/app regains focus, plus locked-in `invalidateQueries` on the respond mutation so responded items drop off the list without a manual refresh. ### Workflows-management — Inbox provider - New `WorkflowsInboxProvider` (`src/platform/plugins/shared/workflows_management/server/inbox/`) converts paused `waitForInput` steps into `InboxAction`s (`to_inbox_action.ts`) and registers with the Inbox plugin at setup. - New `WorkflowExecutionQueryService#listWaitingForInputSteps` to fan-out across the executions index for paused steps (filtered by `spaceId` + `status: waiting_for_input`); `WorkflowsManagementService` exposes it via the same-name delegating method consistent with the rest of the facade. Wire-level regression tests live in `workflow_execution_query_service.test.ts` (term-only query / pagination math / `index_not_found_exception` swallowed / log+rethrow), with a delegation test in the facade suite. ### Common contracts (`@kbn/inbox-common`) - OpenAPI / Zod schemas for the list and respond routes, including `status`, `source_app`, `input_schema`, `submission_channel`, and `timeout_at`. - Helper `buildRespondToActionUrl(sourceApp, sourceId)` with proper URL-encoding for composite source IDs. - Privilege constants `INBOX_API_PRIVILEGE_READ` / `INBOX_API_PRIVILEGE_RESPOND`. ### Demo fixtures - `x-pack/platform/plugins/shared/inbox/scripts/demo/`: - Six minimal workflow YAMLs covering the field-type matrix (string / number / boolean / single-enum / array-of-enum / required+defaults). - `seed_inbox_demo.ts` — dependency-light `fetch` script that imports + triggers each workflow against a running Kibana so the Inbox populates immediately. - `README.md` with run instructions and the matching MCP-app `generate-inbox-data` flow. ### FTR wiring - New stateful FTR config `x-pack/platform/test/inbox_api_integration/config.ts` opt-in via `--xpack.inbox.enabled=true`. - `inbox_flow.ts` — contract-level coverage (empty-registry list, query validation, 404 on unknown source, 400 on missing `input`). Live-workflow lifecycle assertions are stubbed as `describe.skip` with TODOs pointing at the Workflows execution-engine harness work. - Registered in `.buildkite/ftr_platform_stateful_configs.yml` and `.github/CODEOWNERS`. ### Checklist Check the PR satisfies following conditions. Reviewers should verify this PR satisfies this list as well. - [X] Any text added follows [EUI's writing guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses sentence case text and includes [i18n support](https://github.com/elastic/kibana/blob/main/src/platform/packages/shared/kbn-i18n/README.md) - [X] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios - [X] The PR description includes the appropriate Release Notes section, and the correct `release_note:*` label is applied per the [guidelines](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process) - [X] Review the [backport guidelines](https://docs.google.com/document/d/1VyN5k91e5OVumlc0Gb9RPa3h1ewuPE705nRtioPiTvY/edit?usp=sharing) and apply applicable `backport:*` labels. --- _PR developed with Cursor + Claude Opus 4.7 Super Duper xHigh Thinking++_ --------- Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
1 parent cf6203b commit 48ba5d0

76 files changed

Lines changed: 4326 additions & 203 deletions

File tree

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

.buildkite/ftr_platform_stateful_configs.yml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -483,6 +483,7 @@ enabled:
483483
- x-pack/platform/test/api_integration_basic/apis/security/config.ts
484484
- x-pack/platform/test/automatic_import_api_integration/configs/config.stateful.ts
485485
- x-pack/platform/test/encrypted_saved_objects_api_integration/config.ts
486+
- x-pack/platform/test/inbox_api_integration/config.ts
486487
- x-pack/platform/test/fleet_multi_cluster/config.ts
487488
- x-pack/platform/test/monitoring_api_integration/config.ts
488489
- x-pack/platform/test/plugin_api_integration/config.ts

.github/CODEOWNERS

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1032,7 +1032,7 @@ x-pack/platform/packages/shared/kbn-event-stacktrace @elastic/obs-presentation-t
10321032
x-pack/platform/packages/shared/kbn-failure-store-modal @elastic/kibana-management
10331033
x-pack/platform/packages/shared/kbn-fs @elastic/kibana-security
10341034
x-pack/platform/packages/shared/kbn-grok-heuristics @elastic/obs-onboarding-team
1035-
x-pack/platform/packages/shared/kbn-inbox-common @elastic/security-pds-deployment @elastic/security-generative-ai
1035+
x-pack/platform/packages/shared/kbn-inbox-common @elastic/security-generative-ai
10361036
x-pack/platform/packages/shared/kbn-inference-cli @elastic/appex-ai-infra @elastic/search-inference-team
10371037
x-pack/platform/packages/shared/kbn-inference-connectors @elastic/search-kibana
10381038
x-pack/platform/packages/shared/kbn-inference-endpoint-ui-common @elastic/search-kibana
@@ -1165,7 +1165,7 @@ x-pack/platform/plugins/shared/fields_metadata @elastic/obs-onboarding-team
11651165
x-pack/platform/plugins/shared/fleet @elastic/fleet
11661166
x-pack/platform/plugins/shared/fleet/cypress @elastic/fleet
11671167
x-pack/platform/plugins/shared/global_search @elastic/appex-sharedux
1168-
x-pack/platform/plugins/shared/inbox @elastic/security-pds-deployment @elastic/security-generative-ai
1168+
x-pack/platform/plugins/shared/inbox @elastic/security-generative-ai
11691169
x-pack/platform/plugins/shared/index_management @elastic/kibana-management
11701170
x-pack/platform/plugins/shared/inference @elastic/search-kibana
11711171
x-pack/platform/plugins/shared/inference_endpoint @elastic/search-kibana
@@ -3435,6 +3435,9 @@ x-pack/solutions/observability/plugins/synthetics/server/saved_objects/synthetic
34353435
/.buildkite/pipelines/evals/ @elastic/obs-ai-team @elastic/security-generative-ai
34363436
/.buildkite/scripts/steps/evals/ @elastic/obs-ai-team @elastic/security-generative-ai
34373437

3438+
# Inbox FTR test directory (no kibana.jsonc — manual entry)
3439+
/x-pack/platform/test/inbox_api_integration/ @elastic/security-generative-ai
3440+
34383441
####
34393442
## These rules are always last so they take ultimate priority over everything else
34403443
####

src/platform/plugins/shared/workflows_execution_engine/integration_tests/tests/wait_for_input.test.ts

Lines changed: 69 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -237,4 +237,73 @@ steps:
237237
expect(step?.output).toEqual({ ticket: 'T-42', priority: 'high' });
238238
});
239239
});
240+
241+
describe('resume after workflow-level timeout has already fired', () => {
242+
// Regression: an analyst responds to an Inbox entry whose parent workflow
243+
// has sat paused long enough to exceed `settings.timeout`. On the resume
244+
// iteration the workflow-level monitor fires concurrently with the
245+
// waitForInput step, aborts the step runtime, and calls
246+
// `failStep(TimeoutError)`. Before the fix, `WaitForInputStepImpl.run()`
247+
// then re-entered `tryEnterWaitUntil` (which saw the in-memory status as
248+
// FAILED, not WAITING_FOR_INPUT), and the subsequent upsert overwrote
249+
// `status` back to `waiting_for_input` — leaving `error`/`finishedAt`
250+
// in place and producing a zombie that the Inbox listing keeps
251+
// resurfacing forever.
252+
const timeoutYaml = `
253+
settings:
254+
timeout: 200ms
255+
256+
steps:
257+
- name: ask
258+
type: waitForInput
259+
with:
260+
message: "Approve?"
261+
- name: log
262+
type: console
263+
with:
264+
message: "done"
265+
`;
266+
267+
it('must not reset a timed-out step back to WAITING_FOR_INPUT on resume', async () => {
268+
await fixture.runWorkflow({ workflowYaml: timeoutYaml });
269+
270+
const pausedExec = fixture.workflowExecutionRepositoryMock.workflowExecutions.get(
271+
'fake_workflow_execution_id'
272+
);
273+
expect(pausedExec?.status).toBe(ExecutionStatus.WAITING_FOR_INPUT);
274+
275+
// Push us comfortably past the 200ms workflow-level timeout so the
276+
// monitor's first pass during the resume iteration fires
277+
// synchronously, racing with the step's run().
278+
await new Promise((resolve) => setTimeout(resolve, 350));
279+
280+
const exec = fixture.workflowExecutionRepositoryMock.workflowExecutions.get(
281+
'fake_workflow_execution_id'
282+
)!;
283+
exec.context = { ...exec.context, resumeInput: { approved: true } };
284+
fixture.workflowExecutionRepositoryMock.workflowExecutions.set(exec.id, exec);
285+
286+
await fixture.resumeWorkflow();
287+
288+
const askStep = Array.from(fixture.stepExecutionRepositoryMock.stepExecutions.values()).find(
289+
(s) => s.stepId === 'ask'
290+
);
291+
292+
// The zombie shape we must never persist again: status=waiting_for_input
293+
// alongside finishedAt+error. Either the step is FAILED/TIMED_OUT (the
294+
// timeout won the race and the fix prevented re-entry), or the step is
295+
// COMPLETED (the step ran first and the resume succeeded before the
296+
// monitor noticed the deadline). Both are acceptable; the zombie is not.
297+
expect(askStep).toBeDefined();
298+
expect([
299+
ExecutionStatus.FAILED,
300+
ExecutionStatus.TIMED_OUT,
301+
ExecutionStatus.COMPLETED,
302+
]).toContain(askStep!.status);
303+
if (askStep!.status !== ExecutionStatus.COMPLETED) {
304+
expect(askStep!.error?.message).toContain('Failed due to workflow timeout');
305+
expect(askStep!.finishedAt).toBeDefined();
306+
}
307+
});
308+
});
240309
});

src/platform/plugins/shared/workflows_execution_engine/server/plugin.ts

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1235,6 +1235,20 @@ export class WorkflowsExecutionEnginePlugin
12351235
);
12361236
}
12371237

1238+
// Freshness guard: a waitForInput step whose parent workflow already
1239+
// terminated (timeout, cancel, external failure) can leave the execution
1240+
// doc with `status: waiting_for_input` but `finishedAt` set — writing
1241+
// resumeInput and scheduling a resume task in that state is a no-op that
1242+
// silently swallows the analyst's response. Reject explicitly so the
1243+
// caller sees a 409 and the Inbox provider can surface a real error.
1244+
if (workflowExecution.finishedAt) {
1245+
throw new WorkflowExecutionInvalidStatusError(
1246+
executionId,
1247+
`${workflowExecution.status} (already finished at ${workflowExecution.finishedAt})`,
1248+
ExecutionStatus.WAITING_FOR_INPUT
1249+
);
1250+
}
1251+
12381252
await workflowExecutionRepository.updateWorkflowExecution({
12391253
id: executionId,
12401254
context: { ...workflowExecution.context, resumeInput: input },

src/platform/plugins/shared/workflows_execution_engine/server/step/wait_for_input_step/wait_for_input_step.test.ts

Lines changed: 35 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -43,6 +43,7 @@ describe('WaitForInputStepImpl', () => {
4343
setInput: jest.fn(),
4444
updateWorkflowExecution: jest.fn(),
4545
stepExecutionId: 'test-step-exec-id',
46+
abortController: new AbortController(),
4647
contextManager: {
4748
renderValueAccordingToContext: jest.fn(<T>(v: T): T => v),
4849
},
@@ -222,6 +223,40 @@ describe('WaitForInputStepImpl', () => {
222223
});
223224
});
224225

226+
describe('aborted runtime — race with workflow-level timeout', () => {
227+
// Regression: when the workflow-level timeout monitor fires in parallel
228+
// with a resume iteration, it aborts the step runtime and calls
229+
// `failStep(timeoutError)`. Without this guard the waitForInput step
230+
// proceeded to re-enter its wait state, overwriting `status: FAILED` back
231+
// to `status: WAITING_FOR_INPUT` (error/finishedAt survived because
232+
// `updateStep` spreads). The zombie step then permanently reappeared in
233+
// the Inbox because `listWaitingForInputSteps` filters only on status.
234+
beforeEach(() => {
235+
mockStepExecutionRuntime.abortController.abort();
236+
});
237+
238+
it('should not call tryEnterWaitUntil when the runtime is already aborted', async () => {
239+
await underTest.run();
240+
expect(mockStepExecutionRuntime.tryEnterWaitUntil).not.toHaveBeenCalled();
241+
});
242+
243+
it('should not mutate step state when the runtime is already aborted', async () => {
244+
await underTest.run();
245+
expect(mockStepExecutionRuntime.setInput).not.toHaveBeenCalled();
246+
expect(mockStepExecutionRuntime.finishStep).not.toHaveBeenCalled();
247+
expect(mockStepExecutionRuntime.updateWorkflowExecution).not.toHaveBeenCalled();
248+
expect(mockWorkflowRuntime.navigateToNextNode).not.toHaveBeenCalled();
249+
});
250+
251+
it('should emit an observable hitl:aborted debug event', async () => {
252+
await underTest.run();
253+
expect(workflowLogger.logDebug).toHaveBeenCalledWith(
254+
expect.stringContaining('run aborted before wait-entry'),
255+
expect.objectContaining({ event: { action: 'hitl:aborted' } })
256+
);
257+
});
258+
});
259+
225260
describe('resume run — exiting wait state with null context', () => {
226261
beforeEach(() => {
227262
mockStepExecutionRuntime.tryEnterWaitUntil.mockReturnValue(false);

src/platform/plugins/shared/workflows_execution_engine/server/step/wait_for_input_step/wait_for_input_step.ts

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -23,6 +23,22 @@ export class WaitForInputStepImpl implements NodeImplementation {
2323
) {}
2424

2525
async run(): Promise<void> {
26+
// The step runtime's abort signal is how monitors (workflow-level timeout,
27+
// cancellation) tell a step "you have already been settled — do not touch
28+
// state". Without this guard a waitForInput that is resumed after the
29+
// workflow has timed out would enter `tryEnterWaitUntil` with an in-memory
30+
// status of FAILED (set by the monitor's failStep call), treat itself as
31+
// "not already waiting", and re-write status back to WAITING_FOR_INPUT —
32+
// leaving a zombie step that `listWaitingForInputSteps` keeps surfacing in
33+
// the Inbox forever.
34+
if (this.stepExecutionRuntime.abortController.signal.aborted) {
35+
this.workflowLogger.logDebug(
36+
`Step '${this.node.stepId}' run aborted before wait-entry; skipping`,
37+
{ event: { action: 'hitl:aborted' } }
38+
);
39+
return;
40+
}
41+
2642
if (this.stepExecutionRuntime.tryEnterWaitUntil(undefined, ExecutionStatus.WAITING_FOR_INPUT)) {
2743
// Store step config as input so the record is self contained
2844
// consistent with every other step type & readable without a definition lookup

src/platform/plugins/shared/workflows_management/kibana.jsonc

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -32,8 +32,8 @@
3232
"embeddable",
3333
"licensing"
3434
],
35-
"optionalPlugins": ["alerting", "serverless", "cloud"],
36-
"runtimePluginDependencies": ["agentBuilder"],
35+
"optionalPlugins": ["alerting", "serverless", "cloud", "inbox"],
36+
"runtimePluginDependencies": ["agentBuilder", "agentContextLayer"],
3737
"extraPublicDirs": ["common"]
3838
}
3939
}

src/platform/plugins/shared/workflows_management/moon.yml

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -97,7 +97,8 @@ dependsOn:
9797
- '@kbn/cloud-plugin'
9898
- '@kbn/kbn-client'
9999
- '@kbn/rule-data-utils'
100-
- '@kbn/std'
100+
- '@kbn/inbox-plugin'
101+
- '@kbn/inbox-common'
101102
- '@kbn/core-saved-objects-common'
102103
tags:
103104
- plugin

src/platform/plugins/shared/workflows_management/server/api/workflows_management_api.ts

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -606,6 +606,17 @@ export class WorkflowsManagementApi {
606606
return workflowsExecutionEngine.resumeWorkflowExecution(executionId, spaceId, input, request);
607607
}
608608

609+
/**
610+
* Cross-workflow listing of step executions currently blocked on
611+
* `waitForInput`. Consumed by the Inbox plugin's workflows provider.
612+
*/
613+
public async listWaitingForInputSteps(
614+
spaceId: string,
615+
params: { page?: number; perPage?: number } = {}
616+
): Promise<{ results: EsWorkflowStepExecution[]; total: number }> {
617+
return this.workflowsService.listWaitingForInputSteps(spaceId, params);
618+
}
619+
609620
public async getWorkflowStats(spaceId: string, options?: { includeExecutionStats?: boolean }) {
610621
return this.workflowsService.getWorkflowStats(spaceId, options);
611622
}

src/platform/plugins/shared/workflows_management/server/api/workflows_management_service.test.ts

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -291,4 +291,22 @@ describe('WorkflowsService (facade)', () => {
291291
await expect(service.getWorkflow('wf-1', 'default')).rejects.toBe(boom);
292292
});
293293
});
294+
295+
describe('listWaitingForInputSteps', () => {
296+
it('delegates to WorkflowExecutionQueryService.listWaitingForInputSteps after init', async () => {
297+
// Behavioural coverage for this method lives next to the implementation
298+
// in `services/workflow_execution_query_service.test.ts`. This facade
299+
// test only asserts the delegation shape.
300+
const listSpy = jest
301+
.spyOn(WorkflowExecutionQueryService.prototype, 'listWaitingForInputSteps')
302+
.mockResolvedValue({ results: [], total: 0 } as never);
303+
try {
304+
const service = await buildService();
305+
await service.listWaitingForInputSteps('my-space', { page: 2, perPage: 25 });
306+
expect(listSpy).toHaveBeenCalledWith('my-space', { page: 2, perPage: 25 });
307+
} finally {
308+
listSpy.mockRestore();
309+
}
310+
});
311+
});
294312
});

0 commit comments

Comments
 (0)