
[Spike][Security Solution] Detection Emulation Skill (Epic #15974) — substrate + orchestration + production-opt-in readiness pack #269019

Draft
patrykkopycinski wants to merge 92 commits into elastic:main from patrykkopycinski:ao/detection-emulation-skill-4de85a

@patrykkopycinski
Contributor

@patrykkopycinski patrykkopycinski commented May 13, 2026

Summary

Closes security-team#15974.

Lands the Detection Emulation Skill end-to-end on a single branch. An Agent Builder skill takes a candidate detection rule + target host(s), runs an emulated attack (real EDR action OR ECS log injection), polls the Detection Engine for the alerts the rule produces, and returns a ValidationReport with a confidence score.

Every action is gated by:

  • two off-by-default experimental flags (detectionEmulationRealExecution, detectionEmulationLogInjection)
  • a runtime kill switch (xpack.securitySolution.detectionEmulation.realExecution.enabled) for fast disable without redeploy
  • a default-deny endpoint allowlist (operator must explicitly permit hosts)
  • per-command RBAC re-checked at every tool boundary
  • a Zod-enforced ≤5 endpoint fanout cap per call
  • per-space (100/h) and per-host (3/h) rate limits with atomic acquire-with-rollback
  • ≤1 in-flight real_execution scenario per Kibana space
  • mandatory HITL confirmation on every destructive command path
  • a regex allowlist for free-form execute payloads (allowedExecuteCommandPatterns)
  • full actor attribution (user vs agent-builder, with conversationId / runId / toolCallId / SHA-256 prompt hash) in the audit trail
  • idempotency cache on tool dispatch gates (prevents double-fire on retried LLM tool calls)
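
The "atomic acquire-with-rollback" behaviour of the per-host limit can be illustrated with a minimal sketch. This is not the PR's rate limiter: the class name `PerHostRateLimiter`, the sliding-window bookkeeping, and the `blockedEndpoints` result shape are assumptions made for illustration. Only the 3/host/hour default comes from the description above.

```typescript
// Hypothetical sketch: all names and the sliding-window strategy are assumptions.
interface AcquireResult {
  ok: boolean;
  blockedEndpoints: string[];
}

class PerHostRateLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit = 3, windowMs = 60 * 60 * 1000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Timestamps for `host` that still fall inside the window ending at `now`.
  private recent(host: string, now: number): number[] {
    return (this.hits.get(host) ?? []).filter((t) => now - t < this.windowMs);
  }

  // All-or-nothing: either every host gets a slot, or none does.
  acquire(hosts: string[], now: number = Date.now()): AcquireResult {
    const acquired: string[] = [];
    const blocked: string[] = [];
    for (const host of hosts) {
      const recent = this.recent(host, now);
      if (recent.length >= this.limit) {
        blocked.push(host);
        continue;
      }
      this.hits.set(host, [...recent, now]);
      acquired.push(host);
    }
    if (blocked.length > 0) {
      // Roll back the slots taken earlier in this same call.
      for (const host of acquired) this.release(host, now);
      return { ok: false, blockedEndpoints: blocked };
    }
    return { ok: true, blockedEndpoints: [] };
  }

  // Gives back one slot, e.g. when dispatch fails after a successful acquire.
  release(host: string, stamp: number): void {
    const list = this.hits.get(host) ?? [];
    const index = list.lastIndexOf(stamp);
    if (index >= 0) list.splice(index, 1);
    this.hits.set(host, list);
  }
}
```

The rollback matters for multi-host calls: without it, a call rejected because of one exhausted host would still consume budget on the other hosts.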

93 files / +17,020 against upstream/main. Server-side orchestration only; existing UI surfaces unchanged. Allowlist + rate-limiter knobs are exposed as Stack Advanced Settings.

Architecture

What's new

| Component | Path | Notes |
| --- | --- | --- |
| Multi-EDR runner + route | lib/detection_emulation/execution/runner.ts, api/dispatch/route.ts | Vendor-agnostic dispatcher (execute, runscript, kill-process, isolate, release). |
| Payload library | lib/detection_emulation/payloads/{payloads.json,index.ts} | 12 Wave-1 entries, hard-capped at 15. Data-driven. |
| Scenario generator | lib/detection_emulation/scenario_generator.ts | Pure function. Deterministic scenarioId = sha256(...). Typed errors no_mitre_tags, no_supported_techniques. |
| Log injection mode | lib/detection_emulation/log_injection/{generator,index_template,executor}.ts | Dedicated .kibana-security-emulation-logs-<spaceId>-* index template w/ 7-day ILM. Synthetic ECS docs stamped with event.dataset, event.module: 'emulation', and tagged with emulation: { mode, emulationId, scenarioId }. |
| Telemetry collector | lib/detection_emulation/telemetry_collector.ts | poll + one_shot modes. AbortController. Wall-budget enforcement. Queries alerts by kibana.alert.original_event.module: emulation OR emulation tag via bool.should. |
| Confidence scorer | lib/detection_emulation/confidence_scorer.ts | confidence = round(coverage * 0.6 + precision * 0.4, 2) clamped [0, 1]. Pure. |
| Validation gate | lib/detection_emulation/execution/validation_gate.ts | Pre-dispatch gate that combines RBAC, allowlist, fanout, regex allowlist for free-form execute, and rate-limit acquire. |
| Gate primitives | agent_builder/skills/detection_emulation/gate_checks.ts | Reusable withCommandGates decorator extracted from per-family tool boilerplate. Composes feature-flag → auth → RBAC → allowlist → per-host rate → per-space rate → fanout cap → validation gate → audit in a single pipeline. |
| Emulation history SO | lib/detection_emulation/emulation_history/{create,get,find,index}.ts + emulation_report_type.ts | Hidden SO, write-once via scenarioFingerprint dedup, namespace-scoped. Model version 2 adds actor field with data_backfill: { kind: 'user' } for old rows. |
| validateRule route | lib/detection_emulation/api/validate_rule/route.ts | POST /internal/detection_engine/emulation/validate_rule. Eight-step pipeline. Typed 4xx errors. All strings i18n-translated. |
| Skill tools (six) | agent_builder/skills/detection_emulation/{validate_rule_tool,get_emulation_history_tool,create_run_family_command_tool}.ts | Factory-based per-family tools (process / file / network / execution) built via createRunFamilyCommandTool. Each runs through withCommandGates. validateRule and getEmulationHistory are standalone tools. |
| HITL primitives | agent_builder/skills/detection_emulation/build_emulation_confirmation.ts + per-family tools | Each per-family tool declares confirmation: { askUser: 'once', getConfirmation }; validateRule does an on-demand HITL prompt when mode === 'real_execution'. Skipped only in executionMode === 'standalone' (eval / A2A). |
| EmulationToolError | agent_builder/skills/detection_emulation/emulation_tool_errors.ts | Typed error builder for structured error codes (feature_flag_disabled, endpoint_not_allowed, etc.) with consistent shape across all tools. |
| Skill content | agent_builder/skills/detection_emulation/detection_emulation_skill.ts | Six tools registered. referencedContent array with MITRE ATT&CK overview. Section order: When-to-Use → Process → Examples → Guardrails → Response Format. |
| Advanced Settings | common/experimental_features.ts, server/config.ts, runtime config resolver | Allowlist + rate-limiter knobs surface in Stack Management → Advanced Settings. Operators can override per-space without restart. |
| kbn-evals suite | packages/kbn-evals-suite-detection-emulation/evals/{validate_rule_dataset,validate_rule.spec}.ts | Dataset name security: detection-emulation-validate-rule. Eight examples (success / failure / log-injection / distractor / userDeclines HITL rejection / realExecBlocked allowlist 403). Evaluators: toolSelection, schemaCompliance, criteria, plus a trajectory evaluator that scores tool_sequence from execution traces. |
| Shared utilities | resolveCurrentUsername moved to shared lib, savedObjectsClient threaded through gates | Aligns with patterns from andrew-goldstein's Workflows stack. |

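The confidence scorer entry above states the formula as round(coverage * 0.6 + precision * 0.4, 2) clamped to [0, 1]. A minimal sketch of that arithmetic follows. The function name `scoreConfidence` is hypothetical, and how coverage and precision are derived from observed alerts is not shown in this description, so both are taken as plain numeric inputs.

```typescript
// Hypothetical sketch of the stated formula, not the contents of confidence_scorer.ts.
function scoreConfidence(coverage: number, precision: number): number {
  const weighted = coverage * 0.6 + precision * 0.4;
  // Round to two decimal places.
  const rounded = Math.round(weighted * 100) / 100;
  // Clamp to [0, 1] so out-of-range inputs cannot produce an out-of-range score.
  return Math.min(1, Math.max(0, rounded));
}
```

Because coverage carries the larger weight, a run that fires every expected alert but also produces unrelated ones still scores higher than a run with the opposite profile.
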
Pipeline (route execution order)

 1. Feature-flag gate     detectionEmulationRealExecution      (always required)
                          detectionEmulationLogInjection       (required if mode='log_injection')
 2. Runtime kill switch   xpack.securitySolution.detectionEmulation.realExecution.enabled
 3. Authentication        emulation actions are attributable
 4. RBAC                  per-command authz over the selected payload
 5. Schema gates          MAX_ENDPOINT_FANOUT (5) at Zod boundary
 6. Allowlist             default-deny; operator config drives allowedHosts
 7. Per-space rate        100 commands / space / hour
 8. Per-host rate         3 commands / host / hour (atomic acquire-with-rollback)
 9. Concurrency gate      ≤1 real_execution in flight per space
10. Validation gate       free-form `execute` matched against allowedExecuteCommandPatterns
11. HITL                  framework prompts user; skipped only in standalone executionMode
12. Scenario generator    rule MITRE tags → payload set → deterministic scenarioId
13. Per-payload dispatch  real_execution → execution/runner.ts (multi-EDR)
                          log_injection  → log_injection/executor.ts (synthetic ECS docs)
14. Audit                 actor.kind discriminator + via= suffix on response-action comment
15. Telemetry collector   poll Detection Engine alerts filtered by original_event.module + scenarioId
16. Confidence scorer     coverage * 0.6 + precision * 0.4
17. History write         persist detection-emulation-report SO (model version 2 with actor)
   ────────────────────   ValidationReport (HTTP 200) or typed error (4xx/5xx)
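
The ordering above amounts to a chain of gates where the first failure short-circuits with a typed error. The sketch below shows that composition for four of the gates only (feature flag, fanout cap, allowlist, execute-pattern allowlist). It is not `withCommandGates`: the `GateContext` shape, the `runGates` helper, and the result type are assumptions; the error codes and HTTP statuses are taken from the failure-mode list in this description.

```typescript
// Hypothetical sketch: GateContext, Gate and runGates are illustrative names.
type GateResult = { ok: true } | { ok: false; status: number; errorCode: string };

interface GateContext {
  flags: { detectionEmulationRealExecution: boolean };
  endpointIds: string[];
  allowedEndpointIds: ReadonlySet<string>;
  command: string; // free-form `execute` payload
  allowedExecuteCommandPatterns: RegExp[]; // assumed to be non-global regexes
}

type Gate = (ctx: GateContext) => GateResult;

const MAX_ENDPOINT_FANOUT = 5;
const pass: GateResult = { ok: true };
const deny = (status: number, errorCode: string): GateResult => ({ ok: false, status, errorCode });

// Gates run in array order, mirroring the numbered pipeline.
const gates: Gate[] = [
  (ctx) => (ctx.flags.detectionEmulationRealExecution ? pass : deny(403, 'feature_flag_disabled')),
  (ctx) =>
    ctx.endpointIds.length <= MAX_ENDPOINT_FANOUT ? pass : deny(422, 'endpoint_fanout_exceeded'),
  // Default-deny: an empty allowlist rejects every endpoint.
  (ctx) =>
    ctx.endpointIds.every((id) => ctx.allowedEndpointIds.has(id))
      ? pass
      : deny(403, 'endpoint_not_allowed'),
  (ctx) =>
    ctx.allowedExecuteCommandPatterns.some((re) => re.test(ctx.command))
      ? pass
      : deny(403, 'command_not_allowed'),
];

function runGates(ctx: GateContext, pipeline: Gate[] = gates): GateResult {
  for (const gate of pipeline) {
    const result = gate(ctx);
    if (!result.ok) return result; // first failure wins
  }
  return pass;
}
```

Placing the cheap, side-effect-free checks before the rate-limit acquire means a rejected request never consumes budget.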

Failure-mode coverage

| HTTP | errorCode | When |
| --- | --- | --- |
| 403 | feature_flag_disabled | Mode requires a flag that's off, or runtime kill switch is set. |
| 403 | endpoint_not_allowed | endpointId not on the operator allowlist. |
| 403 | command_not_allowed | Free-form execute payload didn't match any allowedExecuteCommandPatterns. |
| 403 | user_declined | HITL prompt rejected by the user. |
| 422 | endpoint_fanout_exceeded | More than MAX_ENDPOINT_FANOUT (5) endpoints in one call. |
| 422 | no_mitre_tags | Rule has no MITRE techniques. |
| 422 | no_supported_techniques | Rule MITRE tags don't intersect Wave-1. |
| 404 | rule_not_found | ruleId doesn't resolve. |
| 429 | rate_limit_exceeded | Per-space (100/h) or per-host (3/h) bucket exhausted. Response includes blockedEndpoints + Retry-After. |
| 429 | concurrency_exceeded | Another real_execution scenario is in flight in this space. Response includes inflight_scenario_fingerprint + Retry-After. |
| 200 + caveats | wall_budget_exceeded | Partial result; score over partial observations. |
| 500 | es_bulk_error | log_injection ES bulk write failure. |

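The failure modes above can be captured as a single lookup from error code to HTTP status, which keeps route and tool responses consistent. The sketch below is illustrative only: `EMULATION_ERROR_STATUS`, `toErrorResponse`, and the response shape are hypothetical names, not the PR's `EmulationToolError`. The `wall_budget_exceeded` case is omitted because it is returned as a 200 with caveats rather than as an error.

```typescript
// Hypothetical sketch: codes and statuses come from the list above, names are assumptions.
const EMULATION_ERROR_STATUS = {
  feature_flag_disabled: 403,
  endpoint_not_allowed: 403,
  command_not_allowed: 403,
  user_declined: 403,
  endpoint_fanout_exceeded: 422,
  no_mitre_tags: 422,
  no_supported_techniques: 422,
  rule_not_found: 404,
  rate_limit_exceeded: 429,
  concurrency_exceeded: 429,
  es_bulk_error: 500,
} as const;

type EmulationErrorCode = keyof typeof EMULATION_ERROR_STATUS;

interface ErrorResponse {
  statusCode: number;
  headers: Record<string, string>;
  body: { errorCode: EmulationErrorCode; message: string };
}

function toErrorResponse(
  errorCode: EmulationErrorCode,
  message: string,
  retryAfterSeconds?: number
): ErrorResponse {
  const statusCode = EMULATION_ERROR_STATUS[errorCode];
  const headers: Record<string, string> = {};
  // Only the 429 cases carry Retry-After.
  if (statusCode === 429 && retryAfterSeconds !== undefined) {
    headers['Retry-After'] = String(Math.ceil(retryAfterSeconds));
  }
  return { statusCode, headers, body: { errorCode, message } };
}
```

Deriving `EmulationErrorCode` from the map means an unknown code is a compile-time error instead of a silent 500.
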
Key refactors (latest iteration)

  • Factory-based per-family tools: createRunFamilyCommandTool replaces copy-pasted per-family tool files. One factory, parameterized by family name and command schema.
  • Reusable gate primitives: withCommandGates extracted from per-family tools — the REST route now reuses the same gate pipeline.
  • Typed error builder: EmulationToolError.from(code, message) replaces ad-hoc error construction.
  • Dead code removal: removed unused imports, legacy helpers, and orphaned test fixtures.
  • resolveCurrentUsername shared: moved to shared lib for reuse across tools and routes.
  • savedObjectsClient threading: threaded through gate checks instead of resolving ad-hoc per tool.
  • Telemetry collector query fix: uses bool.should with minimum_should_match: 1 to match alerts by kibana.alert.original_event.module: emulation OR emulation tag — ensures confidence > 0 for log_injection mode.
  • Skill referencedContent schema compliance: corrected from object to array format per referencedContentSchema in type_definition.ts.
  • Idempotency cache: tool dispatch gates cache recent calls to prevent double-fire on retried LLM tool invocations.
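
The telemetry collector fix in the list above relies on `bool.should` with `minimum_should_match: 1`, so an alert matches if either clause holds. A sketch of such a query body follows. The `kibana.alert.original_event.module` clause comes from the description; the field used for the "emulation tag" clause (`emulation.scenarioId`) and the function name are assumptions, since the actual tag path on alerts is not shown here.

```typescript
// Hypothetical sketch: the tag field path is an assumption, not the collector's real query.
interface TermClause {
  term: Record<string, string>;
}

interface EmulationAlertQuery {
  bool: {
    should: TermClause[];
    minimum_should_match: number;
  };
}

function buildEmulationAlertQuery(scenarioId: string): EmulationAlertQuery {
  return {
    bool: {
      should: [
        // Alerts whose source event was written by the emulation module.
        { term: { 'kibana.alert.original_event.module': 'emulation' } },
        // Alerts carrying the emulation tag for this scenario (assumed field path).
        { term: { 'emulation.scenarioId': scenarioId } },
      ],
      // Without this, `should` clauses in a bool query that also has filters only affect scoring.
      minimum_should_match: 1,
    },
  };
}
```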

Demo plan

  1. Enable both experimental flags in config/kibana.dev.yml (full snippet in DEMO.md).
  2. Configure a restrictive allowlist: xpack.securitySolution.detectionEmulation.allowlist.endpointIds: [<your-pilot-host>].
  3. Pick a rule whose MITRE technique intersects Wave-1 (e.g. T1059.001 Windows PowerShell).
  4. From Agent Builder UI: "Validate detection rule <ruleId> against endpoint <endpointId> using log injection."
  5. The framework prompts for HITL confirmation. Approve.
  6. Inspect the ValidationReport: confidence, coverage, precision, per-phase breakdown, history SO id.
  7. Inspect the response-action comment for via=agent-builder/conv:<id>/run:<id> actor attribution.
  8. Try fanning out to 6+ endpoints — schema rejects with endpoint_fanout_exceeded.
  9. Try a second real_execution while the first is in flight — second is rejected with concurrency_exceeded + Retry-After.
  10. Follow up: "Show recent emulation runs for rule <ruleId>." getEmulationHistory returns paginated history with the actor field populated.

Full walkthrough: x-pack/solutions/security/plugins/security_solution/server/lib/detection_emulation/DEMO.md.
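
Step 7 of the demo plan refers to a `via=agent-builder/conv:<id>/run:<id>` suffix on the response-action comment, and the summary mentions a SHA-256 prompt hash in the audit trail. A minimal sketch of both follows. The `Actor` type, the function names, and the `via=user` form for human actors are assumptions; only the agent-builder suffix format and the hash algorithm come from this description.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical actor shape: field names follow the summary, the type itself is an assumption.
type Actor =
  | { kind: 'user'; username: string }
  | {
      kind: 'agent-builder';
      username: string;
      conversationId: string;
      runId: string;
      toolCallId: string;
      promptHash: string;
    };

// Hex-encoded SHA-256 of the prompt, so the audit trail never stores the prompt itself.
function hashPrompt(prompt: string): string {
  return createHash('sha256').update(prompt, 'utf8').digest('hex');
}

function formatViaSuffix(actor: Actor): string {
  if (actor.kind === 'user') {
    return 'via=user'; // assumed form; the description only shows the agent-builder case
  }
  return `via=agent-builder/conv:${actor.conversationId}/run:${actor.runId}`;
}
```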

Test plan

  • eslint clean on changed files
  • type_check clean on security_solution
  • jest green: node scripts/jest --testPathPattern='detection_emulation|emulation_report_type|validation_gate'
  • detectionEmulationLogInjection defaults to false
  • Allowlist defaults to deny-all; operator config required to permit any host
  • Audit trail attributes every action to actor.kind (user vs agent-builder) with conversation/run/tool ids and prompt hash
  • Endpoint fanout capped at MAX_ENDPOINT_FANOUT = 5 at the Zod boundary
  • Per-host rate limit enforced (3/host/hour, atomic acquire-with-rollback)
  • Concurrency gate caps in-flight real_execution per space (1)
  • Free-form execute payloads gated by allowedExecuteCommandPatterns regex allowlist
  • Runtime kill switch (detectionEmulation.realExecution.enabled) returns 403 when disabled
  • HITL prompt on every destructive command path; skipped only in standalone executionMode
  • Skill content follows mandated section order; mentions every gate in the Guardrails table
  • CODEOWNERS updated for new directories
  • E2E log_injection pipeline verified: emulation log → detection rule fires → alert with original_event.module: emulation → telemetry collector finds alert → confidence > 0
  • Agent Builder skill loads without ZodError (referencedContent schema compliance)
  • kbn-evals dataset run against a live cluster (needs ES + a connector — to be run manually before flipping the flag in any beta env)
  • Smoke spec (detection_emulation.integration.test.ts) with EMULATION_SMOKE_ES_URL against a live ES (manual; documented in README)

Out of scope

  • Beyond Wave-1 payloads. Hard-capped at 15 entries by design. Adding more is a follow-up after operator feedback on the Wave-1 set.
  • Cross-space concurrency. The gate is per-space. If real_execution should be globally serialized, that's a follow-up keyed on operator demand.
  • UI for emulation history. The getEmulationHistory tool exposes the SO via the Agent Builder; a dedicated stack-management UI is intentionally out of scope.
  • Live-cluster eval baseline. The kbn-evals suite ships with deterministic mocks; a live-cluster baseline run + result publication is a separate follow-up because it requires connector + cluster setup outside of CI.

@cla-checker-service

cla-checker-service Bot commented May 13, 2026

❌ Author of the following commits did not sign a Contributor Agreement:
5638057, de325db, d119178, 6f86b85, cfc20c1

Please, read and sign the above mentioned agreement if you want to contribute to this project


@patrykkopycinski patrykkopycinski force-pushed the ao/detection-emulation-skill-4de85a branch from e365164 to 065b6ed Compare May 13, 2026 09:10
@patrykkopycinski
Contributor Author

/ci

@kibanamachine
Contributor

🤖 Prompt Changes Detected

Changes have been detected to one or more prompt files in the Elastic Assistant plugin.

Please remember to update the integrations repository with your prompt changes to ensure consistency across all deployments.

Next Steps:

  1. Follow the documentation in x-pack/solutions/security/packages/security-ai-prompts/README.md to update the corresponding prompt files
  2. Make the changes in the integrations repository
  3. Test your changes in the integrations environment
  4. Ensure prompt consistency across all deployments

This is an automated reminder to help maintain prompt consistency across repositories.

@patrykkopycinski patrykkopycinski force-pushed the ao/detection-emulation-skill-4de85a branch from 6747034 to 44329de Compare May 13, 2026 11:25
@patrykkopycinski
Contributor Author

/ci

@patrykkopycinski patrykkopycinski changed the title from "[Security Solution] Detection Emulation Skill (Epic #15974) — substrate + orchestration layer" to "[Security Solution] Detection Emulation Skill (Epic #15974) — substrate + orchestration + production-opt-in readiness pack" May 13, 2026
@patrykkopycinski
Contributor Author

/ci

patrykkopycinski added a commit to patrykkopycinski/kibana that referenced this pull request May 13, 2026
The post-cleanup CI build (b441998) on PR elastic#269019 surfaced four real
breakages introduced by recent commits + the cleanup itself:

1. **TS typo** in validate_rule_tool.ts:515 — concurrencyResult exposes
   `inflightScenarioFingerprint`, not `inflightFingerprint`. The route
   file already uses the right name; only the tool path had drifted.

2. **TS typing** in run_command_tools.test.ts — the parameterized
   `it.each(tools)` table mixes per-family schemas with different
   `command` literal unions; destructuring `parameters` failed because
   one entry didn't have it, and `getConfirmation`'s `toolParams`
   intersected to `never` across the four families. Made `parameters`
   a uniform widened type and cast `getConfirmation` through
   `unknown` to a single shape — runtime contract unchanged, all
   62 tests still pass.

3. **TS-projects linter** rejected validate_rule.spec.ts as a stranded
   file (excluded from the security_solution tsconfigs but not part of
   any other TS project). The canonical fix — matching every other
   eval suite in the repo (kbn-evals-suite-pci-compliance, etc.) — is
   to extract the spec + its dataset into a sibling devOnly
   `functional-tests` package: `@kbn/evals-suite-detection-emulation`
   under `x-pack/solutions/security/packages/`. This keeps the
   production plugin clean of the devOnly `@kbn/evals` reference and
   gives the spec a real owning tsconfig. CODEOWNERS updated to
   point at the new path.

4. **Moon project regen** — running `node scripts/regenerate_moon_projects.js
   --update` registered the new package in `package.json`,
   `tsconfig.base.json`, and `yarn.lock`. Auto-generated; included.

Verification (local):
- jest: 88/88 pass across run_command_tools.test.ts (62) +
  validate_rule_tool.test.ts (6) + concurrency_gate.test.ts (10) +
  validate_rule/route.test.ts (10).
- eslint --fix: clean on all touched files.
- type_check: clean on the new evals suite package (~70s) AND on the
  full security_solution plugin (~6m).

No behavioural changes; this is purely a CI-repair commit.
@patrykkopycinski
Contributor Author

/ci

6 similar comments

@kibanamachine
Contributor

kibanamachine commented May 14, 2026

💔 Build Failed

Failed CI Steps

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

| id | before | after | diff |
| --- | --- | --- | --- |
| securitySolution | 9408 | 9417 | +9 |

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| securitySolution | 12.1MB | 12.1MB | +8.3KB |

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

| id | before | after | diff |
| --- | --- | --- | --- |
| securitySolution | 153.5KB | 159.3KB | +5.8KB |

Unknown metric groups

async chunk count

| id | before | after | diff |
| --- | --- | --- | --- |
| securitySolution | 113 | 112 | -1 |

ESLint disabled in files

| id | before | after | diff |
| --- | --- | --- | --- |
| securitySolution | 106 | 107 | +1 |

ESLint disabled line counts

| id | before | after | diff |
| --- | --- | --- | --- |
| securitySolution | 739 | 740 | +1 |

Total ESLint disabled count

| id | before | after | diff |
| --- | --- | --- | --- |
| securitySolution | 845 | 847 | +2 |

History

patrykkopycinski added a commit to patrykkopycinski/kibana that referenced this pull request May 15, 2026
@patrykkopycinski patrykkopycinski force-pushed the ao/detection-emulation-skill-4de85a branch from 9e4b13f to 2277d28 Compare May 15, 2026 08:14
@patrykkopycinski patrykkopycinski changed the title from "[Security Solution] Detection Emulation Skill (Epic #15974) — substrate + orchestration + production-opt-in readiness pack" to "[Spike][Security Solution] Detection Emulation Skill (Epic #15974) — substrate + orchestration + production-opt-in readiness pack" May 20, 2026
@patrykkopycinski patrykkopycinski force-pushed the ao/detection-emulation-skill-4de85a branch from 2277d28 to f0df0dd Compare May 25, 2026 18:08
patrykkopycinski added a commit to patrykkopycinski/kibana that referenced this pull request May 25, 2026
patrykkopycinski and others added 7 commits May 27, 2026 11:56
Adds a detection emulation feature directly to the security_solution
plugin. Users can run, approve, and visualise emulation commands against
detection alerts without needing a separate plugin.

What is added:

- common/detection_emulation: Zod schema for the run command input
- public/detections/components/emulation:
  - EmulationBadge — shows on alerts that carry an emulation id
    (kibana.alert.emulation.id)
  - EmulationFilter — toolbar filter on detection tables
  - RunEmulationModal — approval modal for a pending emulation command
- server/agent_builder/skills/detection_emulation:
  - In-tree agent skill plus an inline run-command tool
- server/lib/detection_emulation:
  - Rule binding saved object + alert tagging helpers
  - Feature flag, allowlist, audit logger, rate limiter, runner
  - REST route for executing emulation commands

Wire-up:

- Register `registerDetectionEmulationRoutes` from server/routes/index.ts
- Register `emulationRuleBindingType` in server/saved_objects.ts
- Register `getDetectionEmulationSkill` in
  agent_builder/skills/register_skills.ts (passes `core` + `config`
  threaded through from server/plugin.ts)
- Re-export `defineSkillType` from `@kbn/agent-builder-server` for skill
  authors
- Re-export `RunEmulationCommandInputSchema` from common/index.ts
- Add `DETECTION_ENGINE_EMULATION_*` URL constants
- Add CODEOWNERS entry for `server/lib/detection_emulation`

UI integration:

- additional_toolbar_controls.tsx: render <EmulationFilter> on detection
  tables
- render_cell_value.tsx: render <EmulationBadge> for alerts that carry an
  emulation id
- rule_details/index.tsx: render <RunEmulationModal> when an emulation
  approval is pending

Tests:

- Unit tests for the new skill and run-command tool
- Unit tests for emulation badge / filter / modal
- Integration test for the end-to-end emulation route + persistence

Notes for reviewers:

- All cross-package imports use canonical `@kbn/...` aliases. Intra-plugin
  imports remain relative per Kibana convention.
- This PR contains only production-ready changes.

Applies the full review pass against the in-tree detection emulation
feature added in 9f9f073 — closes the blocker, important, and
nice-to-have findings surfaced during review.

Server / route
- B1/B3/N5/N6/N7: route enforces `experimentalFeatures
  .detectionEmulationRealExecution`, swaps the rate limiter to atomic
  acquire/release (release on dispatch failure), refuses to dispatch
  destructive actions without an authenticated caller (401 instead of
  falling back to `username='unknown'`), short-circuits double-submits
  via an in-memory idempotency cache keyed on (space, emulation,
  command, agentType, sorted endpointIds), and wires allowlist /
  rate-limiter / idempotency-cache config from
  `xpack.securitySolution.detectionEmulation.*`.
- I1/I3/I4: replaces legacy `tags: ['access:securitySolution']` with
  declarative `security.authz.requiredPrivileges`, stops echoing
  internal error messages to clients, and wraps every user-facing
  string in `i18n.translate`.
- I2/N4: introduces a typed runner error taxonomy
  (`UnsupportedAgentTypeError`, `UnsupportedCommandForAgentTypeError`,
  `MissingConnectorActionsError`) in its own module so the route can
  map cleanly to 4xx/5xx, and adds an exhaustiveness check on the
  dispatch switch.
- I5/I7: marks the `emulation-rule-binding` SO `hidden: true` /
  `hiddenFromHttpApis: true` and adds a `modelVersions` baseline; the
  runner now accepts a `ruleBindingLookup` so dispatched actions carry
  ruleId / ruleName via a new `createSavedObjectRuleBindingLookup`
  helper that uses the internal SO client.
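
The double-submit short-circuit described above keys its cache on (space, emulation, command, agentType, sorted endpointIds). A minimal sketch of that keying, plus a small in-memory cache, follows. The `DispatchRequest` shape, the JSON-based key encoding, and the TTL-based expiry are assumptions; the commit message only names the key components.

```typescript
// Hypothetical sketch: request shape, key encoding and TTL are assumptions.
interface DispatchRequest {
  spaceId: string;
  emulationId: string;
  command: string;
  agentType: string;
  endpointIds: string[];
}

// Sorting a copy of endpointIds makes the key independent of caller ordering.
function idempotencyKey(req: DispatchRequest): string {
  return JSON.stringify([
    req.spaceId,
    req.emulationId,
    req.command,
    req.agentType,
    [...req.endpointIds].sort(),
  ]);
}

class IdempotencyCache<T> {
  private readonly entries = new Map<string, { value: T; expiresAt: number }>();
  private readonly ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Returns the cached result for a repeated call, or undefined if absent or expired.
  get(key: string, now: number = Date.now()): T | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= now) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T, now: number = Date.now()): void {
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
  }
}
```

An in-memory cache like this only deduplicates within one Kibana node; whether the PR accounts for multi-node deployments is not stated here.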

Schema / contract
- I6: rewrites `RunEmulationCommandInputSchema` as a discriminated
  union on `command` with strict, command-specific `parameters`
  shapes — closes the silent-passthrough hole where typos like
  `entityId` used to sail through the previous `z.record`. Models
  `kill-process` / `suspend-process` as a `pid` xor `entity_id` union
  and `memory-dump` as a `kernel | process(pid|entity_id)` union
  (z.union, since v4 forbids duplicate discriminator values).
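
The `pid` xor `entity_id` rule for `kill-process` / `suspend-process` can be illustrated without Zod. The sketch below is a dependency-free stand-in for the strict schema described above, not the PR's `RunEmulationCommandInputSchema`; the function name and error messages are hypothetical. It rejects unknown keys, which is the behaviour that stops a typo such as `entityId` from passing silently.

```typescript
// Hypothetical plain-TypeScript stand-in for the strict pid-xor-entity_id schema.
type KillProcessParams = { pid: number } | { entity_id: string };

function parseKillProcessParams(input: Record<string, unknown>): KillProcessParams {
  const unknownKeys = Object.keys(input).filter((k) => k !== 'pid' && k !== 'entity_id');
  if (unknownKeys.length > 0) {
    // Strict mode: typos are an error instead of being dropped.
    throw new Error(`Unknown parameter(s): ${unknownKeys.join(', ')}`);
  }
  const hasPid = 'pid' in input;
  const hasEntityId = 'entity_id' in input;
  if (hasPid === hasEntityId) {
    throw new Error('Provide exactly one of pid or entity_id');
  }
  if (hasPid) {
    const pid = input['pid'];
    if (typeof pid !== 'number' || !Number.isInteger(pid) || pid < 0) {
      throw new Error('pid must be a non-negative integer');
    }
    return { pid };
  }
  const entityId = input['entity_id'];
  if (typeof entityId !== 'string' || entityId.length === 0) {
    throw new Error('entity_id must be a non-empty string');
  }
  return { entity_id: entityId };
}
```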

Skill / agent-builder
- B5/I16: rewrites the skill content to match the actually-registered
  tool and command list, and gates skill registration on
  `detectionEmulationRealExecution`.
- Tool now returns `BuiltinSkillBoundedTool` (not
  `BuiltinToolDefinition`) so it satisfies the framework's
  `SkillBoundedTool` contract; basePath moved to the canonical
  `skills/security/endpoint`.

UI
- B4: removes the dead `RunEmulationModal` block from
  `rule_details/index.tsx`.
- I9: `EmulationFilter` subscribes to `filterManager.getUpdates$()` so
  the toggle stays in sync when filters are mutated elsewhere.
- I10/I11/I12: `RunEmulationModal` resets local state on
  `requestId`/`suggestion` change, disables Approve/Reject after
  click via `isSubmitting`, and parses modified args with a
  shell-style tokenizer instead of splitting on whitespace.
- I13: replaces the `@elastic/eui/src/...` deep import with a type
  derived from EUI's published `onChange` prop signature.
- I14: wraps `EmulationBadge` in `EuiToolTip` for keyboard /
  screen-reader users.
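
The I10/I11/I12 item above replaces whitespace splitting with a shell-style tokenizer for modified args. The difference shows up with quoted arguments that contain spaces. The sketch below is a deliberately small tokenizer, not the one in `RunEmulationModal`: the name `tokenizeArgs` is hypothetical, and it handles single and double quotes only, with no backslash escapes, so Windows paths pass through unchanged.

```typescript
// Hypothetical sketch: quote-aware splitting only, no escape sequences or variable expansion.
function tokenizeArgs(input: string): string[] {
  const tokens: string[] = [];
  let current = '';
  let inToken = false;
  let quote: string | null = null;

  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);
    if (quote !== null) {
      // Inside quotes: everything is literal until the matching quote.
      if (ch === quote) {
        quote = null;
      } else {
        current += ch;
      }
    } else if (ch === '"' || ch === "'") {
      quote = ch;
      inToken = true; // an empty pair of quotes still yields an empty token
    } else if (/\s/.test(ch)) {
      if (inToken) {
        tokens.push(current);
        current = '';
        inToken = false;
      }
    } else {
      current += ch;
      inToken = true;
    }
  }

  if (quote !== null) {
    throw new Error('Unterminated quote in arguments');
  }
  if (inToken) tokens.push(current);
  return tokens;
}
```

A plain `input.split(/\s+/)` would break `"C:\Program Files\tool.exe"` into two arguments and keep the quote characters; the tokenizer returns it as one argument with the quotes removed.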

Cleanup
- I8/I17/N1/N2: deletes the unused `logInjection` flag, prunes dead
  `audit_logger` helpers, and slims the allowlist / rate-limiter APIs
  to the methods actually used.
- I15: adds CODEOWNERS entries for `common/detection_emulation`,
  `public/detections/components/emulation`, and the
  `agent_builder/skills/detection_emulation` directory.

Tests
- N8: `EmulationBadge` test asserts via `data-test-subj`, not
  classNames.
- Route, schema, skill, and component tests updated to cover the new
  gates (auth, idempotency, rate limit), the typed errors, the
  discriminated union, and the new tooltip / tokenizer behavior.
- Full pre-commit pass: `type_check.js` clean, eslint clean on all
  changed files, 123/123 jest specs green across schema, server,
  agent-builder skill, and component suites.

Adds payloads/payloads.json with 12 entries covering ATT&CK techniques:
T1059.001, T1059.003, T1059.004, T1218.005, T1218.011, T1053.005,
T1547.001, T1057, T1003.001, T1070.004, T1071.001, T1112.

Each entry is typed { techniqueId, name, agentTypes[], command,
parameters, expectedSignals[] }. Payloads use self-cleaning shell
commands where possible to minimise post-emulation artifacts. T1057
(process discovery) uses `running-processes` and lists all 4 supported
agent types; all other entries use `execute` and are scoped to `endpoint`
which is the only agent type with execute support wired today.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Adds payloads/index.ts exporting:
- EmulationPayload interface typed against ResponseActionAgentType and
  ResponseActionsApiCommandNames (import type — no runtime coupling to
  common constants).
- payloadLibrary: readonly EmulationPayload[] loaded from payloads.json.
- PAYLOAD_LIBRARY_MAX_ENTRIES = 15 governance constant.
- findByTechniqueIds(ids): uses a Set for O(1) lookups; preserves
  library insertion order; deduplicates repeated IDs in the input.
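
The lookup behaviour listed for `findByTechniqueIds` (Set-based membership, library insertion order, deduplicated input) can be sketched as a single filter over the library. This is an illustration under assumptions: the trimmed `EmulationPayload` shape and the explicit `library` parameter are hypothetical, since the function described above takes only `ids` and reads the module's `payloadLibrary`.

```typescript
// Hypothetical sketch: a trimmed payload shape and an explicit library parameter.
interface EmulationPayload {
  techniqueId: string;
  name: string;
}

function findByTechniqueIds(
  library: readonly EmulationPayload[],
  ids: readonly string[]
): EmulationPayload[] {
  // The Set gives O(1) membership checks and collapses repeated IDs in the input.
  const wanted = new Set(ids);
  // Filtering the library, not mapping over ids, preserves library insertion order.
  return library.filter((payload) => wanted.has(payload.techniqueId));
}

// Sample entries for illustration only; not the contents of payloads.json.
const sampleLibrary: readonly EmulationPayload[] = [
  { techniqueId: 'T1059.001', name: 'PowerShell' },
  { techniqueId: 'T1057', name: 'Process Discovery' },
  { techniqueId: 'T1112', name: 'Modify Registry' },
];

const matches = findByTechniqueIds(sampleLibrary, ['T1112', 'T9999', 'T1059.001', 'T1112']);
console.log(matches.map((p) => p.techniqueId)); // [ 'T1059.001', 'T1112' ]
```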

Adds payloads/index.test.ts with 26 jest assertions covering:
- Hard-cap enforcement (toBeLessThanOrEqual PAYLOAD_LIBRARY_MAX_ENTRIES).
- Shape validation: non-empty techniqueId/name, valid agentTypes,
  valid commands, at least one expectedSignal, unique techniqueIds.
- Wave-1 technique coverage (it.each over all 12 required IDs).
- findByTechniqueIds edge cases: empty input, no-match, single match,
  multi-match, unknown+known mix, order preservation, deduplication.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…type

Adds emulation_report_type.ts with:
- emulationReportType (SavedObjectsType<EmulationReportAttributes>):
  hidden=true, hiddenFromHttpApis=true, namespaceType=multiple-isolated,
  stored in SECURITY_SOLUTION_SAVED_OBJECT_INDEX.
- EmulationReportAttributes interface covering all 14 fields from the
  spec: scenarioId, ruleId, scenarioFingerprint, mode, endpointIds,
  agentType, startedAt, completedAt, payloadIds, dispatchedActions[],
  score, perPhase[], operator, spaceId.
- modelVersions baseline '1' with forwardCompatibility + create schemas
  (unknowns: 'ignore' on forward compat to allow future additive fields).
- ES mappings: dynamic: false; score fields use float/integer; array
  fields (endpointIds, payloadIds, signals) mapped as keyword multi-value.

Wires emulationReportType into saved_objects.ts types[] and exports it
from lib/detection_emulation/index.ts alongside the existing
emulationRuleBindingType.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds detectionEmulationLogInjection: false alongside the existing
detectionEmulationRealExecution flag. When true, the validateRule
pipeline uses log injection (synthesised ECS documents) instead of
dispatching real response actions to endpoints. Gating on a separate
flag keeps the two dispatch modes independently toggleable and lets
log injection ship before real execution is broadly available.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…n and validation keys

Adds two new optional sub-objects under xpack.securitySolution.detectionEmulation:

detectionEmulation.logInjection:
  indexTemplateName  (default: '.kibana-security-emulation-logs')
    Base name for the ILM-managed index template; runtime appends
    '<spaceId>-*' to form the full pattern.
  retentionDays      (default: 7, min: 1)
    ILM delete phase for synthesised ECS documents.

detectionEmulation.validation:
  wallBudgetMsDefault  (default: 60 000 ms, min: 1 000)
    Default telemetry-collector timeout per validateRule run.
  wallBudgetMsMax      (default: 300 000 ms, min: 1 000)
    Hard ceiling for budget values accepted from API callers; requests
    above this are clamped, preventing runaway long-poll connections.

Both sub-objects follow the existing schema.maybe(schema.object({...}))
pattern used by allowlist/rateLimiter/idempotencyCache — the whole group
is optional, code null-coalesces to baked-in defaults.
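
A rough sketch of how the null-coalescing defaults and the ceiling clamp could fit together. `resolveWallBudgetMs` and `ValidationConfig` are hypothetical names, and the sketch assumes the `min` bounds apply only to the config keys, not to per-request values:

```typescript
// Hypothetical mirror of the optional detectionEmulation.validation group.
interface ValidationConfig {
  wallBudgetMsDefault?: number;
  wallBudgetMsMax?: number;
}

// Baked-in defaults taken from the key descriptions above.
const BAKED_IN_DEFAULTS = { wallBudgetMsDefault: 60_000, wallBudgetMsMax: 300_000 } as const;

function resolveWallBudgetMs(config: ValidationConfig | undefined, requestedMs?: number): number {
  // The whole group may be absent, so each key falls back independently.
  const defaultMs = config?.wallBudgetMsDefault ?? BAKED_IN_DEFAULTS.wallBudgetMsDefault;
  const maxMs = config?.wallBudgetMsMax ?? BAKED_IN_DEFAULTS.wallBudgetMsMax;
  // Requests above the ceiling are clamped rather than rejected.
  return Math.min(requestedMs ?? defaultMs, maxMs);
}
```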

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
patrykkopycinski and others added 17 commits May 27, 2026 11:56
…: Update smoke spec findings shape to match spec: canRead + indexCount

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… notification path for risk #3 in README

Adds a new section covering the audit SO fields (actor.kind, scenarioFingerprint
SHA-256), Kibana security audit log integration, and three SOC tooling
consumption patterns (Kibana rule, Watcher, Filebeat/Fleet pipeline).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…de role definition guidance for risk #16

Documents how to run the discovery probe, the expected access surface for
built-in ES roles (superuser + kibana_system as known residuals), and four
least-privilege mitigations for operator-defined roles: no wildcard .kibana*
grants, CCS pattern splitting, DLS match_none filter, and ES-layer audit logging.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…date risk HTML: hero counts, rows #3-#16, safeguards

- Hero: Medium 2→1 (row #9 demoted to Low by event.dataset stamp); safeguards 21→23
- Row #3: mitigated note references OOB notification path documented in README
- Row #4: permanent note updated — execute curatedOnly + allowedExecuteCommandPatterns
- Row #9: Medium/Scheduled → Low/Mitigated (event.dataset + event.module stamp)
- Row #14: mitigated note updated; curatedOnly now covers upload (closed short-circuit)
- Row #16: Low/Scheduled → Low/Mitigated (discovery probe + README operator guidance)
- Active safeguards: updated curatedOnly bullet (execute+upload), added execute-regex
  gate bullet, added event.dataset stamp bullet; intro updated to "Twenty-three controls"
- Roadmap: #9 and #16 rows updated to note shipped vs follow-up split

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ixes to changed detection_emulation files

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nst undefined

Partial configs (e.g. from older tests or forward-compat reads) omit the new
field; use `?? []` so the length check never throws on undefined. The required
interface still enforces the field at the TS layer for new call-sites.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…e inspection in index_access smoke spec

Rewrites the index_access smoke spec (Risk #16) to use `_security/role`
definition inspection instead of temporary user creation + privilege checks.

The original run_as / per-role-client approach requires the cluster to have
a master node for write quorum (putUser is a write operation). Role definition
inspection is fully read-only and works on any cluster state. Results match
the expected access surface documented in the README: superuser + kibana_system
have read access; all other built-in roles do not.

Other improvements:
- SerializeError helper includes HTTP status code for non-200 ES responses
- `create_index` field renamed to `createIndex` (camelCase, naming-convention)
- `fleet_server` 404 surfaces as "404: {}" for clarity (role absent in ES 9.5)
- Test runs in <200ms instead of timing out at 120s

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Trim validateRule tool description from ~1800 to ~400 tokens for OSS
  model compatibility (pipeline details already live in skill content)
- Shorten endpointIds .describe() to 2 concise sentences
- Align skill content tool references to use actual registered IDs
  (security.detection-emulation.validate-rule, .get-history,
  .run-process-command, etc.) instead of informal shorthand names
- Change agentType from z.literal('endpoint') to z.enum(['endpoint'])
  across all 5 tool schemas for smoother future vendor extension
- Add 3 additional distractor examples to the eval dataset (ES|QL
  question, threat hunting request, dashboard creation) bringing total
  distractors to 5 per skill-dev-plugin guidance
Centralizes ~20 manually-constructed error responses into a single
`emulation_tool_errors.ts` module with type-safe factory methods for
each error class (featureDisabled, authorizationError, rateLimitExceeded,
invalidParameters, userDeclined, validationGateBlocked, scenarioFailure,
concurrencyExceeded, executionError, etc.).

Updates withCommandGates, validateRule, and all 4 per-family run*Command
tools to use the shared builder instead of inline ToolResultType.error
constructions. Removes redundant ToolResultType imports from per-family
tools.
The REST route's idempotency cache prevented double-dispatch from
network retries, but the Agent Builder tool dispatch path (via
withCommandGates) lacked this protection. LLM retries or framework-level
transient-error retries could fire a second response action.

Threads the idempotencyCache from DetectionEmulationGuardrails into all
4 per-family tools → withCommandGates context. Checks the cache after
the allowlist gate (matching REST ordering) and writes back on both
success and error paths so replays get the cached result.
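
A simplified sketch of the write-back behaviour, under loud assumptions: `IdempotencyCacheSketch`, `dispatchOnce`, and the `ToolResult` shape are hypothetical, the dispatch is synchronous for brevity (the real path is async), and there is no TTL or in-flight tracking:

```typescript
// Hypothetical result shape; the real tool results carry more metadata.
type ToolResult =
  | { status: 'success'; actionId: string }
  | { status: 'error'; errorType: string };

class IdempotencyCacheSketch {
  private readonly entries = new Map<string, ToolResult>();

  get(key: string): ToolResult | undefined {
    return this.entries.get(key);
  }

  set(key: string, result: ToolResult): void {
    this.entries.set(key, result);
  }
}

function dispatchOnce(
  cache: IdempotencyCacheSketch,
  key: string,
  dispatch: () => ToolResult
): ToolResult {
  // A retried tool call with the same key replays the stored result.
  const cached = cache.get(key);
  if (cached !== undefined) return cached;

  let result: ToolResult;
  try {
    result = dispatch();
  } catch {
    // Failures are cached too, so a replay cannot trigger a second dispatch.
    result = { status: 'error', errorType: 'execution_error' };
  }
  cache.set(key, result);
  return result;
}
```

In this sketch the cache lookup would sit after the allowlist gate, matching the ordering described above; how the key is derived is left open.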
Creates gate_checks.ts with protocol-agnostic gate functions:
- checkRealExecutionFeatureFlags / checkModeFeatureFlags
- checkValidation (curated-only + allowedScriptIds)
- checkRbac (per-command RBAC via EndpointAuthz)
- resolveEffectiveConfig (reads Advanced Settings per-space)
- checkAllowlist (host allowlist)
- acquireRateLimit (atomic per-space + per-host rate limit)
- checkAuth (authenticated caller check)

Each gate returns a typed GateResult<T> (ok/fail) with structured
metadata. withCommandGates now composes from these primitives instead
of inline logic — single source of truth for each gate check that
both the tool dispatch and REST route can share.
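
One way the ok/fail composition could look, as a sketch only. The `GateResult` fields, the `GateInput` shape, and `runGates` are assumptions for illustration; only the gate names come from the list above:

```typescript
// Hypothetical result type: either a value or structured failure metadata.
type GateResult<T> =
  | { ok: true; value: T }
  | { ok: false; gate: string; reason: string };

const pass = <T>(value: T): GateResult<T> => ({ ok: true, value });
const fail = (gate: string, reason: string): GateResult<never> => ({ ok: false, gate, reason });

// Hypothetical, protocol-agnostic input: no HTTP or tool types leak in.
interface GateInput {
  flagEnabled: boolean;
  realExecutionEnabled: boolean; // runtime kill switch
  allowlist: readonly string[];
  endpointId: string;
}

function checkRealExecutionFeatureFlags(input: GateInput): GateResult<true> {
  if (!input.flagEnabled) return fail('featureFlags', 'experimental flag is off');
  if (!input.realExecutionEnabled) return fail('featureFlags', 'runtime kill switch is off');
  return pass(true as const);
}

function checkAllowlist(input: GateInput): GateResult<string> {
  return input.allowlist.includes(input.endpointId)
    ? pass(input.endpointId)
    : fail('allowlist', `host ${input.endpointId} is not allowlisted`);
}

// Gates run in order; the first failure short-circuits the rest.
function runGates(input: GateInput): GateResult<string> {
  const flags = checkRealExecutionFeatureFlags(input);
  if (!flags.ok) return flags;
  return checkAllowlist(input);
}
```

Each caller then translates a failed `GateResult` into its own protocol (a tool error or an HTTP response) without re-implementing the check.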
Refactors the run_command REST route to import and call the shared gate
check functions (checkRealExecutionFeatureFlags, checkAllowlistGate,
acquireRateLimitGate) instead of duplicating the logic inline.

The route still handles protocol translation (GateResult → HTTP
response via siemResponse) and route-specific concerns (i18n messages,
Kibana request context), but the gate logic itself is now single-sourced
from gate_checks.ts.
Creates createRunFamilyCommandTool factory that builds the schema,
confirmation, and handler from a FamilyToolConfig object. All four
per-family tools (process, file, network, execution) are now config-only
modules (~50 lines each) delegating to the factory.

Eliminates ~400 lines of duplicated handler/schema/confirmation logic.
Adding a new family (e.g. registry) is now a one-file, config-only
addition.
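
A heavily simplified sketch of the config-only pattern. The `FamilyToolConfig` fields, the `validate` method, and the commands assigned to the process family are assumptions; only the tool ID format and the 5-endpoint fanout cap come from this PR:

```typescript
// Hypothetical config shape; the real FamilyToolConfig is not shown here.
interface FamilyToolConfig {
  family: string;
  commands: readonly string[];
  requiresConfirmation: boolean;
}

interface FamilyToolInput {
  command: string;
  endpointIds: string[];
}

interface FamilyTool {
  id: string;
  requiresConfirmation: boolean;
  // Returns null when the input is acceptable, otherwise a reason string.
  validate(input: FamilyToolInput): string | null;
}

const MAX_ENDPOINT_FANOUT = 5;

function createRunFamilyCommandTool(config: FamilyToolConfig): FamilyTool {
  return {
    id: `security.detection-emulation.run-${config.family}-command`,
    requiresConfirmation: config.requiresConfirmation,
    validate(input) {
      if (!config.commands.includes(input.command)) {
        return `command ${input.command} is not part of the ${config.family} family`;
      }
      if (input.endpointIds.length === 0 || input.endpointIds.length > MAX_ENDPOINT_FANOUT) {
        return `endpointIds must contain 1 to ${MAX_ENDPOINT_FANOUT} entries`;
      }
      return null;
    },
  };
}

// A per-family module then reduces to a single config object.
const processTool = createRunFamilyCommandTool({
  family: 'process',
  commands: ['running-processes', 'kill-process', 'suspend-process'],
  requiresConfirmation: true,
});
```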
Adds optional savedObjectsClient to CommandGatesContext and the factory
handler destructure. When provided by the Agent Builder handler context,
withCommandGates uses it directly for uiSettingsClient derivation rather
than re-creating a scoped client via coreStart.savedObjects.

This eliminates a redundant getScopedClient call (async hop) on every
tool invocation while keeping backward compat (falls back to
request-scoped creation when the field is absent).
Relocates resolve_current_user.ts from the skill-specific directory
to server/lib/detection_emulation/ so it's importable from both the
Agent Builder tool handlers and the REST routes without a cross-concern
import path.

Previously flagged in the code itself as "should be upstreamed" — this
is the short-term path (shared within the plugin) while awaiting an
export from @kbn/agent-builder-server.
- Remove unused DetectionEmulationFeatureFlags type import from
  gate_checks.ts
- Migrate get_emulation_history_tool.ts to use the shared toolError
  builder instead of inline ToolResultType.error construction
… Workflows stack

- Wire shared gate_checks into validate_rule_tool.ts (replaces ~80 lines of inline gates)
- Delete run_command REST route + tests (-1,108 lines) — tool is the single implementation
- Add traced logger (createTracedLogger) to createRunFamilyCommandTool factory
- Wrap async operations in withCommandGates with runStep for timing/error attribution
- Remove DETECTION_ENGINE_EMULATION_RUN_COMMAND_URL constant
- Add execution modules: traced_logger, pipeline_step_error, tool_factory_deps, validate_pre_execution

Patterns adopted from PRs elastic#260739, elastic#260744, elastic#260793, elastic#260811.
@patrykkopycinski patrykkopycinski force-pushed the ao/detection-emulation-skill-4de85a branch from 2044677 to f8dd99c Compare May 27, 2026 19:58
patrykkopycinski and others added 11 commits May 27, 2026 22:07
- Remove DEMO_GUIDE.md, openspec/, .playwright-mcp/ (not part of this PR)
- Remove dead functions: buildEmulationModeQuery, extractEmulationMetadata, isEmulationAlert
- Clean up corresponding test cases and unused imports
- Fix runtime crash: featureFlags undefined on log_injection path
- Move gate_checks.ts from agent_builder/skills/ to lib/execution/
  (fixes circular dependency: lib/ was importing from agent_builder/)
- Remove `as any` casts for coreStart.security (unnecessary)
- Remove empty ValidateRuleToolDeps interface (use ToolFactoryDeps)
- Remove DEMO.md files and production-risk-analysis.html
…on Agent Builder skill

Three evaluators per example:
- toolSelection (createSkillInvocationEvaluator): APM trace check for
  SKILL.md filestore.read span — verifies skill activation
- schemaCompliance (createTraceBasedEvaluator): ES|QL query over traces
  asserting every validate-rule call includes ruleId + endpointIds
- criteria (DefaultEvaluators.criteria): per-example LLM judge

9 examples: 2 success paths (T1059.001, T1218.005), default mode,
history-first flow, 3 failure modes, 2 distractors.

Includes .eslintrc.js boundary-crossing exemption so *.spec.ts can
import the devOnly @kbn/evals package without affecting the plugin build.
…rver/agent_builder/skills/detection_emulation/evals/validate_r

Auto-committed by patryks-treadmill orchestrator.
plan=detection-emulation-skill-epic-15974-orchestration-layer job=f23206c3-0fe4-461c-b545-e2f737b7f735 attempt=1
… orchestrator

- Extract runRuleExecutors from route.ts (already committed)
- Clean up unused imports in route.ts (SERVER_APP_ID, RuleExecutionStatusEnum, alertInstanceFactoryStub)
- Add optional rulePreviewDeps to OrchestratorOptions for rule preview validation
- Add Step 8 (Rule Preview Validation) to scenario orchestrator pipeline
- Add rulePreviewValidation to OrchestratorResult
- Update DESIGN.md with Section 9: Implementation Status & Gap Analysis
…e demo screenshot showing Agent Builder transcript with succes

Auto-committed by patryks-treadmill orchestrator.
plan=detection-emulation-skill-epic-15974-orchestration-layer job=49341029-00d6-4a17-af8c-300aa7d3f419 attempt=2
Brings the plugin-side eval files into parity with the canonical
kbn-evals-suite-detection-emulation suite:

Dataset (validate_rule_dataset.ts):
- Add `tool_sequence` field to all examples (consumed by trajectory
  evaluator for LCS-based order scoring; `[]` for distractor examples
  so the evaluator returns 1.0 when no tools fired)
- Add `autoConfirm` field to the HITL `userDeclines` example
- Add HITL example: user declines real_execution prompt → `user_declined`
- Add 3 extra distractor examples: ES|QL question, threat hunting,
  dashboard creation (total 13 examples, up from 9)

Spec (validate_rule.spec.ts):
- Replace `p-retry` with `withRetry` from @kbn/evals (consistent with
  canonical suite; N5 tracking)
- Add HITL auto-resume loop in DetectionEmulationChatClient.converse:
  polls `response.prompts`, responds with the per-example `autoConfirm`
  policy, bounded by MAX_PROMPT_ROUNDS=5
- Add `createValidateRuleTrajectoryEvaluator` (createTrajectoryEvaluator
  with orderWeight=0.7 / coverageWeight=0.3) applied to every example
- Wire `autoConfirm` policy from `example.input.autoConfirm` into each
  `runScenario` call
- Register the 4 new evaluate() blocks matching the 4 new examples
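
To make the order/coverage weighting concrete, here is one plausible scoring sketch. The exact formula inside `createTrajectoryEvaluator` is not shown in this PR, so `trajectoryScore`, the coverage definition, and the handling of unexpected tool calls are assumptions:

```typescript
// Length of the longest common subsequence, using a rolling DP row.
function lcsLength(a: readonly string[], b: readonly string[]): number {
  let prev: number[] = new Array<number>(b.length + 1).fill(0);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [0];
    for (let j = 1; j <= b.length; j++) {
      const match = a[i - 1] === b[j - 1];
      curr.push(match ? (prev[j - 1] ?? 0) + 1 : Math.max(prev[j] ?? 0, curr[j - 1] ?? 0));
    }
    prev = curr;
  }
  return prev[b.length] ?? 0;
}

function trajectoryScore(
  expected: readonly string[],
  actual: readonly string[],
  orderWeight = 0.7,
  coverageWeight = 0.3
): number {
  // Distractor case: no tools expected. Assumed to score 1 only if none fired.
  if (expected.length === 0) return actual.length === 0 ? 1 : 0;

  // Order: how much of the expected sequence appears in the same relative order.
  const order = lcsLength(expected, actual) / expected.length;

  // Coverage: share of distinct expected tools that fired at all.
  const distinctExpected = expected.filter((tool, i) => expected.indexOf(tool) === i);
  const covered = distinctExpected.filter((tool) => actual.includes(tool)).length;

  return orderWeight * order + coverageWeight * (covered / distinctExpected.length);
}
```

Under these assumptions, calling the right tools in the wrong order still earns the full coverage share but only part of the order share.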

The three evaluators required by spec §8 remain: toolSelection (renamed
from createToolSelectionEvaluator; it still uses createSkillInvocationEvaluator
underneath), schemaCompliance, and criteria. The trajectory evaluator is additive.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…skill

Adds DEMO.md alongside the existing README.md covering:
- Feature flag setup in kibana.dev.yml (detectionEmulationLogInjection,
  detectionEmulationRealExecution) and full optional runtime config keys
- Step-by-step Agent Builder UI walkthrough (happy path, history-first,
  real execution, and failure cases)
- Full ValidationReport response field reference with inline comments
- Typed error response table (error_type, HTTP equivalent, trigger)
- Dev Tools queries for inspecting injected log-injection documents
- Saved Objects Find API snippet for browsing emulation history
- Troubleshooting table for common failure modes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements full ES|QL query inversion using @elastic/esql AST parser.
Extracts field constraints from WHERE clauses to generate matching
synthetic log documents.

Supported operators: ==, !=, >, >=, <, <=, LIKE, RLIKE, IN,
IS NULL, IS NOT NULL, AND, OR, NOT.

For aggregating queries (STATS ... | WHERE threshold), only the
first WHERE clause is inverted — threshold filters operate on
aggregation results, not document fields.

- New esql_inverter.ts module with extractEsqlConstraints()
- Wired into query_inverter.ts dispatcher (language === 'esql')
- 26 unit tests covering all operators, edge cases, and real-world patterns
- Lucene remains as the only graceful-degradation language
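
A toy sketch of the second half of the inversion: turning already-extracted constraints into a matching document. It does not use the `@elastic/esql` parser; `FieldConstraint`, `satisfyingValue`, and `buildSyntheticDoc` are hypothetical, and the sketch assumes a flat AND list with at most one constraint per field:

```typescript
type Scalar = string | number;

// Hypothetical constraint shape; the real extractEsqlConstraints() output may differ.
type FieldConstraint =
  | { field: string; op: '==' | '!='; value: Scalar }
  | { field: string; op: '>' | '>=' | '<' | '<='; value: number }
  | { field: string; op: 'IN'; values: readonly Scalar[] }
  | { field: string; op: 'IS NULL' | 'IS NOT NULL' };

// Picks one value that satisfies the constraint, or undefined to omit the field.
function satisfyingValue(c: FieldConstraint): Scalar | undefined {
  switch (c.op) {
    case '==':
      return c.value;
    case '!=':
      return typeof c.value === 'number' ? c.value + 1 : `not-${c.value}`;
    case '>':
      return c.value + 1;
    case '<':
      return c.value - 1;
    case '>=':
    case '<=':
      return c.value;
    case 'IN':
      return c.values[0];
    case 'IS NOT NULL':
      return 'emulated';
    case 'IS NULL':
      return undefined;
  }
}

function buildSyntheticDoc(constraints: readonly FieldConstraint[]): Record<string, Scalar> {
  const doc: Record<string, Scalar> = {};
  for (const constraint of constraints) {
    const value = satisfyingValue(constraint);
    // IS NULL is satisfied by leaving the field out of the document entirely.
    if (value !== undefined) doc[constraint.field] = value;
  }
  return doc;
}
```

OR, NOT, LIKE, and RLIKE need more than this (choosing a branch, negating, or generating a string that matches a pattern) and are left out of the sketch.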
@patrykkopycinski
Contributor Author

Observation: EmulationRunner + withCommandGates duplicate ResponseActionsClient capabilities

While reviewing the shared infrastructure for the upcoming Endpoint Response Actions Skill (#17508), I noticed the emulation skill rebuilds several capabilities that BaseResponseActionsClient already provides on main. Consolidating would remove ~300 lines and keep the two paths from diverging.

What BaseResponseActionsClient already handles

| Capability | Location | What it does |
| --- | --- | --- |
| RBAC (per-command) | validateRequest() → isActionSupportedByAgentType() | Checks command is supported for agent type + action mode (manual/automated) |
| Enterprise license gate | validateRequest() → getLicenseService().isEnterprise() | Blocks automated actions without Enterprise license |
| Space-scoped agent validation | validateRequest() → fetchAgentPolicyInfo() | Validates agents are in the active space |
| Audit trail | writeActionRequestToEndpointIndex() | Writes to .logs-endpoint.actions-* with full attribution |
| Cases attachment | updateCases() | Attaches action to Security cases |
| Telemetry | sendActionSentTelemetry() / sendActionResponseTelemetry() | Reports ENDPOINT_RESPONSE_ACTION_SENT_EVENT |
| Action expiration | getActionRequestExpiration() | Sets TTL on action requests |
| Dispatch (all 12 commands) | EndpointActionsClient.isolate(), .execute(), etc. | Typed dispatch via Fleet actions API |

What the emulation skill re-implements

| Gate in withCommandGates / gate_checks.ts | Overlap with client |
| --- | --- |
| checkRbac(): maps command → RESPONSE_CONSOLE_ACTION_COMMANDS_TO_REQUIRED_AUTHZ → endpointAuthz[key] | validateRequest() already checks isActionSupportedByAgentType(agentType, command, actionType). The emulation RBAC gate adds a finer-grained check (per-console-command privilege), which the base client doesn't do, so this is a genuine addition |
| checkAuth(): resolves username via _security/_authenticate | The client constructor takes username as a required param, so the caller has already resolved it. The emulation skill re-resolves because of the Task Manager fakeRequest issue, but for interactive Agent Builder calls the request already carries the user |
| EmulationRunner.dispatch(): exhaustive switch over all 12 command types | EndpointActionsClient has the identical switch: .isolate(), .execute(), .killProcess(), etc. The runner is a pass-through wrapper |
| EmulationRunner.createResponseActionsClient() → getResponseActionsClient(agentType, opts) | One-liner factory call, same as what any consumer does |

What's genuinely emulation-specific (should stay)

These are NOT in the base client and rightfully belong in the emulation layer:

  • EmulationAllowlist — operator-controlled host allowlist (Advanced Settings)
  • EmulationRateLimiter — per-space (100/h) + per-host (3/h) sliding windows
  • EmulationIdempotencyCache — dedup retried LLM tool calls
  • checkRealExecutionFeatureFlags() — emulation-specific feature flag + runtime kill switch
  • checkValidation() — curated-only mode, allowedExecuteCommandPatterns regex allowlist, allowedScriptIds
  • buildEmulationComment() — audit attribution with conversationId/runId/toolCallId/SHA-256 prompt hash
  • EmulationRunner.resolveRuleBinding() — rule context lookup for emulation actions
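
For reference, a minimal sketch of what per-space and per-host sliding windows with acquire-with-rollback could look like. The class names, the in-memory storage, the one-space-slot-per-call accounting, and the explicit `now` parameter are all assumptions, not the `EmulationRateLimiter` implementation:

```typescript
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  tryAcquire(key: string, now: number): boolean {
    // Drop timestamps that have slid out of the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => now - t < this.windowMs);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(now);
    this.hits.set(key, recent);
    return allowed;
  }

  // Undo one acquisition made at `now` (used for rollback).
  release(key: string, now: number): void {
    const recent = this.hits.get(key);
    const idx = recent ? recent.lastIndexOf(now) : -1;
    if (recent && idx !== -1) recent.splice(idx, 1);
  }
}

const HOUR_MS = 60 * 60 * 1000;

class EmulationRateLimiterSketch {
  private readonly perSpace: SlidingWindowLimiter;
  private readonly perHost: SlidingWindowLimiter;

  constructor(spaceLimit = 100, hostLimit = 3) {
    this.perSpace = new SlidingWindowLimiter(spaceLimit, HOUR_MS);
    this.perHost = new SlidingWindowLimiter(hostLimit, HOUR_MS);
  }

  // All-or-nothing: if any host is over its limit, every slot taken so far is returned.
  acquire(spaceId: string, hostIds: readonly string[], now: number): boolean {
    if (!this.perSpace.tryAcquire(spaceId, now)) return false;
    const taken: string[] = [];
    for (const hostId of hostIds) {
      if (this.perHost.tryAcquire(hostId, now)) {
        taken.push(hostId);
        continue;
      }
      for (const acquired of taken) this.perHost.release(acquired, now);
      this.perSpace.release(spaceId, now);
      return false;
    }
    return true;
  }
}
```

The rollback is what keeps a rejected multi-host call from silently consuming budget for the hosts that did pass.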

Suggested simplification

The EmulationRunner class could be reduced to a thin wrapper that:

  1. Resolves the rule binding (emulation-specific)
  2. Builds the emulation comment with actor attribution (emulation-specific)
  3. Calls client.isolate() / client.execute() / etc. directly — no dispatch switch needed

The exhaustive dispatch() switch duplicates the typed interface that ResponseActionsClient already enforces. Today, a new command has to be added in both places; a single source of truth would be better.

// Before: EmulationRunner.dispatch() has 12-case switch
// After: direct client call
const client = getResponseActionsClient('endpoint', constructorOptions);
const actionDetails = await client[commandMethodMap[input.command]](request, options);
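
To spell out the `commandMethodMap` idea, here is a self-contained sketch with a three-command stand-in client. `ResponseActionsClientSketch`, `ActionRequest`, `ActionDetails`, and `dispatchCommand` are hypothetical; the real client exposes more methods and richer types:

```typescript
// Hypothetical subset of commands; the real set has 12.
type EmulationCommand = 'isolate' | 'execute' | 'kill-process';

interface ActionRequest {
  endpointIds: string[];
  comment?: string;
}

interface ActionDetails {
  id: string;
  command: EmulationCommand;
}

// Stand-in for the typed client interface.
interface ResponseActionsClientSketch {
  isolate(req: ActionRequest): Promise<ActionDetails>;
  execute(req: ActionRequest): Promise<ActionDetails>;
  killProcess(req: ActionRequest): Promise<ActionDetails>;
}

// Record<EmulationCommand, ...> makes a missing command a compile-time error,
// which is the property the exhaustive switch was providing.
const commandMethodMap: Record<EmulationCommand, keyof ResponseActionsClientSketch> = {
  isolate: 'isolate',
  execute: 'execute',
  'kill-process': 'killProcess',
};

function dispatchCommand(
  client: ResponseActionsClientSketch,
  command: EmulationCommand,
  req: ActionRequest
): Promise<ActionDetails> {
  // Member-call syntax keeps `this` bound to the client.
  return client[commandMethodMap[command]](req);
}
```

This only type-checks so simply because every method in the sketch shares one signature; per-command request types would need a mapped type or overloads.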

The withCommandGates pipeline is valuable — the emulation-specific gates (allowlist, rate limiter, idempotency, feature flags, validation) are genuine additions. But the RBAC check + auth check + dispatch could lean on the client directly rather than re-implementing them.

This isn't blocking — the current implementation works and is well-tested. But when the Response Actions Skill (#17508) ships, it will use ResponseActionsClient directly (no runner, no custom dispatch switch). Having two patterns for the same underlying dispatch in the same codebase will cause confusion about which one to use for future response-action surfaces.


Context: this came up while planning the shared infrastructure between the detection emulation skill and the upcoming endpoint response actions skill. The response actions skill will be ~50 lines of tool handler code because it delegates everything to ResponseActionsClient + getActionDetailsById() directly.
