Add find-security-rules skill for the agent builder #269089
nkhristinin wants to merge 21 commits
Skill review: findings + lessons captured
Thanks @nkhristinin. This skill surfaces several reusable patterns I haven't seen all together before, and a handful of issues that turn out to be reusable lessons too. Both have been formalized into
✅ Strong patterns now codified for re-use
These nine patterns are now
If the next person picking up a similar skill learns nothing else from this PR, they should pick up A, E, and H.
Findings: fix before merge
🔴 Blocker
When no security.rule attachment is present (e.g. the user selected a rule from find-rules output), the skill now directs the agent to call resolve_rule_attachment first, render the result, then proceed with the normal diagnostic branches. Also updates the skill description to reflect that the attachment is no longer a prerequisite for triggering the skill, and adds a TODO comment marking where SECURITY_FIND_RULES_TOOL_ID should be added to getRegistryTools once elastic#269089 lands.
- Drop 3 conditions (mitreTactic, indexPattern, riskScoreMax) from rule_filter.ts
- Remove verbose fields from summarizeRule (ruleTypeId, index, threat, interval, createdAt)
- Shorten skill and tool descriptions
- Rewrite eval expected outputs to be data-agnostic with rule details
- Add tool_sequence annotations to all eval examples
- Add pre-seed cleanup to fixtures preventing leftover contamination
- Add distractor suite (6 examples) and multi-turn conversation test
- Revert evaluate_dataset.ts to main (no shared file changes)

Eval results (Sonnet, 22 examples):
- Groundedness: 0.92
- Sequence Accuracy: 0.97
- ToolUsageOnly: 1.00
- Relevance: 0.54
- Factuality: 0.18
Rewrite 16 expected outputs to comprehensively cover LLM response claims: add rule types, explicit counts, tool usage patterns, and per-rule severity/risk/enabled details. Factuality 0.19 → 0.65.
… custom filter DSL

Replace AND/OR group filter language with flat params delegating to the existing convertRulesFilterToKQL(). Simplify discover_rule_tags to accept no parameters. Delete rule_filter.ts and consolidate buildToolFilter into find_rules_tool.ts. Use CreatedRule interface in eval fixtures.
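As a rough illustration of the "flat params" idea in this commit, the sketch below turns a few optional parameters into one KQL string, OR-ing values within a parameter and AND-ing the parameters together. It is not the PR's `buildToolFilter` and does not call `convertRulesFilterToKQL()`; the `FlatRuleParams` type, the `buildFlatFilter` helper, and the `alert.attributes.*` field paths are assumptions made for the example.

```typescript
// Hypothetical flat parameter shape. The parameter names mirror the PR
// summary; the KQL field paths used below are assumptions, not verified
// against the Kibana source.
interface FlatRuleParams {
  enabled?: boolean;
  severity?: string[];
  ruleType?: string[];
}

// Quote a value for KQL, escaping backslashes and double quotes.
const quoteValue = (value: string): string =>
  `"${value.replace(/\\/g, '\\\\').replace(/"/g, '\\"')}"`;

// Match any of the given values on a single field (OR within a parameter).
const anyOf = (field: string, values: string[]): string =>
  `${field}: (${values.map(quoteValue).join(' OR ')})`;

// Each provided parameter contributes one clause; clauses are ANDed together.
// Parameters that are absent or empty contribute nothing.
function buildFlatFilter(params: FlatRuleParams): string {
  const { enabled, severity = [], ruleType = [] } = params;
  const clauses: string[] = [];
  if (enabled !== undefined) {
    clauses.push(`alert.attributes.enabled: ${enabled}`);
  }
  if (severity.length > 0) {
    clauses.push(anyOf('alert.attributes.params.severity', severity));
  }
  if (ruleType.length > 0) {
    clauses.push(anyOf('alert.attributes.params.type', ruleType));
  }
  return clauses.join(' AND ');
}
```

The appeal of the flat shape is that the model only fills in independent optional fields, so there is no nested group syntax for it to get wrong.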
…port

- Rename skill id/name from 'find-rules' to 'find-security-rules' for disambiguation with non-detection rule types
- Handle tags with OR semantics instead of delegating to convertRuleTagsToKQL (which uses AND), matching the documented behavior
- Fix stale buildFullFilter export in index.ts (now buildToolFilter)
- Update eval expectedSkill references
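The AND-versus-OR distinction for tags can be sketched as two small helpers. This is only an illustration of the semantics described in the commit: the helper names are hypothetical, and the `alert.attributes.tags` field path is an assumption about how rule tags are addressed in KQL.

```typescript
// Assumed KQL field path for rule tags; check it against the real helper.
const TAGS_FIELD = 'alert.attributes.tags';

// Quote a tag for KQL, escaping backslashes and double quotes.
const quoteTag = (tag: string): string =>
  `"${tag.replace(/\\/g, '\\\\').replace(/"/g, '\\"')}"`;

// AND semantics: a rule matches only if it carries every listed tag.
function tagsAllOf(tags: string[]): string {
  return `${TAGS_FIELD}:(${tags.map(quoteTag).join(' AND ')})`;
}

// OR semantics: a rule matches if it carries at least one listed tag.
function tagsAnyOf(tags: string[]): string {
  return `${TAGS_FIELD}:(${tags.map(quoteTag).join(' OR ')})`;
}
```

With a tool description that promises "(OR)", a request for rules tagged `Windows` or `Linux` should use the second form; the first would return only rules carrying both tags.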
The skill id was renamed but the server-side allowlist still had the old name, preventing the skill from loading.
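A drift check for this kind of bug can be quite small. The sketch below is a hypothetical helper, not the allowlist code from the PR: it takes the ids of registered skills and the allowlist, and reports any id the allowlist does not contain.

```typescript
// Hypothetical drift check: returns registered skill ids that the
// allowlist does not know about. An empty result means no drift.
function findUnlistedSkills(
  registeredIds: readonly string[],
  allowlist: ReadonlySet<string>
): string[] {
  return registeredIds.filter((id) => !allowlist.has(id));
}

// Example: the skill was renamed, but the allowlist still has the old id.
const staleAllowlist = new Set(['find-rules']);
const unlisted = findUnlistedSkills(['find-security-rules'], staleAllowlist);
console.log(unlisted); // the renamed id is reported as missing
```

Running such a check in a unit test makes a rename fail fast in CI instead of silently preventing the skill from loading.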
…re cleanup

- Rename nameContains to searchTerm to reflect that convertRulesFilterToKQL searches name, index patterns, and MITRE fields
- Use EXPECTED_MAX_TAGS constant instead of hardcoded 500 for tag aggregation
- Add isAllowedBuiltinSkill test to prevent allowlist drift
- Fixture cleanup now scopes to fixture rule names instead of deleting all rules/alerts
- Remove tool_sequence from eval metadata, fix stale eval examples
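The scoped fixture cleanup mentioned above can be pictured as a selection step that runs before any delete call. The types and function here are hypothetical and stand in for whatever the eval fixtures actually use.

```typescript
// Minimal rule shape for the example; the real fixtures use richer types.
interface RuleSummary {
  id: string;
  name: string;
}

// Select only the rules whose names belong to the fixture set, so cleanup
// never touches rules that a user or another suite created.
function selectFixtureRuleIds(
  existingRules: readonly RuleSummary[],
  fixtureNames: ReadonlySet<string>
): string[] {
  return existingRules
    .filter((rule) => fixtureNames.has(rule.name))
    .map((rule) => rule.id);
}
```

Only the returned ids would then be passed to the delete step, which keeps the eval safe to run against a cluster that already holds unrelated rules.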
653676f to c22723f
@elasticmachine merge upstream
There are no new commits on the base branch.
@elasticmachine merge upstream
searchTerm: z
  .string()
  .optional()
  .describe('Rule types to include (OR). E.g. ["query", "eql"].'),
tags: z
  .array(z.string().min(1))
    'Exact tag values to include (OR). Discover values first via `security.discover_rule_tags`.'
  ),
excludeTags: z
  .array(z.string().min(1))
mitreTechnique: z
  .string()
mitreTactic: z
  .string()
ruleId: z
  .string()
@elasticmachine merge upstream |
This skill is read-only. Never suggest or offer to enable, disable, edit, delete, duplicate, or bulk-edit rules. Do not prompt the user to take any action on the rules returned. If the user asks to modify a rule, direct them to the Detection Rules UI.`,
getRegistryTools: () => [SECURITY_ALERTS_TOOL_ID],
getInlineTools: () => [
  createFindRulesInlineTool({ getStartServices, logger }),
A question, no change needed here: did you create these tools inline because you don't want them showing up in the registry?
@elasticmachine merge upstream |
💔 Build Failed
Failed CI Steps
Test Failures
Metrics [docs]
Unknown metric groups
ESLint disabled line counts
Total ESLint disabled count
History
Summary
Adds the find-security-rules skill to Security AI Assistant's Agent Builder integration. This skill enables natural-language rule discovery queries (listing, filtering, counting, sorting detection rules) via two inline tools:
- `security.find_rules`: lists, filters, sorts, and counts detection rules using flat parameters (`searchTerm`, `enabled`, `ruleSource`, `severity`, `ruleType`, `tags`, `excludeTags`, `mitreTechnique`, `ruleId`, `sortField`, `sortOrder`, `perPage`). Delegates to the existing `convertRulesFilterToKQL()` for base filtering.
- `security.discover_rule_tags`: discovers all available rule tag values (no parameters). Must be called before any tag-based filtering to avoid hallucinated tag names.

The skill also references the existing `security.alerts` registry tool for noisy-rules queries that correlate alert volume with rule metadata via `kibana.alert.rule.rule_id`.

Changes
`find_rules_skill.ts`, `find_rules_tool.ts`, `discover_rule_tags_tool.ts`

Eval scores (Sonnet 4.6, 22 examples)
Rule discovery (16 examples):
Distractor routing (6 examples) and multi-turn (1 example) also pass. Distractor Factuality is expectedly low (0.16) — those examples test routing away from the skill, so the expected outputs are vague intent statements that the strict claim-by-claim scorer penalizes.
Trace-based evaluators (Latency, Token counts, Skill Invoked) require a trace ES endpoint and are not reported here.
Test plan