[Nightshift] Add panels, flyouts, and detail components by smith · Pull Request #271281 · elastic/kibana

smith · 2026-05-26T14:13:28Z

Summary

Ports the presentation layer from the kbn-sigevents package (PR #264555 branch) into kbn-nightshift on main:

NightshiftOverview — main panel with critical/warning/healthy states, impacted cards, and metric widgets
Event detail flyouts — SignificantEventDetailBody, SignificantEventDetailHeader, LowerPriorityEvents, OtherPromotedEvents with expandable flyout views
Supporting UI — CriticalityDonut, DependencyChainMap (ReactFlow), InfoPanel, RootCausePanel, RecommendationsPlanPanel, StatusHeader/Banner, ImpactedCard, MetadataIconCard
Data hooks — useFetchLatestSignificantEvent, useFetchSystemOverview, useFlyoutFocusManagement
149 tests across 18 test suites, plus Storybook stories for all components

Agent builder chat features are excluded. The AiButton "Remediate" buttons remain as callback props (onRemediate) that consumers can wire to whatever backend they choose.
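
As a rough illustration of the callback-prop approach described above, the sketch below shows how a consumer might wire `onRemediate` to its own backend. The `RemediateTarget` and `RemediateProps` shapes and the `handleRemediateClick` helper are hypothetical and are not taken from the package.

```typescript
// Hypothetical shapes for illustration only; not the package's actual types.
interface RemediateTarget {
  eventId: string;
  title: string;
}

interface RemediateProps {
  // Optional: when omitted, the button has nothing to call.
  onRemediate?: (target: RemediateTarget) => void;
}

// Stand-in for the click handler a component would attach to its "Remediate" button.
function handleRemediateClick(props: RemediateProps, target: RemediateTarget): boolean {
  if (!props.onRemediate) {
    return false; // no consumer wired in
  }
  props.onRemediate(target);
  return true;
}

// Consumer side: wire the callback to any backend (here, an in-memory queue).
const queued: RemediateTarget[] = [];
const wired: RemediateProps = {
  onRemediate: (target) => {
    queued.push(target);
  },
};
```

Because the component only knows about the callback signature, swapping the in-memory queue for an HTTP call or an agent invocation would not require changes to the component itself.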

The SignificantEventDocument type (previously imported from @kbn/observability-agent-builder-plugin) is now defined locally in types/significant_event_document.ts.

Test plan

tsc --noEmit passes with no nightshift-specific errors
All 149 jest tests pass (jest --config kbn-nightshift/jest.config.js)
Pre-commit lint checks pass
Verify Storybook renders (yarn storybook kbn-nightshift)
Verify the nightshift page loads in the browser at /app/observability/nightshift

…hift Port the presentation layer from the kbn-sigevents package (PR elastic#264555) into kbn-nightshift on main. This includes the overview panel, event detail flyouts, all supporting UI components, data-fetching hooks, tests, and Storybook stories. Agent builder chat features are excluded — the AiButton remediate callbacks remain as wirable props without any agent builder dependency.

kibanamachine · 2026-05-26T14:31:17Z

💔 Build Failed

Buildkite Build
Commit: 47608b7
Storybooks Preview
Build duration: 26 mins

Failed CI Steps

Quick Checks

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id	before	after	diff
`observability`	1817	2051	+234

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id	before	after	diff
`observability`	2.1MB	2.3MB	⚠️ +279.3KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id	before	after	diff
`observability`	102.1KB	102.1KB	+18.0B

History

The View Details button on the main significant event card was not opening a flyout. Add internal flyout state management to NightshiftOverview (matching the pattern used by OtherPromotedEvents) and wire up the NightshiftPage to pass real data from useFetchLatestSignificantEvent.

Closes elastic#259251

## Summary

- Pins Console HTTP method completion ordering with explicit `sortText` values so Monaco does not fall back to alphabetical label sorting.
- Keeps the existing method set unchanged while ordering `GET` first and `DELETE` last.

## Root Cause

- Monaco uses `sortText` for completion ordering and falls back to the item label when `sortText` is missing, which can put `DELETE` before safer/default verbs.

## Fix

- Assign stable `sortText` values to method completion items based on the intended canonical order.
- Add a focused unit test that sorts method suggestions the same way and verifies `GET` is first and `DELETE` is last.

## Before

<img width="723" height="466" alt="image" src="https://github.com/user-attachments/assets/faf244b2-4207-483b-acbc-32b148441b18" />

## After

<img width="725" height="437" alt="image" src="https://github.com/user-attachments/assets/782c0c60-6052-4c28-80bc-f45403fa1383" />

## Test Plan

- `node scripts/jest --config=src/platform/plugins/shared/console/jest.config.js src/platform/plugins/shared/console/public/application/containers/editor/monaco_editor_actions_provider.test.ts`: passed.
- `node scripts/check_changes.ts`: passed.

## Release Note

- Fixes Console autocomplete so `GET` is shown before `DELETE` when suggesting HTTP methods on an empty request line.

Assisted with Cursor using GPT-5.5

Made with [Cursor](https://cursor.com)

Co-authored-by: Cursor <cursoragent@cursor.com>
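
A minimal sketch of the ordering idea in this commit, assuming a local `MethodSuggestion` interface as a stand-in for the relevant fields of a Monaco completion item. Only "`GET` first, `DELETE` last" comes from the commit message; the order of the methods in between and the helper names are assumptions made for illustration.

```typescript
// Minimal stand-in for the fields of a completion item used here.
interface MethodSuggestion {
  label: string;
  sortText?: string;
}

// Assumed canonical order: only "GET first, DELETE last" comes from the commit text.
const METHOD_ORDER = ['GET', 'POST', 'PUT', 'PATCH', 'HEAD', 'DELETE'];

// Attach a zero-padded rank so string comparison matches the intended order.
function withStableSortText(labels: string[]): MethodSuggestion[] {
  return labels.map((label) => {
    const index = METHOD_ORDER.indexOf(label);
    // Unknown labels sort after the known ones.
    const rank = index === -1 ? METHOD_ORDER.length : index;
    return { label, sortText: ('0' + rank).slice(-2) };
  });
}

// Mimics the rule described above: order by sortText, falling back to the label.
function sortLikeEditor(items: MethodSuggestion[]): string[] {
  const key = (item: MethodSuggestion): string => item.sortText ?? item.label;
  return [...items]
    .sort((a, b) => (key(a) < key(b) ? -1 : key(a) > key(b) ? 1 : 0))
    .map((item) => item.label);
}

const methodLabels = ['DELETE', 'GET', 'HEAD', 'PATCH', 'POST', 'PUT'];
// Without sortText the fallback is alphabetical, so DELETE comes first.
const unpinnedOrder = sortLikeEditor(methodLabels.map((label) => ({ label })));
// With sortText the canonical order wins.
const pinnedOrder = sortLikeEditor(withStableSortText(methodLabels));
```

Sorting the suggestions the same way the editor would, as the unit test in the commit is described as doing, lets the ordering be checked without rendering an editor.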

## Summary

Limits the flaky test runner to 50 runs per config. We should be mindful of the CI costs of flaky-test investigation, and 50 runs is more than enough to confirm a fix.

…tic#270775)

## Summary

Fixes the flaky `fullscreen.spec.ts` Scout test ("should interact with metrics in fullscreen mode").

**Two failure modes were identified:**

1. **Chrome header interception** (~65 failures, Apr 3–18): Already fixed by PR elastic#264932 which added `chromeHeader.waitFor({ state: 'hidden' })` to `toggleFullscreen()`. No recurrence since.
2. **`viewDetails` action not found** (2 failures, May 20): After `clearSearch()`, the grid re-renders with the full unfiltered metric set. The test immediately opened the context menu without waiting for the grid to settle; the Lens embeddable hadn't finished re-mounting and registering its actions.

**Fix:** Add `await expect(metricsExperience.pagination.container).toBeVisible()` after `clearSearch()` to wait for the grid to finish re-rendering before opening the context menu. This matches the pattern used in `grid.navigation.spec.ts`.

Closes elastic#261199

## Test plan

- [ ] Run `node scripts/scout run-tests --arch stateful --domain classic --testFiles src/platform/plugins/shared/discover/test/scout/ui/parallel_tests/metrics_experience/fullscreen.spec.ts` locally
- [ ] Verify CI passes on stateful and serverless targets

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Cursor <cursoragent@cursor.com>

* Introduces a new CodeQL rule to alert on API routes that use either `@kbn/config-schema` or `zod` in a way that allows for strings of unbounded length. This is a companion to our existing CodeQL rule that alerts on unbounded arrays.
* Introduces a [file exclusion pattern](https://github.com/legrego/kibana/blob/87f43b9bc3bc1a78b718cbb845861de32e53c3e7/.github/codeql/custom-queries/dos/KibanaDoSExclusions.qll) for both of our DoS CodeQL rules to allow us to more systematically omit files from alerting on these findings, if it is clear that their usage of these validation schemas is not for the purpose of API route validation.

---------

Co-authored-by: Elena Shostak <165678770+elena-shostak@users.noreply.github.com>

…r named OAS components (elastic#270983)

## Summary

Closes elastic#263711

Adds `meta: { id: '...' }` to every `schema.object()` call used as a request body or response body in the maintenance window plugin. This causes the OAS generator to emit named `$ref` components (e.g. `Kibana_HTTP_APIs_maintenance_window_response`) instead of inlining the full schema at every endpoint that uses it.

## What changed and why

### Schema files (10 files)

`meta: { id }` was added **only to body/response schemas**, never to path params or query params, which would throw a runtime error. All IDs use a `maintenance_window_` prefix to avoid collisions with existing named components from the alerting plugin (`r_rule_response`, `alerts_filter_query`, `schedule_request` already exist there).

> **Note on internal schemas:** `meta: { id }` was intentionally not added to internal route schemas (`/internal/maintenance_window/*`). The OAS capture script filters exclusively for `access: 'public'` routes, so internal schemas never appear in the generated `kibana.yaml` / `kibana.serverless.yaml`; adding IDs there would have no effect.

> **Note on diff size:** The diff looks larger than the actual change. To add a second argument `{ meta: { id } }` to `schema.object()`, the single-argument form `schema.object({...})` must be restructured into two arguments, which re-indents everything inside. The only real additions are the 13 `meta: { id }` lines, one per schema object.

**External API schemas** (`/api/maintenance_window`):

| File | ID added |
|------|----------|
| `external/request/create/schemas/v1.ts` | `new_maintenance_window` |
| `external/request/update/schemas/v1.ts` | `update_maintenance_window` |
| `external/request/find/schemas/v1.ts` | `find_maintenance_windows_response` |
| `external/response/schemas/v1.ts` | `maintenance_window_response` |
| Nested scope object (create/update/response) | `maintenance_window_scope` |
| `schedule/schema/v1.ts` (request) | `maintenance_window_schedule_request`, `maintenance_window_schedule_recurring_request` |
| `schedule/schema/v1.ts` (response) | `maintenance_window_schedule_response`, `maintenance_window_schedule_recurring_response` |

**Shared schemas** (used by both external and internal routes):

| File | ID added |
|------|----------|
| `r_rule/request/schemas/v1.ts` | `maintenance_window_r_rule_request` |
| `r_rule/response/schemas/v1.ts` | `maintenance_window_r_rule_response` |
| `alerts_filter_query/schemas/v1.ts` | `maintenance_window_alerts_filter_query` |

### OAS output files (2 files)

`oas_docs/output/kibana.yaml` and `oas_docs/output/kibana.serverless.yaml` were regenerated using the same capture command CI uses:

```sh
node scripts/capture_oas_snapshot.js \
  --include-path /api/status \
  --include-path /api/alerting/rule/ \
  --include-path /api/alerting/rules \
  --include-path /api/actions \
  --include-path /api/security/role \
  --include-path /api/spaces \
  --include-path /api/streams \
  --include-path /api/fleet \
  --include-path /api/saved_objects \
  --include-path /api/maintenance_window \
  --include-path /api/agent_builder \
  --include-path /api/workflows \
  --include-path /api/dashboards \
  --include-path /api/visualizations \
  --include-path /api/security/entity_store
cd oas_docs && make api-docs
```

The output now references named components like `$ref: '#/components/schemas/Kibana_HTTP_APIs_maintenance_window_response'` instead of inlining the full schema object at every endpoint.

## Checklist

- [x] `meta: { id }` added only to `schema.object()` body/response schemas (not path/query params)
- [x] All IDs prefixed with `maintenance_window_` to avoid collisions with alerting plugin components
- [x] OAS output files regenerated with `make api-docs`
- [x] New named components verified in `kibana.yaml` and `kibana.serverless.yaml`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
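
To illustrate why a schema ID changes the generated output, here is a toy model of the inline-versus-`$ref` behaviour described in this commit. It is not the real `@kbn/config-schema` API or the Kibana OAS generator: the `ObjectSchema` shape and the `emit` function are hypothetical, and the `Kibana_HTTP_APIs_` prefix that the real output uses is left out for brevity.

```typescript
// Toy JSON shape for the emitted document fragments.
type Json = { [key: string]: Json | string };

// Toy schema model (hypothetical): a property is a primitive type name or a nested object.
interface ObjectSchema {
  props: Record<string, string | ObjectSchema>;
  meta?: { id: string };
}

// Emits an inline object for anonymous schemas, or registers a named
// component once and returns a $ref to it when the schema carries an id.
function emit(schema: ObjectSchema, components: Record<string, Json>): Json {
  const properties: Json = {};
  for (const [name, value] of Object.entries(schema.props)) {
    properties[name] = typeof value === 'string' ? { type: value } : emit(value, components);
  }
  const body: Json = { type: 'object', properties };
  if (!schema.meta) {
    return body; // inlined at every use
  }
  components[schema.meta.id] = body; // registered once under its id
  return { $ref: `#/components/schemas/${schema.meta.id}` };
}

const scopeSchema: ObjectSchema = {
  props: { query: 'string' },
  meta: { id: 'maintenance_window_scope' },
};
const responseSchema: ObjectSchema = {
  props: { title: 'string', scope: scopeSchema },
  meta: { id: 'maintenance_window_response' },
};

const oasComponents: Record<string, Json> = {};
// Two endpoints using the same response schema both get the same short reference.
const createEndpointResponse = emit(responseSchema, oasComponents);
const getEndpointResponse = emit(responseSchema, oasComponents);
// A schema without an id is repeated inline wherever it is used.
const inlinedBody = emit({ props: { id: 'string' } }, oasComponents);
```

In this model every endpoint that shares a schema with an ID carries only a one-line reference, which is the size reduction the commit describes for the regenerated YAML files.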

@timestamp

…tic#270108)

## Summary

closes elastic/streams-program#1339

This PR converts the remaining DSL-based read paths in the streams plugin (`FeatureClient`, `QueryClient`, `InsightClient`, and the significant-events alerts reader) over to ES|QL via `storageClient.esql`. It is the last step of a broader effort to standardise local-index reads on ES|QL inside streams; earlier PRs migrated the other read paths in the same area.

### What changes

- **New `IStorageClient.esql` method** on `kbn-storage-adapter`. Calling it goes through the same read pipeline as `search` / `get`: mapping bootstrap (`ensureMappingsBeforeReading`), graceful empty results when the index doesn't exist yet, and optional `maybeMigrateSource` on the `_source` column. All ES|QL reads in this PR go through it.
- **`FeatureClient` and `QueryClient`**: every DSL `search` / `get` call used for listing, fetching, filtering, and keyword-search of knowledge indicators (features + queries) is replaced with an equivalent ES|QL query. Behaviour is preserved; the only externally observable change is that the queries are now expressible as ES|QL.
- **Significant-events sparkline reader (`readSignificantEventsFromAlertsIndices`)**: previously a `date_histogram` aggregation against `.alerts-streams.alerts-default`, now a single ES|QL `STATS COUNT(*) BY rule_uuid, BUCKET(@timestamp, ?)` with client-side gap-filling so empty buckets still appear as zeros in the sparkline. The legacy `change_points` field on the response is kept as an empty stub for backwards compat with the existing consumer schema; its removal is tracked separately.
- **Insight generation (`collectQueryData`)**: now issues two parallel ES|QL queries per rule (one for the total count, one for sample `_source` rows) instead of a DSL `search` with aggregations. Time bounds are passed as ISO-timestamp named params rather than relative `now-15m` expressions, which is more predictable when the request and execution clocks differ.
- **`InsightClient` reads (`get`, `list`, `bulk` validation)**: migrated to `storageClient.esql`. One small behaviour fix: `list().total` now reflects the actual number of returned insights. Previously it was always `0` in production because the DSL search used `track_total_hits: false`.
- **Shared helpers extracted**: `fillBucketGaps`, `parseBucketSize`, and `ESQL_UNITS` were duplicated between the alerts reader and `preview_significant_events.ts`. They now live in a new `sig_events/helpers/` module with unit tests.

### What is intentionally not changing

Four hybrid/semantic search methods (`findFeaturesBySemantic`, `findFeaturesByHybrid`, `findQueriesBySemantic`, `findQueriesByHybrid`) remain on DSL. They rely on vector-search clauses that don't have a clean ES|QL equivalent today, and are deferred to follow-up issues: [streams-program elastic#1338](elastic/streams-program#1338), [streams-program elastic#1340](elastic/streams-program#1340).

### Storage Adapter

I have added the `storageClient.esql` for ES|QL-based reads while keeping the storage adapter’s existing boundary: callers can define read semantics, but the adapter owns the backing storage index.

To avoid introducing a raw cross-index escape hatch, the new method validates the ES|QL query before execution. It parses the query, requires a `FROM` command, and ensures every `FROM` index source targets the adapter’s own storage index. Invalid or out-of-scope queries fail before mapping bootstrap or Elasticsearch execution.

---

*The PR was developed with Claude Code*

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
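
The client-side gap-filling mentioned for the sparkline reader can be sketched as follows. This is an illustrative version only, not the `fillBucketGaps` helper from the PR: the `Bucket` shape, the `fillGapsSketch` name, and the assumption that the range start is aligned to a bucket boundary are all hypothetical.

```typescript
// Hypothetical bucket shape for illustration.
interface Bucket {
  timestamp: number; // bucket start, epoch milliseconds
  count: number;
}

// Returns one bucket per interval in [from, to), inserting zero counts
// where the aggregation returned no row.
// Assumes `from` is aligned to a bucket boundary.
function fillGapsSketch(sparse: Bucket[], from: number, to: number, bucketSizeMs: number): Bucket[] {
  if (bucketSizeMs <= 0) {
    throw new Error('bucketSizeMs must be positive');
  }
  const counts = new Map<number, number>();
  for (const bucket of sparse) {
    counts.set(bucket.timestamp, bucket.count);
  }
  const filled: Bucket[] = [];
  for (let t = from; t < to; t += bucketSizeMs) {
    filled.push({ timestamp: t, count: counts.get(t) ?? 0 });
  }
  return filled;
}

// Example: one-minute buckets over four minutes, with rows for minutes 0 and 2 only.
const ONE_MINUTE = 60_000;
const sparseRows: Bucket[] = [
  { timestamp: 0, count: 3 },
  { timestamp: 2 * ONE_MINUTE, count: 1 },
];
const sparkline = fillGapsSketch(sparseRows, 0, 4 * ONE_MINUTE, ONE_MINUTE);
// sparkline counts: 3, 0, 1, 0
```

The step is needed because a `STATS ... BY BUCKET(...)` query only returns rows for buckets that contain documents, whereas a sparkline needs a value for every interval.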

…er in attacks discovery page (elastic#269352)

## Summary

[See screenshot here](https://elastic.slack.com/archives/C08U04SUN49/p1778687779903229)

- move the default popover position to be `upCenter` so taller action items can be seen

<img width="1617" height="713" alt="image" src="https://github.com/user-attachments/assets/edd4a77e-412b-43a0-9dd9-b93e4bdc2435" />

### Checklist

Check the PR satisfies following conditions. Reviewers should verify this PR satisfies this list as well.

- [ ] Review the [backport guidelines](https://docs.google.com/document/d/1VyN5k91e5OVumlc0Gb9RPa3h1ewuPE705nRtioPiTvY/edit?usp=sharing) and apply applicable `backport:*` labels.

## Summary

Adds a two-step agent skill workflow for end-to-end Security Solution bug fixing at `x-pack/solutions/security/plugins/security_solution/.agents/skills/`.

This PR delivers the skill logic only. The skills are designed for interactive CLI sessions (Claude Code or Cursor) where a human drives each step. A follow-up to automate the workflow via GitHub label triggers is described at the bottom.

---

## Current limitations

### Serverless not supported (follow-up PR)

This skill supports **stateful (ECH) environments only**. When the agent encounters a ticket for a serverless deployment it stops immediately and tells the user it cannot proceed: serverless reproduction requires a different server setup (different runner, different auth provider) that is not covered here. Serverless support is planned as a separate follow-up PR.

---

## Writing tickets the skill can act on

The skill extracts reproduction steps, affected paths, feature flags, and deployment type directly from the GitHub issue. A ticket missing or vague on any of these will cause the agent to stop and ask for clarification, or worse, silently reproduce the wrong state. The checklist below is what the agent validates before moving past Phase 0.

**Reproduction steps**: a specific navigation path with exact user actions, not a summary:

| ✅ Good | ❌ Not enough |
|---|---|
| "Go to **Rules → Create rule → Select Threshold → scroll to Suppress alerts**" | "Go to the threshold rule form" |
| "Click the rule row, switch to the **Exceptions** tab, add an exception with no conditions" | "Open a rule and add an exception" |

**Current behavior**: a concrete observable symptom (error text, missing element, wrong value, network failure):

| ✅ Good | ❌ Not enough |
|---|---|
| "The 'Optional' badge is absent next to 'Suppress alerts'" | "Something looks wrong with the rule form" |
| "`POST /api/detection_engine/rules` returns 400 with `[value.index_pattern]: expected value of type [string]`" | "Rule creation fails" |

**Expected behavior**: what the correct state looks like:

| ✅ Good | ❌ Not enough |
|---|---|
| "'Suppress alerts' should show 'Optional', as it does on Custom Query and EQL rule types" | "It should work correctly" |

**Feature flags**: if the bug only appears behind an experimental flag, list the flag name explicitly. Omitting this causes the agent to fail to reproduce or reproduce a different code path entirely:

```
xpack.securitySolution.enableExperimental: [assistantModelEvaluation]
feature_flags.overrides.some.flag: true
```

**Deployment type**: state explicitly whether the bug is on a stateful or serverless deployment. If omitted and the bug turns out to be serverless-only, the skill will discover this in Phase 0 and stop, but only after the Scout server has already started booting.

Tickets that include all five items above get a `high`-confidence analysis and proceed to browser reproduction without interruption. Tickets missing reproduction steps or current behavior get `low` confidence and the agent asks for clarification before booting the server.


---

## How to use

These skills live inside the Security Solution plugin directory, not at the repo root, so Claude Code does not auto-discover them. There are two ways to use them:

### Option A: Explicit invocation (no setup needed)

Ask your agent (Claude Code or Cursor) to read the skill file directly:

**Step 1, reproduce:**

> "Read and follow `x-pack/solutions/security/plugins/security_solution/.agents/skills/bug-reproduce/SKILL.md` for issue #NUMBER"

**Step 2, fix** (after reviewing the reproduction report):

> "Read and follow `x-pack/solutions/security/plugins/security_solution/.agents/skills/bug-fix/SKILL.md`"

### Option B: Symlinks for auto-discovery

**Claude Code**: symlink into `~/.claude/skills/` (personal skills directory). Run once:

```bash
mkdir -p ~/.claude/skills
ln -s /path/to/kibana/x-pack/solutions/security/plugins/security_solution/.agents/skills/bug-fixer ~/.claude/skills/bug-fixer
ln -s /path/to/kibana/x-pack/solutions/security/plugins/security_solution/.agents/skills/bug-reproduce ~/.claude/skills/bug-reproduce
ln -s /path/to/kibana/x-pack/solutions/security/plugins/security_solution/.agents/skills/bug-fix ~/.claude/skills/bug-fix
```

Replace `/path/to/kibana` with the absolute path to your local Kibana clone. Absolute paths are required here since the symlinks live outside the repo tree.

**Cursor**: symlink into `.agents/skills/` at the repo root. Run once from the Kibana repo root:

```bash
ln -s ../../x-pack/solutions/security/plugins/security_solution/.agents/skills/bug-fixer .agents/skills/bug-fixer
ln -s ../../x-pack/solutions/security/plugins/security_solution/.agents/skills/bug-reproduce .agents/skills/bug-reproduce
ln -s ../../x-pack/solutions/security/plugins/security_solution/.agents/skills/bug-fix .agents/skills/bug-fix
```

After adding symlinks, start a new session: skills are loaded at session start and won't appear in an already-running session.

Once set up, you can invoke them with:

> `/bug-reproduce #NUMBER`
> `/bug-fix`

> These symlinks are a local developer setup step; do not commit them. The skills themselves stay in the Security Solution plugin directory so they remain co-located with the code they operate on.

Full workflow overview and prerequisites: `bug-fixer/SKILL.md` at the same path.

---

## Human interaction points

The workflow has two mandatory human checkpoints. Everything else is fully automated.

| Checkpoint | When | What the human does |
|---|---|---|
| **1. Reproduction report** | End of Phase 3 | Read the report; reply to confirm the bug was reproduced correctly. The agent writes `user_acknowledged: yes` only after this reply. |
| **2. Fix plan approval** | Phase 4 Step 1 | Read the Root Cause Analysis and Fix Plan; explicitly approve before any code is written. |

Two further interactions are optional:

- **PR creation** (Phase 6): agent asks whether to open a draft PR; you say yes or skip
- **Review comments** (Phase 7): triggered by reviewer activity, not the agent

---

## Skills

| File | Purpose |
|---|---|
| `bug-fixer/SKILL.md` | Entry point: explains the two-step workflow and exact invocation commands |
| `bug-reproduce/SKILL.md` | Step 1: ticket analysis, Scout server, browser reproduction, diagnostics |
| `bug-fix/SKILL.md` | Step 2: fix plan approval, TDD fix, verification, optional draft PR |
| `bug-fixer/KNOWLEDGE.md` | Cross-session knowledge base, updated after each fixing session |
| `bug-fixer/references/classification-guide.md` | Bug patterns, test layer decision rules, fix strategies |
| `bug-fixer/references/fix-workflow.md` | Root cause analysis template, SELF-CHECK questions |
| `bug-fixer/references/baseline-failures.md` | 10 documented agent failures used to harden the skill rules |
| `bug-fixer/references/knowledge-update.md` | Protocol for adding entries to KNOWLEDGE.md |
| `bug-fixer/references/troubleshooting.md` | Known environment conditions (SAML redirect, AI Agent modal) |

---

## Design decisions and the challenges they solve

### Split into two separate skill invocations

The most important architectural decision in this PR.

**The problem:** In testing, the agent repeatedly skipped mandatory phases (browser reproduction, fix plan approval) when it identified what looked like an obvious fix from code analysis alone. We went through several rounds of increasingly strong language ("protocol violation", "certainty before reproduction is a red flag", explicit self-checks at every phase boundary) and the agent bypassed them every time. Textual instructions cannot reliably override an LLM's drive to reach the answer efficiently.

**The solution:** Reproduction (`bug-reproduce`) and fix (`bug-fix`) are now separate skill invocations. The fix agent starts cold: it only sees `analysis.json` and `reproduction-report.md` on disk. It has no memory of the analysis phase and cannot reason "I already know the fix from code reading." The mandatory stops are enforced by the conversation boundary, not by agent self-discipline.

We first tried a "two-turn" restructure of a single orchestrator (adding explicit "your turn ends here" instructions), which also failed. The split into separate skills is the structural solution.

### Scout server starts at Phase 0, not Phase 1

**The problem:** The Scout server takes 5+ minutes to boot. Starting it at Phase 1 (after ticket analysis) wasted the entire analysis time.

**The solution:** `bug-reproduce` kicks off `node scripts/scout.js start-server ... &` at the very beginning of Phase 0. All ticket analysis, subagent research, and code reading happen while the server boots. Phase 1 becomes a checkpoint: wait if no feature flags needed, stop and restart with `config_sets/bug_fixer/kibana.yml` if flags are required.

### Scout server instead of plain dev server

`bug-reproduce` uses `node scripts/scout.js start-server` (port 5620) rather than `node scripts/kibana --dev`. The Scout server sets up the `cloud-basic` auth provider required for `auth_provider_hint=cloud-basic` login; the plain dev server does not.

### Parallel subagents for research phases

Both `bug-reproduce` (Phase 0) and `bug-fix` (Phase 4 Step 1) dispatch multiple subagents in parallel rather than reading sources sequentially in the main session. This is used in two distinct situations:

**Phase 0, during server boot:** While the Scout server is warming up (5+ minutes), the main agent dispatches subagents in parallel to read each `similar_issue`, review each `related_pr` diff, run closed-issue searches, and study `affected_paths` source files. None of these tasks depend on each other, so they all run simultaneously. By the time the server is ready, the research is done.

**Phase 4 Step 1, root cause analysis:** Before presenting the fix plan, the agent dispatches subagents to review prior fix patterns, map the full impact scope, search codebase conventions, find all call sites, and locate existing tests. Again these are independent tasks that benefit from parallelism.

**Why subagents rather than sequential reads in the main session:**

- **Context window preservation**: PR diffs and source files are large. Reading them sequentially in the main session would fill the context window with raw content, crowding out the conversation history and skill instructions. Subagents read the content, synthesise it, and return only a summary.
- **Context isolation**: Each subagent starts with a clean slate. It cannot be biased by the main session's prior analysis or the agent's forming hypothesis about the root cause. This is especially important for the fix phase: a subagent reviewing a similar PR diff won't be anchored to the main agent's pre-existing suspicion.
- **Parallelism**: Independent research tasks complete in the time of the slowest one rather than the sum of all.

Subagents are not used for phases that require user interaction (Phase 3 reproduction report, Phase 4 plan approval, Phase 6 PR confirmation); those are interactive stops that must happen in the main session.

**Cursor limitation:** The `Agent` tool that spawns isolated parallel subagents is specific to Claude Code. Cursor has no equivalent. When the skill runs in Cursor, the agent falls back to reading those sources sequentially in the main session; the workflow still completes correctly, but without parallelism, without context isolation between research tasks, and with large file contents accumulating in the main context window rather than being summarised and discarded. Browser reproduction (Phase 3) and all fix phases work identically in Cursor since `cursor-ide-browser` is built in.

### Phase gates hardened against code-analysis shortcuts

Beyond the architectural split, several in-skill gates were added after testing revealed specific bypass patterns:

- **Sequential execution preamble** in `bug-reproduce`: "Phase 0 analysis tells you where to look. Phase 3 browser reproduction tells you what is actually broken. These are not the same thing."
- **Phase 2 hard gate**: server must return `available` at `localhost:5620` before any environment setup begins.
- **Phase 3 reframe**: "Have I opened a browser and followed the reproduction steps? If no, do that now before reading any further." Names the exact failure mode: source code reading is not reproduction.
- **Phase 4 pre-check** in `bug-fix`: verify `reproduction-report.md` has `status: reproduced` and `user_acknowledged: yes` before reading any source file for fixing purposes. "The more obvious the bug seems from code analysis, the more important this check is."
- **Pre-test self-check**: before creating any test file, verify an explicit approval message exists in the conversation. "No exceptions for bugs that seem obvious."

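
The Phase 4 pre-check in the list above is a rule the agent follows from the skill text, but the same condition could also be expressed in code, for example in a future automated runner. The sketch below assumes the report stores its fields as plain `key: value` lines, which the PR does not confirm; `readField` and `mayStartFix` are hypothetical helpers.

```typescript
// Assumes the report stores fields as `key: value` lines; the real format is not shown in the PR.
function readField(report: string, field: string): string | undefined {
  for (const line of report.split('\n')) {
    const separator = line.indexOf(':');
    if (separator === -1) {
      continue; // not a key/value line
    }
    if (line.slice(0, separator).trim() === field) {
      return line.slice(separator + 1).trim();
    }
  }
  return undefined;
}

// Fix work may begin only when the bug was reproduced AND the user acknowledged the report.
function mayStartFix(report: string): boolean {
  return (
    readField(report, 'status') === 'reproduced' &&
    readField(report, 'user_acknowledged') === 'yes'
  );
}

const acknowledgedReport = ['# Reproduction report', 'status: reproduced', 'user_acknowledged: yes'].join('\n');
const pendingReport = ['# Reproduction report', 'status: reproduced', 'user_acknowledged: no'].join('\n');
```

Here `mayStartFix(acknowledgedReport)` is `true`, while `mayStartFix(pendingReport)` and a report with no fields at all both yield `false`, so a missing field blocks the fix phase rather than allowing it.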
### `user_acknowledged` field protocol

The reproduction report includes a `user_acknowledged` field that must be `yes` before fix work begins. Testing revealed agents would self-write this field before the user replied. The skill now explicitly states: "This field must only be written after a real user reply, never pre-emptively. Writing it before the user responds is a protocol violation."

### Scout skill invocation: explicit syntax and Security Solution-specific reviewer

`bug-fix` Phase 4 Step 2 specifies the exact `Skill("name")` call syntax for both scout skills, with full file paths to disambiguate between the repo-root generic reviewer and the Security Solution-specific one:

1. `Skill("scout-create-scaffold")`: `.agents/skills/scout-create-scaffold/SKILL.md`
2. `Skill("security-scout-best-practices-reviewer")`: `x-pack/solutions/security/plugins/security_solution/.agents/skills/scout-best-practices-reviewer/SKILL.md`

The Security Solution reviewer internally runs the general `scout-best-practices-reviewer` first; agents do not invoke it separately.

### Skill improvement prompts: agents surface rule gaps after each session

Both `bug-fix` and `bug-reproduce` end with a `## Skill Improvement` section. After every session the agent checks for: new rationalizations not covered by the Red Flags table, ambiguous phase rules, missing fix strategies, test layer gaps, and undocumented environment conditions. If any are found, it prompts the user before editing any skill file. This mirrors the pattern from `security-scout-best-practices-reviewer`.

### Baseline failures and pressure scenario testing

The skill rules were validated using the TDD-for-documentation cycle from `superpowers:writing-skills`:

- **RED phase**: three pressure scenarios run *without* the skill; one failure found (plan approval: agent went into advisory mode instead of presenting a formal plan ending with "Do you approve this plan as written?")
- **GREEN phase**: same scenarios run *with* the skill; all three pass; the plan approval failure is fixed; agents cite specific Red Flag entries when refusing shortcuts
- **10 documented failures** in `baseline-failures.md` covering real agent behaviour observed across four fixing sessions

---

## Potential follow-up: GitHub label-triggered automation

The skills in this PR are designed for interactive sessions. A natural next step is a label-triggered workflow where adding `ai-reproduce` to a GitHub issue kicks off reproduction automatically, and `ai-fix` implements the fix, with the issue comments replacing the interactive checkpoints.

### What would need to be built

**New infrastructure:**

- A GitHub Actions workflow triggered by label events (`ai-reproduce`, `ai-fix`)
- A self-hosted runner with a full Kibana dev environment, Scout server, and Playwright available; standard GitHub-hosted runners cannot boot Kibana
- `claude` CLI invocation that passes the label event as context to the agent

**Changes to the current skills:**

The two mandatory human checkpoints would need to be replaced:

| Current checkpoint | Automated equivalent |
|---|---|
| Human reads reproduction report and replies | Agent posts report as issue comment; `ai-reproduce` label counts as acknowledgment |
| Human explicitly approves fix plan | Agent posts fix plan as issue comment; `ai-fix` label counts as approval |

This requires a small change to `bug-fix` Phase 4 Step 1 (making plan approval conditional on whether the session is interactive or label-triggered) and a corresponding change to how `user_acknowledged` is set in `bug-reproduce` Phase 3.

The core skill logic (Phases 0–7) stays unchanged. The label-triggered mode is additive, not a rewrite.

---

## Test plan

- [ ] Ask agent to read and follow `bug-reproduce/SKILL.md` for a known Security Solution issue; verify Scout server starts at Phase 0
- [ ] Verify agent presents reproduction report and stops without proceeding to the fix
- [ ] Ask agent to read and follow `bug-fix/SKILL.md`; verify it reads `analysis.json` + `reproduction-report.md` before doing anything else
- [ ] Verify fix plan is presented and agent waits for explicit approval before writing any code
- [ ] Run `bug-fix` without reproduction files; verify agent says "read and follow bug-reproduce first"
- [ ] Verify `user_acknowledged` is not written before a real user reply
- [ ] Verify agent invokes `security-scout-best-practices-reviewer` (not generic) for Scout tests

🤖 Generated with [Claude Code](https://claude.ai/claude-code)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

…1212) ## Summary - Adds a validated `collapse` query option to workflow execution listing. - Forwards collapse to Elasticsearch while preserving existing filters, sorting, and pagination. - Covers route, service, and search-helper plumbing with focused tests. ## References Closes elastic/security-team#17562 --------- Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>

… auth.authenticator references (elastic#270771) ## Summary Resolves elastic/ingest-dev#7714. When the OTLP Input integration is configured with bearer token authentication via Fleet, the Elastic Agent enters a `DEGRADED` state. The root cause is that Fleet's per-stream OTel config generator suffixes extension keys for cross-stream uniqueness (e.g. `bearertokenauth` → `bearertokenauth/<input-id>-<stream-id>`), but the matching *references* to those extensions were left as bare names. Two reference sites were broken: **1. `service.extensions` array** — spread verbatim from the stream config, causing: invalid configuration: service::extensions: references extension "bearertokenauth" which is not configured Fixed by adding `addSuffixToOtelcolServiceExtensions` and applying it when building the per-stream `service` block. **2. `auth.authenticator` inside component bodies** — the OTLP receiver's protocol blocks (e.g. `receivers.otlp.protocols.grpc.auth.authenticator: bearertokenauth`) still pointed to the bare name, causing: failed to resolve authenticator "bearertokenauth": authenticator not found Fixed by adding `rewriteOtelcolExtensionReferences`, a recursive walker that rewrites `auth: { authenticator }` values using a per-stream `originalToSuffixedExtensionIds` map. Only references matching extensions declared in the same stream are rewritten; external/pre-suffixed references like `beatsauth/<outputId>` are left untouched. ## Testing - Unit tests cover both fix paths and a negative case (external authenticator references are preserved). - `node scripts/jest .../otel_collector.test.ts` — 59 tests pass. 
You can test this manually by adding an OTLP package policy and verifying that the extension ID is rewritten correctly and the agent works as expected.

<img width="721" height="97" alt="Screenshot 2026-05-22 at 3 17 21 PM" src="https://github.com/user-attachments/assets/37c53b38-f8c6-4918-932e-6044beeed763" />

<img width="683" height="247" alt="Screenshot 2026-05-22 at 3 17 12 PM" src="https://github.com/user-attachments/assets/7dd1f9d7-89e5-42f5-b525-4a8c4e6ce7c4" />

<img width="730" height="155" alt="Screenshot 2026-05-22 at 3 16 59 PM" src="https://github.com/user-attachments/assets/4eccb99c-4bb1-4eed-8277-6d60a3ca3d26" />

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
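To make the two fix paths concrete, here is a minimal TypeScript sketch of the idea, not the Fleet implementation. The function names `suffixServiceExtensions` and `rewriteAuthenticatorRefs`, the `ReadonlyMap` shape of the ID map, and the sample IDs are all assumptions made for illustration; only the behavior (rewrite references declared in the same stream, leave external ones untouched) comes from the description above.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not Fleet's actual code.
type ExtensionIdMap = ReadonlyMap<string, string>;

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Returns the suffixed ID when the extension was declared in this stream;
// external or pre-suffixed references are returned unchanged.
function resolveExtensionId(id: string, idMap: ExtensionIdMap): string {
  return idMap.get(id) ?? id;
}

// Fix path 1: entries of the `service.extensions` array.
function suffixServiceExtensions(extensions: string[], idMap: ExtensionIdMap): string[] {
  return extensions.map((id) => resolveExtensionId(id, idMap));
}

// Fix path 2: recursively rewrite `auth: { authenticator }` values anywhere in a config tree.
function rewriteAuthenticatorRefs(node: unknown, idMap: ExtensionIdMap): unknown {
  if (Array.isArray(node)) {
    return node.map((item) => rewriteAuthenticatorRefs(item, idMap));
  }
  if (!isPlainObject(node)) {
    return node;
  }
  const result: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(node)) {
    const rewritten = rewriteAuthenticatorRefs(value, idMap);
    const authenticator = isPlainObject(rewritten) ? rewritten['authenticator'] : undefined;
    if (key === 'auth' && isPlainObject(rewritten) && typeof authenticator === 'string') {
      result[key] = { ...rewritten, authenticator: resolveExtensionId(authenticator, idMap) };
    } else {
      result[key] = rewritten;
    }
  }
  return result;
}
```

The walker returns a new tree rather than mutating its input, which keeps the original stream config available for comparison in tests.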

elastic#271144)

## Summary

Applies the same changes to each Discover FTR config to cut CI runtime:

- Add one `await esArchiver.loadIfNeeded('X')` in the index file's `before` hook.
- Delete the per-child `loadIfNeeded('X')` calls.
- Delete any `esArchiver.unload('X')` in the index file's `after` hook and in children.

Since the servers are stopped once an FTR config finishes, time spent unloading the data is wasted.

Some numbers:

- `loadIfNeeded` calls eliminated per CI run: 43
- `esArchiver.unload(...)` calls removed: **57**
- Total esArchiver ops eliminated per CI run: ~99

…c#270571) ## Problem When a watchlist is created with entity sources, or when a new entity source is added to an existing watchlist, the watchlist index stays empty until the scheduled background task runs (roughly every hour). This means users have to wait or manually trigger a sync. ## Solution This change adds a fire-and-forget sync call at the end of both create routes. The sync runs in the background after the HTTP response is returned so it does not affect response time. If the sync fails, the error is logged as a warning but the watchlist creation still succeeds. ## Manual Testing 1. Start Kibana with a valid Elasticsearch cluster 2. Call POST /api/entity_analytics/watchlists with a body that includes entitySources 3. Check server logs for the message "Background sync completed for watchlist" 4. Verify the watchlist index is populated without waiting for the scheduled task 5. Test the failure path by mocking syncWatchlist to throw and confirming the API still returns 200 Closes elastic/security-team#17406 --------- Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
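A minimal TypeScript sketch of the fire-and-forget pattern described above, under stated assumptions: the helper name `syncInBackground`, the `Logger` interface, and the exact log wording beyond "Background sync completed for watchlist" are hypothetical and not taken from the route code.

```typescript
// Hypothetical sketch of a fire-and-forget call; not the actual route handler.
interface Logger {
  info(message: string): void;
  warn(message: string): void;
}

function syncInBackground(
  watchlistId: string,
  syncWatchlist: (id: string) => Promise<void>,
  logger: Logger
): void {
  // Starting from Promise.resolve() means a synchronous throw inside
  // syncWatchlist also ends up in the catch handler below.
  // `void` marks the promise as intentionally not awaited.
  void Promise.resolve()
    .then(() => syncWatchlist(watchlistId))
    .then(() => logger.info(`Background sync completed for watchlist ${watchlistId}`))
    .catch((error: unknown) => {
      const message = error instanceof Error ? error.message : String(error);
      // A failed sync is logged as a warning; the caller's response is unaffected.
      logger.warn(`Background sync failed for watchlist ${watchlistId}: ${message}`);
    });
}
```

A route handler would call `syncInBackground(...)` after creating the watchlist and then return its response without awaiting the sync.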

…elastic#271237) Co-authored-by: Cursor <cursoragent@cursor.com>

Closes elastic/search-team#14414 part of elastic/search-team#14205 ## Summary Enables users to create skills in chat with the help of an agent. Uses the attachment UI, providing actions to users for previewing and saving skills. Allows agents to iterate on drafts and make user requested changes to produce multiple versions. <img width="825" height="655" alt="image" src="https://github.com/user-attachments/assets/1b1779fa-b627-4680-8422-bef22abfb81a" /> <img width="819" height="612" alt="image" src="https://github.com/user-attachments/assets/c6e88b46-c2a9-46e4-80be-8828dec44d32" /> ### Release note Adds ability to create skills directly in agent builder chat. ### Checklist Check the PR satisfies following conditions. Reviewers should verify this PR satisfies this list as well. - [ ] Any text added follows [EUI's writing guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses sentence case text and includes [i18n support](https://github.com/elastic/kibana/blob/main/src/platform/packages/shared/kbn-i18n/README.md) - [ ] [Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html) was added for features that require explanation or tutorials - [ ] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios - [ ] If a plugin configuration key changed, check if it needs to be allowlisted in the cloud and added to the [docker list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker) - [ ] This was checked for breaking HTTP API changes, and any breaking changes have been approved by the breaking-change committee. The `release_note:breaking` label should be applied in these situations. 
- [ ] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed - [ ] The PR description includes the appropriate Release Notes section, and the correct `release_note:*` label is applied per the [guidelines](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process) - [ ] Review the [backport guidelines](https://docs.google.com/document/d/1VyN5k91e5OVumlc0Gb9RPa3h1ewuPE705nRtioPiTvY/edit?usp=sharing) and apply applicable `backport:*` labels. ### Identify risks Does this PR introduce any risks? For example, consider risks like hard to test bugs, performance regression, potential of data loss. Describe the risk, its severity, and mitigation for each identified risk. Invite stakeholders and evaluate how to proceed before merging. - [ ] [See some risk examples](https://github.com/elastic/kibana/blob/main/RISK_MATRIX.mdx) - [ ] ... --------- Co-authored-by: Zachary Parikh <zachary.parikh@elastic.co> Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> Co-authored-by: Ryan Keairns <contactryank@gmail.com> Co-authored-by: pgayvallet <pierre.gayvallet@elastic.co>

…9822) ## Summary This PR migrates osquery to the V2 unified registry. No change to UI or any existing behavior. Added api integration tests for legacy and unified input. **How to test** Feature flag: `xpack.cases.attachments.enabled` Flag on: attachment created as `cases-attachments` SO Flag off: attachment created as `cases-comments` SO Osquery added during flag on is hidden when turning the feature flag off. https://github.com/user-attachments/assets/6e637173-a4fa-4a7a-b9f6-f16bb29c9675 ### Checklist - [x] Any text added follows [EUI's writing guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses sentence case text and includes [i18n support](https://github.com/elastic/kibana/blob/main/src/platform/packages/shared/kbn-i18n/README.md) - [ ] [Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html) was added for features that require explanation or tutorials - [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios - [ ] If a plugin configuration key changed, check if it needs to be allowlisted in the cloud and added to the [docker list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker) - [ ] This was checked for breaking HTTP API changes, and any breaking changes have been approved by the breaking-change committee. The `release_note:breaking` label should be applied in these situations. 
- [ ] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed - [x] The PR description includes the appropriate Release Notes section, and the correct `release_note:*` label is applied per the [guidelines](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process) - [x] Review the [backport guidelines](https://docs.google.com/document/d/1VyN5k91e5OVumlc0Gb9RPa3h1ewuPE705nRtioPiTvY/edit?usp=sharing) and apply applicable `backport:*` labels.

) ## Summary Negated indices (those with a leading `-`) are not part of the query, so we don't need to check them. Beyond that, including them breaks the Elasticsearch `_has_privileges` check.
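As a rough illustration, the filtering could look like the sketch below. The helper name `getIndicesToCheck` and the comma-separated input format are assumptions; the actual change may operate on a different structure.

```typescript
// Hypothetical helper: drops negated index patterns before a privileges check.
function getIndicesToCheck(indexPattern: string): string[] {
  return indexPattern
    .split(',')
    .map((name) => name.trim())
    // Negated patterns start with `-` and exclude indices from the query,
    // so there is nothing to check privileges against.
    .filter((name) => name.length > 0 && !name.startsWith('-'));
}
```

For example, `getIndicesToCheck('logs-*,-logs-internal,metrics-*')` keeps `logs-*` and `metrics-*` and drops `-logs-internal`.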

## Summary - Replace the OS checkbox group (with platform icons) on the pack query flyout and saved-query form with an `EuiComboBox` multiselect labelled "Operating systems". The combobox is required (no clear-all X, validation error if emptied) and new queries seed `DEFAULT_PLATFORM` so all three pills are pre-selected. - In the pack queries table: remove the "Query" column, rename "Platform" to "Operating systems", and render each OS as a hollow `EuiBadge` pill instead of an icon. <img width="1726" height="1327" alt="Screenshot 2026-05-19 at 3 51 25 PM" src="https://github.com/user-attachments/assets/b6871c61-ccdf-4ecd-97af-2a05d529b6c8" /> <img width="1726" height="1323" alt="Screenshot 2026-05-19 at 3 51 42 PM" src="https://github.com/user-attachments/assets/b53ce3c2-91a2-46ea-8814-e1104e7229ed" /> <img width="1724" height="1276" alt="Screenshot 2026-05-19 at 3 52 11 PM" src="https://github.com/user-attachments/assets/964687b6-216e-4cc3-8382-7bee2305f1e1" /> ## Test plan - [ ] Pack edit → Attach next query: all three OS pills pre-selected, no clear-X - [ ] Remove all pills + click Save: validation error blocks submit - [ ] Edit a query saved with empty platform: reopens with three pills - [ ] Saved query form: same multiselect behaviour - [ ] Pack edit queries table: no Query column; "Operating systems" header; pill badges Closes elastic/security-team#17022 --------- Co-authored-by: Tomasz Ciecierski <tomasz.ciecierski@elastic.co>

…endpoint navigation links with the correct capabilities (elastic#257966)

## Summary

If a user creates a role with "All" base privileges in the Kibana privileges section, we expect the user to have only limited access to the Endpoint Management section. Only global artifact management and endpoint exceptions should be accessible. Full access requires explicitly enabling the security sub-feature sections.

Although the Privilege Summary correctly shows that the user does not have access to pages like the Endpoint List, policies, and artifacts, these links were still visible to the user in the side navigation panel and under Security > Manage.

This PR fixes how the base `ALL` and `READ` privileges handle SIEMV5 (Endpoint Management) features and aligns the visibility of the links with the Privilege Summary. The API already restricted privileges correctly, so this adjustment only affects the UI. The `CUSTOMIZE` base privilege already works as expected.

- [x] Adds `excludeBasePrivileges` to the security sub-features definition
- [x] Only Endpoint exceptions and global artifact management UI features show when base privilege is `ALL`
- [x] Only Endpoint exceptions UI features show when base privilege is `READ`

### To Test:

- Create a role in Stack Management and only adjust the Kibana privileges section where the Space is `*All Spaces` and the Base Privilege (Define Privileges section) is All.
- Sign in as a user with that role and observe that no asset management links are visible.
- API Error toasts are expected ### Screenshots <img width="769" height="675" alt="image" src="https://github.com/user-attachments/assets/68097911-b9cd-4c40-bb4f-97b74b085fc7" /> Endpoint exceptions is set to READ when base privilege is READ <img width="498" height="786" alt="image" src="https://github.com/user-attachments/assets/2a1e06e2-97b2-416a-b6d7-dc837ef6b398" /> Side navigation only shows Artifacts > Endpoint exceptions and no other links <img width="1308" height="921" alt="image" src="https://github.com/user-attachments/assets/6bd4aaf9-88a1-484e-ad43-73896e652efa" /> --------- Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>

Update serverless to auto enable WL feature flag --------- Co-authored-by: Ying Mao <ying.mao@elastic.co>

…art-section errors (elastic#270371) Closes: elastic#265117 Adds observability for non-render errors raised by the Metrics Experience chart section (`useFetchMetricsData`, `useLensProps`, `useMetricSourceKind`) through **two complementary sinks**: - **APM (`@elastic/apm-rum`)** — primary sink. Every fetch/build/classification failure that lands at one of the three call sites is captured via `apm.captureError` with structured correlation labels, and the surrounding transaction is marked failed via a `chart-section-non-render-error` span (mirrors `lens/data_loader.ts`). - **`@kbn/logging`** — wired into the package via `ExternalServices.logger`. Currently surfaces APM transport failures (the inner-catch around `apm.captureError`) tagged `error_type=APMReportingFailure`. The plumbing (provider, `useReportChartSectionError` hook, `log_labels.ts` inventory, `logger_utils.ts` adapter) is in place so future code in this package can emit structured logs against the same vocabulary without re-introducing ad-hoc `console.error` calls. Together this gives operators server-side aggregation (APM) for the common error paths and a grep-able log signal for the rare cases where APM itself fails. ### How to test Add this inside `fetchSourceKind` in `src/platform/packages/shared/kbn-unified-chart-section-viewer/src/components/flyout/hooks/use_metric_source_kind.ts`: ```ts // `SMOKE_TEST` is widened to `boolean` so TS keeps the rest of the function // reachable (otherwise dead-code analysis loses narrowing on `item`). const SMOKE_TEST = true as boolean; if (SMOKE_TEST) { throw new Error(`APM smoke test: useMetricSourceKind fetch for "${name}"`); } ``` Then open Discover with a metrics data view and open a metric flyout — that mounts `useMetricSourceKind` and exercises the `reportChartSectionError` → `apm.captureError` path. 
**APM check** — service `kibana-frontend`, Errors tab, KQL `labels.chart_section_source : "useMetricSourceKind"`: <img width="1307" height="820" alt="image" src="https://github.com/user-attachments/assets/959f808d-78b8-4be4-ba81-7bb11b07334c" /> **Logger check** — the package logger only runs in the `catch (reportingError)` fallback of `reportChartSectionError`. To exercise it, force `apm.captureError` to throw from DevTools console: ```js const original = window.elasticApm.captureError; window.elasticApm.captureError = () => { throw new Error('forced APM failure'); }; // reproduce the action; then restore: window.elasticApm.captureError = original; ``` You should see an `ERROR` entry in the browser console with context `metrics-data-source-profile` and labels `{ error_type: 'APMReportingFailure', chart_section_source: 'useMetricSourceKind' }`: <img width="1490" height="214" alt="image" src="https://github.com/user-attachments/assets/1938921d-930f-4320-8483-37463ded065b" /> ### Checklist Check the PR satisfies following conditions. Reviewers should verify this PR satisfies this list as well. 
- [ ] Any text added follows [EUI's writing guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses sentence case text and includes [i18n support](https://github.com/elastic/kibana/blob/main/src/platform/packages/shared/kbn-i18n/README.md) - [ ] [Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html) was added for features that require explanation or tutorials - [ ] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios - [ ] If a plugin configuration key changed, check if it needs to be allowlisted in the cloud and added to the [docker list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker) - [ ] This was checked for breaking HTTP API changes, and any breaking changes have been approved by the breaking-change committee. The `release_note:breaking` label should be applied in these situations. - [ ] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed - [ ] The PR description includes the appropriate Release Notes section, and the correct `release_note:*` label is applied per the [guidelines](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process) - [ ] Review the [backport guidelines](https://docs.google.com/document/d/1VyN5k91e5OVumlc0Gb9RPa3h1ewuPE705nRtioPiTvY/edit?usp=sharing) and apply applicable `backport:*` labels. ### Identify risks Does this PR introduce any risks? For example, consider risks like hard to test bugs, performance regression, potential of data loss. Describe the risk, its severity, and mitigation for each identified risk. Invite stakeholders and evaluate how to proceed before merging. - [ ] [See some risk examples](https://github.com/elastic/kibana/blob/main/RISK_MATRIX.mdx) - [ ] ... 
--------- Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>

…ution public methods (elastic#269308)

## Summary

Refactor on top of elastic#268467, which introduced a `refresh` option on `ResolutionClient.linkEntities`/`unlinkEntities`.

**What changed:**

- Replaces ES-vocabulary `refresh` with domain-named `awaitVisibility` (default `false`) on the API
- Two-layer naming convention: the infra layer (`bulkUpdateEntityDocs`) keeps ES vocabulary with a corrected default of `false`; the domain layer exposes `awaitVisibility` and translates internally
- UI route handlers (`link`, `unlink`) pass `{ awaitVisibility: true }` to get read-your-writes semantics after a user-triggered operation
- Background maintainer drops the now-redundant explicit `{ refresh: false }`, since the new default covers it

**Why:** `refresh` leaks Elasticsearch vocabulary into the domain layer and requires callers to know what `'wait_for'` means. `awaitVisibility` is self-documenting and hides the translation detail.

### Checklist

- [x] Any text added follows [EUI's writing guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses sentence case text and includes [i18n support](https://github.com/elastic/kibana/blob/main/src/platform/packages/shared/kbn-i18n/README.md)
- [x] [Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html) was added for features that require explanation or tutorials
- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
- [x] If a plugin configuration key changed, check if it needs to be allowlisted in the cloud and added to the [docker list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker)
- [x] This was checked for breaking HTTP API changes, and any breaking changes have been approved by the breaking-change committee. The `release_note:breaking` label should be applied in these situations.
- [x] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed - [x] The PR description includes the appropriate Release Notes section, and the correct `release_note:*` label is applied per the [guidelines](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process) - [x] Review the [backport guidelines](https://docs.google.com/document/d/1VyN5k91e5OVumlc0Gb9RPa3h1ewuPE705nRtioPiTvY/edit?usp=sharing) and apply applicable `backport:*` labels. ### Identify risks Low risk — pure internal rename refactor within the `entity_store` plugin. - No public API surface changed (all call sites are plugin-internal) - No behavior change: same Elasticsearch semantics, same effective defaults - No deployment-mode divergence (stateful/serverless unaffected)
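The two-layer naming convention can be sketched as follows. This is an illustration only: the `toEsRefresh` helper and the `VisibilityOptions` type are hypothetical names, and the mapping of `awaitVisibility: true` to `'wait_for'` is an assumption inferred from the "Why" paragraph rather than confirmed from the code.

```typescript
// Hypothetical sketch of the domain-to-infra translation; names are illustrative.
type EsRefresh = boolean | 'wait_for';

interface VisibilityOptions {
  // Domain-level option: wait until the write is visible to searches.
  awaitVisibility?: boolean;
}

// Translates the domain option into Elasticsearch vocabulary.
// Assumption: `true` maps to 'wait_for'; the default stays `false`.
function toEsRefresh({ awaitVisibility = false }: VisibilityOptions = {}): EsRefresh {
  return awaitVisibility ? 'wait_for' : false;
}
```

A UI route handler would pass `{ awaitVisibility: true }`, while background callers can omit the option and rely on the default.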

## Summary Part of elastic/kibana-team#3344 Extracts the app header infrastructure from the Chrome Next integration work in [elastic#259318](elastic#259318) into a focused PR. This adds: - `@kbn/app-header` shared package with inline and Chrome-owned app header rendering APIs. - `chrome.next.appHeader.set()` plus internal state, lifecycle cleanup, mocks, and layout wiring. - Chrome-owned app header rendering in the Chrome Next project layout. - Focused hardening for content detection, registration cleanup, legacy badge fallback, and public type exports. - Package README and targeted unit coverage for the new app-header behavior. This intentionally does not migrate any apps yet and does not pull in unrelated Chrome Next slices such as side nav, user menu, feedback handlers, or broader help menu changes. ## Context The original integration branch includes app migrations and additional Chrome Next features. This PR extracts only the app-header foundation so it can be reviewed and merged independently before route-by-route adoption. Follow-up created: [elastic#271295](elastic#271295) to make the static “Add integrations” action access-aware. ## Risk Low to medium. The new APIs are behind Chrome Next behavior and currently have no app adopters in this PR, but the changes touch shared Chrome layout state. Risk is mitigated with focused unit coverage and existing Chrome validation checks. --------- Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>

…astic#271380) ## Summary Adds an event lifecycle endpoint and timeline UI so users can trace the full chain of detections → discoveries → verdicts → event versions when clicking a significant event. Also introduces search and filter controls on the events tab. ### Lifecycle - New `GET /internal/sig_events/events/{id}/lifecycle` endpoint that walks the event chain via `previous_event_id`, collects related discoveries and verdicts in parallel, and deduplicates detections - Flyout with event details, root cause, recommendations, evidences, and a chronological lifecycle timeline ### Filters & search - Added verdict, impact, and stream filter popovers to the events tab - Added debounced text search - Route accepts array-based query params for multi-select filters https://github.com/user-attachments/assets/2a11830b-f726-45b2-b110-10810ccf63cf --------- Co-authored-by: Cursor <cursoragent@cursor.com>
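The chain walk via `previous_event_id` can be sketched as below. This is a simplified, in-memory illustration: the real endpoint presumably queries Elasticsearch, and the `EventVersion` shape, the `walkEventChain` name, and the cycle guard are assumptions added for the example.

```typescript
// Hypothetical in-memory sketch; the actual endpoint's storage and types may differ.
interface EventVersion {
  id: string;
  previous_event_id?: string;
}

// Follows previous_event_id links from the given event back to the first version.
// Returns the versions oldest-first, suitable for a chronological timeline.
function walkEventChain(startId: string, eventsById: Map<string, EventVersion>): EventVersion[] {
  const chain: EventVersion[] = [];
  const seen = new Set<string>();
  let currentId: string | undefined = startId;
  while (currentId !== undefined && !seen.has(currentId)) {
    const event: EventVersion | undefined = eventsById.get(currentId);
    if (event === undefined) {
      break; // Dangling reference: stop at the last known version.
    }
    seen.add(currentId); // Guards against accidental cycles in the data.
    chain.push(event);
    currentId = event.previous_event_id;
  }
  return chain.reverse();
}
```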

## Summary Registers the three KI workflows (features identification, queries generation, onboarding) as managed workflows via the `workflows_extensions` plugin, and delegates memory generation to Task Manager. ### Managed workflow registration - Adds three YAML workflow definitions under `kbn-workflows/managed/definitions/streams_ki/` - Registers the `streams` plugin as a managed workflow owner during `setup()` - Installs all three workflows as global (`spaceId: '*'`) during `start()` with parallel installs - Workflow IDs use the reserved `system-` prefix: `system-streams-ki-features-identification`, `system-streams-ki-queries-generation`, `system-streams-ki-onboarding` ### Memory generation endpoint Changes `POST /internal/streams/{streamName}/memory/_generate` to delegate to Task Manager: - Returns `{ acknowledged: true }` immediately after scheduling a `streams_memory_generation` task - Uses the same singleton task pattern as the onboarding task, with persistence, retry, and abort handling provided by Task Manager - Eliminates request-scoped `inferenceClient` lifecycle concerns (the task runner uses `fakeRequest` with a persisted API key) ## Test plan - [x] Kibana starts and installs the three managed workflows without errors - [x] Managed workflows are accessible at `/app/workflows/system-streams-ki-onboarding` (and the other two IDs). They are not listed in the Workflows UI by default since the list filters out managed workflows - [x] Testing a managed workflow from the editor resolves child managed workflows at runtime https://github.com/user-attachments/assets/d54011a3-5014-445d-a38c-47a2fa9ea5bb --------- Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>

## Summary

`get_trace_change_points` uses a `fixed_interval` date histogram tied to APM's rollup interval (minimum 1m). ES's `change_point` aggregation requires at least 22 buckets, but a 15-minute window at 1m resolution only produces ~7–15 buckets, causing every result to come back as indeterminable (not enough buckets). This is consistently triggered by the investigation skill passing the screen context time range (often `now-15m`) to all tool calls.

### Fix

Enforce a 30-minute floor on the effective start time in the handler. If the requested window is shorter than 30 minutes, `start` is silently extended back to `end - 30m`. At 1m resolution, 30 minutes guarantees ≥22 buckets. Longer windows are unaffected.
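The floor can be expressed in a few lines. The sketch below assumes timestamps in epoch milliseconds and uses hypothetical names (`getEffectiveStart`, `countBuckets`); it is not the handler's actual code.

```typescript
// Hypothetical sketch; timestamps are epoch milliseconds.
const MIN_WINDOW_MS = 30 * 60 * 1000; // 30-minute floor
const BUCKET_INTERVAL_MS = 60 * 1000; // 1m histogram resolution

// Extends `start` back to `end - 30m` when the requested window is shorter;
// windows of 30 minutes or longer are returned unchanged.
function getEffectiveStart(start: number, end: number): number {
  return Math.min(start, end - MIN_WINDOW_MS);
}

// Number of whole histogram buckets covered by the window.
function countBuckets(start: number, end: number, intervalMs: number = BUCKET_INTERVAL_MS): number {
  return Math.floor((end - start) / intervalMs);
}
```

With a 15-minute request, the effective window becomes 30 minutes, which yields 30 one-minute buckets and clears the 22-bucket minimum.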

## Summary Contributes to elastic/docs-content#6591 by adding the 9.4.2 Kibana release notes. Preview - https://docs-v3-preview.elastic.dev/elastic/kibana/pull/270302/release-notes --------- Co-authored-by: Florent LB <florent.leborgne@elastic.co>

## Summary Contributes to elastic/docs-content#6591 by adding the 9.3.5 Kibana release notes. Preview - https://docs-v3-preview.elastic.dev/elastic/kibana/pull/270299/release-notes#kibana-9.3.5-release-notes --------- Co-authored-by: Florent LB <florent.leborgne@elastic.co>

…se notes (elastic#270301) ## Summary Contributes to elastic/docs-content-internal#1223. Updates known issue entry about how upgrading to 9.3.x fails when a rule action contains oversized content. The workaround details have been updated and resolution information has been added. Observability and Security known issue release notes being updated via elastic/docs-content#6645. Preview - https://docs-v3-preview.elastic.dev/elastic/kibana/pull/270301/release-notes/known-issues

…Builder:experimentalFeatures (elastic#270501) ## Summary - Adds an `experimental` flag to `UiSettingsParams` as a mutually exclusive alternative to `technicalPreview`. TypeScript enforces that a setting can carry at most one maturity badge. - Introduces a new `FieldTitleExperimentalBadge` component that renders "Experimental" (instead of "Technical preview") in the Advanced Settings UI, wired through `FieldDefinition` and `getFieldDefinition`. - Switches `agentBuilder:experimentalFeatures` from `technicalPreview: true` to `experimental: true` to align with updated Elastic terminology guidelines ([Slack thread](https://elastic.slack.com/archives/C0A2RUHDJCB/p1779223108141119)). ## Details The existing `technicalPreview` field on `UiSettingsParams` was a plain `interface` property. To enforce mutual exclusivity with the new `experimental` field, `UiSettingsParams` is now a discriminated union: one branch carries `technicalPreview` with `experimental?: never`, and the other carries `experimental` with `technicalPreview?: never`. TypeScript will error at compile time if both are set. The new `experimental_badge.tsx` lives alongside the existing `technical_preview_badge.tsx` in `kbn-management/settings/components/field_row/title/`. `title.tsx` renders whichever badge is applicable (at most one, by type constraint). ## Screenshots ### Current <img width="1910" height="184" alt="Screenshot 2026-05-19 at 3 34 58 PM" src="https://github.com/user-attachments/assets/745136b2-5b97-483e-93e4-bd1044346155" /> ### Updated <img width="941" height="130" alt="Screenshot 2026-05-21 at 2 47 42 PM" src="https://github.com/user-attachments/assets/d06423e5-3e91-4517-9231-fa83cde8b7e9" /> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
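The mutual-exclusivity constraint can be illustrated with a reduced type. This sketch is not the real `UiSettingsParams`: the `SettingParams` type, its `name`/`value` fields, the boolean-only flags, and the `getBadgeLabel` helper are simplifications assumed for the example.

```typescript
// Hypothetical, reduced version of the union described above.
interface BaseSettingParams {
  name: string;
  value: unknown;
}

type SettingParams = BaseSettingParams &
  (
    | { technicalPreview?: boolean; experimental?: never }
    | { experimental?: boolean; technicalPreview?: never }
  );

// Picks the single maturity badge to render, if any.
function getBadgeLabel(setting: SettingParams): string | undefined {
  if (setting.experimental) {
    return 'Experimental';
  }
  if (setting.technicalPreview) {
    return 'Technical preview';
  }
  return undefined;
}

const experimentalSetting: SettingParams = {
  name: 'agentBuilder:experimentalFeatures',
  value: false,
  experimental: true,
};

// Setting both flags is rejected at compile time, for example:
// const invalid: SettingParams = { name: 'x', value: 1, technicalPreview: true, experimental: true };
```

The `?: never` members are what make each branch reject the other flag, so no runtime validation is needed.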

…ing (elastic#270446)

**Epic:** elastic/security-team#12367 (internal)

**Resolves: elastic#262502**

## Summary

Instruments Security Solution's `DetectionRulesClient` (DRC) and some API routes directly with rule changes history functionality. It also streamlines the instrumentation of AF's `RulesClient` to facilitate the implementation.

## Details

Instrumentation boils down to passing the rule changes history context down the chain: Security Solution API endpoint -> `DRC` -> `RulesClient` -> `@kbn/change-history` package.

Two parameters are passed from the DRC:

- change tracking `action`: should reflect the domain-specific change action. Some methods have a clear action (at minimum for now), like `delete` or `bulkDelete`; for others, like `create` or `update`, the action depends on the upstream context. For example, Security Solution uses `RulesClient.create()` for prebuilt rules management, which introduces domain-specific actions such as prebuilt rule installation, upgrade, etc.
- `metadata`: metadata related to the rule change tracking action.
  - `metadata.bulkCount`: performance optimizations in the consumer code, like chunking, make it impossible to capture the real number of rules a bulk operation applies to. Consumer code may pass `bulkCount` when necessary. `bulkCount` is also supported by some non-bulk methods because they have no bulk counterparts. For non-bulk methods with bulk counterparts, `bulkCount` isn't exposed.
  - `metadata.originalRuleSoId`: the rule's Saved Object identifier, saved upon rule duplication.

### Changes

**Alerting plugin / `@kbn/alerting-types`**

- `RuleChangeTracking` made generic (`RuleChangeTracking<ChangeAction extends string = string>`) so consumers can restrict the `action` field to their own enum without wrapping the type.
- `create_rule` and `update_rule` wired to accept `changeTracking?: RuleChangeTracking` and log the action via `logRuleChanges`.
- `bulk_delete_rules` and `bulk_edit_rules` accept `changeTracking?: Omit<RuleChangeTracking, 'action'>` — the action is implicit for these operations; `bulkCount` is provided by the caller to track totals across processing chunks. **Security Solution common** - New `common/detection_engine/rule_management/rule_change_tracking.ts` introduces `SecurityRuleChangeTrackingAction` enum (`ruleInstall`, `ruleUpgrade`, `ruleDuplicate`, `ruleImport`, `ruleRevert`) and `SecurityRuleChangeTracking` type alias. **Detection Rules Client** - `IDetectionRulesClient` interface: all mutating methods accept optional `SecurityRuleChangeTracking`. - Each method passes `changeTracking` through to the underlying `RulesClient` call. Methods with a fixed semantic (`importRule` → `ruleImport`, `upgradePrebuiltRule` → `ruleUpgrade`, `revertPrebuiltRule` → `ruleRevert`) always inject the correct default action, allowing callers to supply `bulkCount` without overriding the action. **Security Solution API routes / handlers** - `PUT /api/detection_engine/rules/_import` — passes `changeTracking: { action: ruleImport }` to the DRC. - `PUT /internal/detection_engine/prebuilt_rules/installation/_perform` — passes `changeTracking: { action: ruleInstall, bulkCount }`. - `PUT /internal/detection_engine/prebuilt_rules/upgrade/_perform` — passes `changeTracking: { action: ruleUpgrade, bulkCount }`. - `PUT /internal/detection_engine/prebuilt_rules/revert` — passes `changeTracking: { action: ruleRevert }`. - `PUT /api/detection_engine/rules/prepackaged` (legacy) — passes `changeTracking: { action: ruleInstall, bulkCount }`. - Integration paths (endpoint security and promotion rule installation) pass `changeTracking: { action: ruleInstall, bulkCount }` programmatically. ## How to test This change is a no-op without explicit opt-in. To exercise the new code paths locally: 1. 
Set [FLAGS.FEATURE_ENABLED](https://github.com/elastic/kibana/blob/main/x-pack/platform/packages/shared/kbn-change-history/src/constants.ts#L31) to `true` in the **@kbn/change-history** package
2. Enable feature flags
```yaml
xpack.alerting.ruleChangeTracking.enabled: true
xpack.securitySolution.enableExperimental:
  - ruleChangesHistoryEnabled
```
3. Make changes
   3a. Install one or more prebuilt rules (`PUT /internal/detection_engine/prebuilt_rules/installation/_perform`). Open a freshly installed rule and verify the changes history shows an entry with action `rule_install`.
   3b. Upgrade one or more prebuilt rules (`PUT /internal/detection_engine/prebuilt_rules/upgrade/_perform`). Verify changes history shows `rule_upgrade`.
   3c. Revert a customized prebuilt rule (`PUT /internal/detection_engine/prebuilt_rules/revert`). Verify changes history shows `rule_revert`.
   3d. Import a rule ndjson file via **Manage Rules → Import** (`PUT /api/detection_engine/rules/_import`). Verify changes history shows `rule_import`.
4. Make a request to `GET /internal/detection_engine/rules/_history` to explore the change history for each rule you changed above
```bash
curl -H 'Content-Type: application/json' -H 'kbn-xsrf: kibana' -H "elastic-api-version: 1" -H "x-elastic-internal-origin: true" -u elastic:changeme 'http://localhost:5601/kbn/internal/detection_engine/rules/<rule_so_id>/history'
```
- Verify **FTR integration tests** added under `x-pack/solutions/security/test/security_solution_api_integration/test_suites/detections_response/rules_management/rule_management/trial_license_complete_tier/change_tracking.ts` pass.

### Identify risks

- Low risk: all `changeTracking` parameters are optional and additive. Existing behavior is fully preserved when the parameter is omitted.
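
For readers unfamiliar with the generic `RuleChangeTracking` type described in this commit, here is a minimal TypeScript sketch of how the pieces might fit together. It is not the Kibana source: the interface layout (nesting `bulkCount` under `metadata`), the enum string values, and the `withFixedAction` helper are assumptions made for illustration only.

```typescript
// Hypothetical shapes inferred from the commit description; the real types may differ.
interface RuleChangeTrackingMetadata {
  bulkCount?: number; // total rules touched across processing chunks
  originalRuleSoId?: string; // Saved Object id kept when a rule is duplicated
}

interface RuleChangeTracking<ChangeAction extends string = string> {
  action: ChangeAction;
  metadata?: RuleChangeTrackingMetadata;
}

// Member names come from the description; the string values are assumed.
enum SecurityRuleChangeTrackingAction {
  ruleInstall = 'rule_install',
  ruleUpgrade = 'rule_upgrade',
  ruleDuplicate = 'rule_duplicate',
  ruleImport = 'rule_import',
  ruleRevert = 'rule_revert',
}

// Consumers restrict `action` to their own enum without wrapping the type.
type SecurityRuleChangeTracking = RuleChangeTracking<SecurityRuleChangeTrackingAction>;

// Hypothetical helper: a method with fixed semantics always injects its own
// action, while callers may still supply metadata such as `bulkCount`.
function withFixedAction(
  action: SecurityRuleChangeTrackingAction,
  changeTracking?: Omit<SecurityRuleChangeTracking, 'action'>
): SecurityRuleChangeTracking {
  return { ...changeTracking, action };
}

const tracking = withFixedAction(SecurityRuleChangeTrackingAction.ruleImport, {
  metadata: { bulkCount: 25 },
});
console.log(tracking.action, tracking.metadata?.bulkCount);
```

Accepting `Omit<..., 'action'>` in the helper mirrors the description of bulk operations, where the action is implicit and only the remaining fields are caller-controlled.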

…c#270426) ## Summary Closes elastic/search-team#14522 Adds per-provider EARS feature flagging via a two-tier system: - **Stable providers** (Microsoft, Slack): enabled whenever `xpack.actions.auth.ears.enabled: true` - **Experimental providers** (Google): only enabled when *both* `ears.enabled: true` **and** `ears.enableExperimental: true` This allows us to ship EARS for verified OAuth providers while keeping unverified ones (Google, pending app verification) available only for internal dogfooding. ### How it works - Each connector spec's EARS auth type entry can declare `experimental: true` (Google Calendar, Gmail, Google Drive do this) - A new `xpack.actions.auth.ears.enableExperimental` boolean config controls whether experimental EARS providers are available - The filtering happens at schema generation time (`generateSecretsSchemaFromSpec`), so both the UI and API are gated - Existing EARS connectors for experimental providers show as disabled in the connectors table when `enableExperimental` is off ### Promotion flow When Google's OAuth app verification completes: 1. Remove `experimental: true` from the 3 Google specs (one-line diff each) 2. 
**No deployment config changes needed** — Google EARS "just works" for everyone ### Config ```yaml # kibana.yml xpack.actions.auth.ears: enabled: true # global EARS gate (existing) enableExperimental: true # opt-in for unverified providers (new) ``` ### Changes | Area | What | |------|------| | `kbn-connector-specs` | `AuthTypeDef.experimental` flag, filtering in `generateSecretsSchemaFromSpec`, `isEarsExperimentalConnector` helper | | `actions` plugin | `ears.enableExperimental` config, `isEarsExperimentalEnabled()` utility, exposed to browser | | `stack_connectors` | Thread `isEarsExperimentalEnabled` through client-side schema generation | | `agent_builder` | Per-provider disabled check in connectors table | | `triggers_actions_ui` | Per-provider disabled check in connectors list | | Google specs | `experimental: true` on EARS auth type (google_calendar, gmail, google_drive) | ## Test plan - [ ] With `ears.enabled: true` and no `enableExperimental`: Microsoft/Slack connectors show EARS option, Google connectors do not - [ ] With `ears.enabled: true` and `enableExperimental: true`: all connectors show EARS option - [ ] With `ears.enabled: false`: no connectors show EARS option regardless of `enableExperimental` - [ ] Creating a Google EARS connector via API fails when `enableExperimental` is off - [ ] Previously created Google EARS connectors show as disabled when `enableExperimental` is turned off - [ ] Unit tests pass: `node scripts/jest src/platform/packages/shared/kbn-connector-specs/` - [ ] Unit tests pass: `node scripts/jest x-pack/platform/plugins/shared/actions/server/` 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
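
The two-tier gating described above can be sketched in TypeScript as follows. This is an illustrative model only: `AuthTypeDef`, `EarsConfig`, `isAuthTypeAvailable`, and `filterAuthTypes` are simplified, hypothetical stand-ins, and the real filtering in `generateSecretsSchemaFromSpec` may be structured differently.

```typescript
// Simplified, hypothetical model of a connector spec's auth type entry.
interface AuthTypeDef {
  type: string; // e.g. 'ears', 'basic' (values assumed for illustration)
  experimental?: boolean;
}

// Mirrors the two config keys: ears.enabled and ears.enableExperimental.
interface EarsConfig {
  enabled: boolean;
  enableExperimental: boolean;
}

function isAuthTypeAvailable(authType: AuthTypeDef, ears: EarsConfig): boolean {
  // Non-EARS auth types are not affected by the EARS flags.
  if (authType.type !== 'ears') {
    return true;
  }
  // The global gate switches off every EARS provider.
  if (!ears.enabled) {
    return false;
  }
  // Experimental providers additionally need the opt-in flag.
  return authType.experimental === true ? ears.enableExperimental : true;
}

function filterAuthTypes(authTypes: AuthTypeDef[], ears: EarsConfig): AuthTypeDef[] {
  return authTypes.filter((authType) => isAuthTypeAvailable(authType, ears));
}

const googleSpec: AuthTypeDef[] = [{ type: 'basic' }, { type: 'ears', experimental: true }];
const slackSpec: AuthTypeDef[] = [{ type: 'ears' }];
const stableOnly: EarsConfig = { enabled: true, enableExperimental: false };

console.log(filterAuthTypes(googleSpec, stableOnly).length); // experimental EARS entry is dropped
console.log(filterAuthTypes(slackSpec, stableOnly).length); // stable EARS entry is kept
```

Under this model, promoting a provider amounts to deleting `experimental: true` from its spec entry, which matches the promotion flow described in the commit.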

) ## Summary > [!NOTE] > This only contains the server side changes for the entity behavior feature. UI changes to come in subsequent PRs. Introduces a new entity maintainer (`ml-anomaly-detection-jobs`) that maintains the `entity.behaviors.anomaly_job_ids` field for an entity in the entity store. This maintainer runs every 24 hours and looks back 90 days in order to capture all of the anomalous behavior for an entity in the last 90 days. During each run, the maintainer: - Iterates over user and host entities from the entity store in batches - For each batch, fetches anomaly records from security ML jobs for the last 90 days and above the configured threshold minimum - If anomaly records exist for an entity, its entity store entry is updated to include the anomaly job ID. - If anomaly records exist for an entity, additional supporting details will be queried and stored in a details datastream `.entity_analytics.ml-ad-jobs-latest-${namespace}` The additional details that are fetched are job dependent: For jobs that use the `rare` function (for example, rare country login), only the anomalous value is stored in the anomaly record (for example, `Iran`). In order to determine the baseline behavior for the entity, we use the ML job configuration to aggregate against the source index (for example, an aggregation to determine where an entity commonly logs in) For other job types that are metric or count functions (for example, high number of failed logins), the record document contains the typical value and the anomalous value so we already have the baseline behavior. For all job types, we grab the latest 3 anomalous documents. This is to support the "Raw Evidence" portion of the expanded section in the initial UI mockups. Note that the exact format of these documents may change as we finalize the mockups but since this feature is behind a feature flag, it should be ok to merge and finalize later. 
<img width="463" height="374" alt="Screenshot 2026-05-18 at 4 12 06 PM" src="https://github.com/user-attachments/assets/19af8c51-650d-40b2-8b9e-548daee8ac5e" /> ## To Verify 1. Modify the default lookback period of the entity store logs extraction task (because we're populating historical data) ``` --- a/x-pack/solutions/security/plugins/entity_store/server/domain/saved_objects/global_state/constants.ts +++ b/x-pack/solutions/security/plugins/entity_store/server/domain/saved_objects/global_state/constants.ts @@ -10,14 +10,14 @@ import { z } from '@kbn/zod/v4'; export const DEFAULT_HISTORY_SNAPSHOT_FREQUENCY = '24h'; export const LOG_EXTRACTION_DELAY_DEFAULT = '1m'; -export const LOG_EXTRACTION_LOOKBACK_PERIOD_DEFAULT = '3h'; +export const LOG_EXTRACTION_LOOKBACK_PERIOD_DEFAULT = '30d'; export const LOG_EXTRACTION_FREQUENCY_DEFAULT = '1m'; // Max amount of entities to extract in one ESQL query export const LOG_EXTRACTION_DOCS_LIMIT_DEFAULT = 10000; // Max raw log documents per logs to be processed in a query (inside elastic search) export const LOG_EXTRACTION_MAX_LOGS_PER_PAGE_DEFAULT = 40000; export const LOG_EXTRACTION_TIMEOUT_DEFAULT = '59s'; -export const LOG_EXTRACTION_MAX_TIME_WINDOW_SIZE_DEFAULT = '15m'; +export const LOG_EXTRACTION_MAX_TIME_WINDOW_SIZE_DEFAULT = '1d'; // Max total raw log documents to process per task run; 0 = no cap ``` 2. Start ES and Kibana with the following feature flags: ``` uiSettings.overrides: securitySolution:entityStoreEnableV2: true xpack.securitySolution.enableExperimental: - entityAnalyticsEntityStoreV2 - entityAnalyticsWatchlistEnabled - entityAnalyticsNewHomePageEnabled - leadGenerationEnabled - entityAnalyticsMlJobBehaviorMaintainer ---->> !!! NEW FEATURE FLAG FOR THIS PR !!! ``` 3. Use this script to populate some data: https://gist.github.com/ymao1/d35d356f090e23c746055446cc21fba0. 
NOTE!!: You may need to modify the Kibana URL if you're using a different base path or SSL You will need to also download these scripts that are referenced by the above script. - Rare region data: https://gist.github.com/ymao1/3f8d1214928b5c27aa505a20b7f2425d - High login count: https://gist.github.com/ymao1/fbbdbcf7552455fd155ee52ffcddf67a 4. Verify the maintainer is started in Dev Tools ``` GET kbn:/internal/security/entity_store/entity_maintainers?apiVersion=2 ``` Response should include the new `ml-anomaly-detection-jobs` maintainer and the status should be `started` ``` { "maintainers": [ { "id": "ml-anomaly-detection-jobs", "taskStatus": "started", "interval": "1d", "description": "Entity Analytics ML Anomaly Detection Maintainer", "nextRunAt": "2026-05-19T12:30:27.957Z", "minLicense": "platinum", "customState": {}, "runs": 1, "lastSuccessTimestamp": "2026-05-18T12:30:30.117Z", "lastErrorTimestamp": null }, ] } ``` 5. Manually run the maintainer ``` POST kbn:/internal/security/entity_store/entity_maintainers/run/ml-anomaly-detection-jobs?apiVersion=2 ``` You should see this info log when the maintainer is done: ``` [2026-05-19T17:45:00.929-04:00][INFO ][plugins.securitySolution.ml-anomaly-detection-jobs-default] Maintainer run completed in 2570ms ``` 6. After the maintainer runs, you should see some entities populated with behavior data ``` GET .entities.v2.latest.security_default-00001/_search { "query": { "bool": { "filter": [ { "exists": { "field": "entity.behaviors.anomaly_job_ids" } } ] } } } ``` and you should see entries in the details index ``` GET .entity_analytics.ml-ad-jobs-latest-default/_search ``` --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

…stic#271250)

## Summary

Applies the same changes to each ML FTR config to cut CI runtime:

- Add one `await esArchiver.loadIfNeeded('X')` in the index file's `before` hook.
- Delete the per-child `loadIfNeeded('X')` calls.
- Delete any `esArchiver.unload('X')` in the index `after` hook and in children. Since the servers are stopped once an FTR config finishes, unloading the data only wastes time.

Some numbers:

- `loadIfNeeded` calls eliminated per CI run: 46
- `esArchiver.unload(...)` calls removed: **27**
- Total esArchiver ops eliminated per CI run: ~73

Since each FTR config gets its own fresh ES+Kibana instance, none of the top-level `afterAll` hooks have any effect on subsequent configs. This PR saves time by removing them and the related calls:

- `ml.securityUI.logout()`: the browser session is killed with the server anyway
- `ml.securityCommon.cleanMlUsers/Roles()`: ES security objects are destroyed with the server
- `ml.testResources.resetKibanaTimeZone()`: the Kibana instance is destroyed with the server
- `esNode.unload()` (anomaly_detection_jobs group1–4): also redundant (same reasoning as our earlier work)

…c#271425) ## Summary Fix typo in the smoke-tests evaluation suite path. Details: - Smoke tests were added as part of elastic#271249 - Due to the last minute refactor (renaming suite directory), this change slipped through the cracks; auto-merge on CI success happened because the suite never ran. Root cause: The pipeline's `readEvalsSuiteMetadata()` function silently filters out any suite whose config file doesn't exist in the git tree. So despite the `evals:smoke-tests` label being present and the defaultModelGroups being correctly configured, the suite was being dropped before label matching ever happened.

## Summary After elastic#260835, the `i18n.locale` setting became deprecated and replaced with `i18n.defaultLocale`. This PR updates the `kibana.yml` to reference the current setting name. ### Checklist Check the PR satisfies following conditions. Reviewers should verify this PR satisfies this list as well. - [X] Any text added follows [EUI's writing guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses sentence case text and includes [i18n support](https://github.com/elastic/kibana/blob/main/src/platform/packages/shared/kbn-i18n/README.md) - ~~[ ] [Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html) was added for features that require explanation or tutorials~~ - ~~[ ] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios~~ - [X] If a plugin configuration key changed, check if it needs to be allowlisted in the cloud and added to the [docker list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker) - ~~[ ] This was checked for breaking HTTP API changes, and any breaking changes have been approved by the breaking-change committee. The `release_note:breaking` label should be applied in these situations.~~ - ~~[ ] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed~~ - [x] The PR description includes the appropriate Release Notes section, and the correct `release_note:*` label is applied per the [guidelines](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process) - ~~[ ] Review the [backport guidelines](https://docs.google.com/document/d/1VyN5k91e5OVumlc0Gb9RPa3h1ewuPE705nRtioPiTvY/edit?usp=sharing) and apply applicable `backport:*` labels.~~ --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

) ## What & Why Improves the YAML template editor experience across three areas: 1. **Visual diff highlighting** — Adds gutter decorations so users can see which lines changed since the last save (similar to the Workflows editor). Fixes the "unsaved changes" badge incorrectly appearing after saving and re-opening a template, caused by stale local storage drafts not being cleared on save and a race condition where `form.reset` overwrote draft values. 2. **Better schema validation errors** — Converts the field-level `oneOf`/`anyOf` union in the generated JSON Schema into `if`/`then` chains keyed on the `control` property. This makes Monaco YAML produce contextual errors (e.g., "type must be 'long' | 'integer' | ...") instead of the confusing "control must be INPUT_TEXT | SELECT_BASIC | ...". Also fixes `addDiscriminatorEnumHints` to correctly extract values from unions of literals (not just `const`), so all valid type options for `INPUT_NUMBER` fields appear in autocomplete. 3. **Server-side definition validation on save** — The POST/PUT/PATCH template routes previously only checked that the YAML was syntactically parseable (`yaml.load()`), allowing semantically invalid templates (e.g., `type: keyword` on an `INPUT_NUMBER` field) to be persisted. Now validates the parsed YAML against `ParsedTemplateDefinitionSchema` before saving and returns a `400 Bad Request` with specific Zod validation issues if the definition is invalid. Additionally, improves autocomplete tooltip labels for the `fields` property to show descriptive field type names (e.g., "Text Input", "Select") instead of generic "object". ## How to Test **Diff highlighting & unsaved changes badge:** 1. Start Kibana, go to Cases > Templates, open an existing template for editing. 2. Change a line (e.g., the `name` field) — verify a yellow gutter marker appears on the changed line and the "Unsaved changes" badge shows in the header. 3. 
Navigate away (back to templates list), then navigate back — verify the draft is preserved and diff/badge still show. 4. Click Save — verify you're redirected to the list. Re-open the same template — verify no badge and no gutter markers. 5. Go to Create Template, make edits, save — verify re-opening Create Template starts fresh with the example definition (no badge). **Schema validation & autocomplete:** 6. In the YAML editor, type `fields:` and trigger autocomplete on a field entry — verify the suggestion labels show descriptive names (e.g., "Text input", "Select") instead of "object". 7. Add a field with `control: INPUT_NUMBER` and `type: integer` — verify no validation error appears. 8. Change `type: integer` to `type: keyword` — verify the error message references `type` (not `control`), listing valid numeric types. **Server-side validation:** 9. Attempt to save a template with an invalid field definition (e.g., `type: keyword` with `control: INPUT_NUMBER`) — verify the save fails with a toast error (400 Bad Request) and the invalid template is not persisted. --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

) Closes elastic#260667

## Summary

Aligns Metrics in Discover `METRICS_INFO` failures with main Discover search errors by replacing the custom `MetricsInfoError` component with Discover’s shared `ErrorCallout` (via a `ChartSectionSearchError` wrapper). ES|QL error handling is centralized under `src/common/errors/` so Metrics and Traces can reuse the same path, including HTTP 200 responses with an embedded Elasticsearch error body. Discover injects `showErrorDialog` and `esqlReferenceHref` from the metrics profile wrapper, the same pattern as `discover_layout.tsx` after [elastic#261332](elastic#261332).

### Changes

- **Error handling**
  - Moved `esql_response_error` to `src/common/errors/` and improved `formatErrorCause` (all `root_cause` entries, `caused_by` fallback)
  - Added `normalizeChartSectionSearchError`
  - Updated `execute_esql_query` and `report_chart_section_error` imports to the shared module
- **UI**
  - Added `ChartSectionSearchError` wrapping `@kbn/discover-utils` `ErrorCallout`
  - Metrics Experience Grid now renders `ChartSectionSearchError` on `| METRICS_INFO` failure
  - Removed `metrics_info_error.tsx`
- **Discover host wiring**
  - `chart_section.tsx` passes `chartSectionSearchError` with `core.notifications.showErrorDialog` and `docLinks.links.query.queryESQL` (same behaviour as main Discover `ErrorCallout`)
  - Added `ChartSectionSearchErrorHostProps` to `UnifiedMetricsGridProps`
- **i18n**
  - Removed unused `metricsExperience.metricsInfoError.*` keys
  - Added `metricsExperience.chartSectionError.title`

### Expected Results

Discover's error component now appears when a `METRICS_INFO` call fails (the error description is customized for the demonstration):

<img width="1584" height="709" alt="image" src="https://github.com/user-attachments/assets/81b7f380-8b07-4d36-b732-de7742c661f7" />

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>

) ## Summary Implements part of elastic/rna-program#430 This PR introduces `isEsqlUserError`: a small predicate that returns `true` for `ResponseError` with status 400 or 404, and applies it in two places: - **Main rule query** (`QueryService.executeQueryStream`): on a user error, wraps the thrown error with `createTaskRunError(..., TaskErrorSource.USER)` before rethrowing. Non-user errors (5xx, network, cancellation, Arrow parse errors) are rethrown unchanged. - **Recovery query** (`CreateRecoveryEventsStep`): same wrapping applied to the `recovery_policy.type === 'query'` execution path. --------- Co-authored-by: Christos Nasikas <xristosnasikas@gmail.com>
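
A rough TypeScript sketch of the predicate and the wrapping step described above. It is an assumption-laden illustration: the structural check on `name` and `statusCode` stands in for an `instanceof ResponseError` test against the Elasticsearch client, and `wrapIfUserError` with its `TaggedError` shape is a hypothetical substitute for `createTaskRunError(..., TaskErrorSource.USER)`.

```typescript
// Hypothetical tagged-error shape standing in for a task run error.
interface TaggedError {
  source: 'user';
  cause: unknown;
}

// Returns true for ResponseError-like values with status 400 or 404.
// Assumption: the error is recognized structurally by `name` and `statusCode`.
function isEsqlUserError(error: unknown): boolean {
  if (typeof error !== 'object' || error === null) {
    return false;
  }
  const candidate = error as { name?: unknown; statusCode?: unknown };
  return (
    candidate.name === 'ResponseError' &&
    (candidate.statusCode === 400 || candidate.statusCode === 404)
  );
}

// User errors are wrapped with a USER source; everything else
// (5xx, network, cancellation, parse errors) is passed through unchanged.
function wrapIfUserError(error: unknown): unknown {
  if (isEsqlUserError(error)) {
    const tagged: TaggedError = { source: 'user', cause: error };
    return tagged;
  }
  return error;
}

const badRequest = { name: 'ResponseError', statusCode: 400, message: 'parsing_exception' };
const serverError = { name: 'ResponseError', statusCode: 503, message: 'unavailable' };

console.log(isEsqlUserError(badRequest), isEsqlUserError(serverError));
```

A caller would apply `wrapIfUserError` inside a `catch` block and rethrow the result, so that only 400 and 404 responses are attributed to the user.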

## Summary

I noticed that we often approve the auto-generated PR but sometimes forget to enable merge/auto-merge, so the PR sits there and never merges. When that happens, the following weeks' PRs don't get generated, and the whole weekly chain stalls. Enabling auto-merge at PR creation time prevents a missed manual step from blocking the chain.

This change updates `.buildkite/scripts/steps/console_definitions_sync.sh` so that after creating the Console definitions sync PR it:

* Automatically enables auto-merge (squash) for that PR via `gh pr merge --auto --squash`.
* Logs a warning if enabling auto-merge fails, without failing the step.

…70607) Adds an inline tool, `get_endpoint_artifacts`, for the Agent Builder Automatic Troubleshooting skill. This tool allows the agent to retrieve endpoint-specific exception list items such as endpoint exceptions, trusted apps, blocklists, etc. The tool has a summary mode and a detail mode to help prevent context explosion from artifacts. To support user-scoped artifact fetching, a new `getScopedEndpointArtifactClient` was also added to the endpoint app context service, as the existing `getExceptionListsClient` is not user-scoped. Also includes some minor tweaks to the skill instructions to better handle endpoint artifacts.

## Summary - Updates the `search` rollback fixtures for model version 13 (`10.13.0.json`) to use `{ "$match": "uuid" }` for Discover session tab IDs instead of hardcoded UUIDs. - MV13 tab IDs are generated via `uuidv5(savedObjectId, …)` during migration, but rollback tests bulk-create documents without fixed IDs, so the tab ID changes on every CI run and caused false fixture mismatches on unrelated PRs. - Adds a **Saved object fixtures** section to `.github/CODEOWNERS`, assigning each `__fixtures__/<type>/` folder to the team that owns the corresponding registered SO type (derived from the registering plugin's `kibana.jsonc` owner or more specific CODEOWNERS paths). ## Test plan - [x] `node scripts/check_changes.ts` - [ ] CI: **Check changes in Saved Objects** rollback tests for `search` pass when MV13 is in scope --------- Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Davis McPhee <davismcphee@hotmail.com>

…format (elastic#270927) ## Summary Updates `docs/extend/saved-objects/validate.md` to reflect the structured error reporting introduced in elastic#268469 and the additional validation rules added in subsequent PRs. ### What changed **Format**: The CI check now posts a structured PR comment (`**[rule-id]** Message. _Fix:_ …`) instead of raw `❌` terminal output. The troubleshooting intro is updated to explain the new format and show how to reproduce findings locally. **New rules documented** (were missing from the original section): | Rule ID | Introduced in | |---|---| | `existing-type/schema-breaking-changes` | elastic#268630 | | `existing-type/schema-undiffable-legacy-hash` | elastic#268630 | | `existing-type/new-mappings-not-in-model-version` | elastic#268630 | | `existing-type/keyword-missing-ignore-above` | elastic#268630 | | `existing-type/invalid-name-title-field-type` | elastic#268630 | | `new-type/missing-initial-model-version` | elastic#268469 | | `new-type/legacy-migrations` | elastic#268469 | | `new-type/keyword-missing-ignore-above` | elastic#270541 | | `new-type/invalid-name-title-field-type` | elastic#270541 | | `model-version/mappings-not-in-schema` | elastic#268630 | | `model-version/mapping-index-false` | elastic#268630 | | `model-version/mapping-enabled-false` | elastic#268630 | | `model-version/fixture-missing` | elastic#270541 | | `model-version/fixture-invalid` | elastic#270541 | | `documents/fixture-mismatch` | elastic#270541 | **Structure**: Rules are now grouped into categories (existing-type / new-type / model-version / documents / removed-type) with stable anchor IDs on every rule heading, so `([docs](link))` references in PR comments land directly on the right entry. ## Test plan - [ ] Visual review of rendered markdown in the Elastic docs preview Made with [Cursor](https://cursor.com) --------- Co-authored-by: Cursor <cursoragent@cursor.com>

This package is used by Search Playground.

Upgrades `ai` from 5.0.102 → 5.0.190 and `@ai-sdk/langchain` from 1.0.102 → 1.0.190, which transitively bumps `@ai-sdk/provider-utils` from 3.0.17 to 3.0.25. Adds a yarn resolution to force the updated version for all transitive consumers (including @arizeai/phoenix-client, which pins ai@^5.0.38).

## Testing

- Typecheck passes
- Playground unit tests pass
- Confirmed no older version is present: `find node_modules -path "*/provider-utils/package.json" -exec grep '"version"' {} \; -print`
- Manually navigated search-playground in the affected stack version and verified no breaking functionality changes:
  - Connected an index
  - Asked a question and confirmed the API call returns a stream-event
  - Asked a follow-up to verify conversation history is maintained

## Backport

- 9.3, 9.2 and 8.19 have the same format, so the automated backport should work fine
- 9.1 has a different patch version (5.0.108); a manual backport will be created if necessary

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>

@ymao1

…es.yml (elastic#271321) ## Summary Mirrors the index privilege changes from [elasticsearch-controller#1777](elastic/elasticsearch-controller#1777) (merged 2026-05-22 by @ymao1) into the Kibana serverless roles file. Two changes: - **Viewer role**: adds `read` on `.entity_analytics.entity-leads*` and `.entity_analytics.watchlists.*` (watchlists + entity leads visibility for read-only users) - **Asset-criticality write roles**: adds `view_index_metadata` on `.entities.v2.latest.security_*` for all roles that already have `write` on `.asset-criticality.asset-criticality-*`. Affected: `editor`, `platform_engineer`, `t2_analyst`, `t3_analyst`, `threat_intelligence_analyst`, `rule_author`, `endpoint_operations_analyst`, `endpoint_policy_manager`. Context: @simitt flagged the requirement to mirror controller changes into this file during controller PR review. The mismatch is not enforced at runtime but the file header explicitly states it should stay in sync. Made with [Cursor](https://cursor.com) Co-authored-by: Cursor <cursoragent@cursor.com>

github-actions · 2026-05-27T16:55:34Z

@smith, this PR increases one or more page-load bundle sizes by 15% or more:

| Plugin | Before (bytes) | After (bytes) | Change |
|---|---|---|---|
| `agentBuilderPlatform` | 8,737 | 15,544 | +77.9% |

Large bundle size increases can affect page load performance. Consider whether dependencies can be lazy-loaded or code split to reduce the bundle.

See the bundle optimization guide for tips.


github-actions · 2026-05-29T16:34:15Z

🤖 GitHub comments

Expand to view the GitHub comments

Just comment with:

/oblt-deploy : Deploy a Kibana instance using the Observability test environments.
run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

smith requested review from a team as code owners May 26, 2026 14:13

smith marked this pull request as draft May 26, 2026 14:13

kibanamachine added 2 commits May 26, 2026 14:24

Changes from node scripts/lint_ts_projects --fix

93930a2

Changes from node scripts/regenerate_moon_projects.js --update

efa846f

smith and others added 22 commits May 27, 2026 11:54

[flaky test runner] limit to 50 runs per config (elastic#270984)

f9ace68

## Summary Limits the flaky test runner to 50 runs per config. We should be mindful of the CI costs of investigating flaky tests, and 50 runs is more than enough to confirm a fix.

[APM] Fix SLO overview flyout details crash and alerts tab navigation (…

f9a6a5c

…elastic#271237) Co-authored-by: Cursor <cursoragent@cursor.com>

[Entity Store] Don't check privileges on negated indices (elastic#271273

23dc981

) ## Summary Negated indices (with a leading `-`) are not part of the query, so we don't need to check privileges on them. Including them also breaks the Elasticsearch `_has_privileges` check.
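
As a minimal sketch of the idea, assuming index patterns arrive as a plain string array, the negated entries can be dropped before building the privileges request. The function name `indicesToCheck` is hypothetical and is not taken from the Entity Store code.

```typescript
// Hypothetical helper: keep only the index patterns that should be sent to a
// privileges check, skipping negated patterns (those with a leading `-`).
function indicesToCheck(patterns: string[]): string[] {
  return patterns
    .map((pattern) => pattern.trim())
    .filter((pattern) => pattern.length > 0 && pattern.charAt(0) !== '-');
}

const requested = ['logs-*', '-logs-excluded-*', 'metrics-*'];
console.log(indicesToCheck(requested));
```

The negated patterns still belong in the query itself; this filter only affects which names are checked for privileges.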

Wl serverless ff (elastic#271255)

347318a

Update serverless to auto enable WL feature flag --------- Co-authored-by: Ying Mao <ying.mao@elastic.co>

Dosant and others added 23 commits May 27, 2026 11:54

Merge remote-tracking branch 'upstream/main' into nightshift-panels-f…

87da69d

…lyouts

smith added the ci:project-deploy-observability Create an Observability project label May 29, 2026

smith added Feature:SigEvents Significant events feature, related to streams and rules/alerts (RnA) backport:skip This PR does not require backporting release_note:skip Skip the PR/issue when compiling release notes labels May 29, 2026

smith wants to merge 116 commits into elastic:main from smith:nightshift-panels-flyouts

20 participants

kibanamachine commented May 26, 2026 (edited)

💔 Build Failed

Failed CI Steps

Metrics [docs]

Module Count

Async chunks

Page load bundle

History
