You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Closeselastic/obs-ai-team#606
## Summary
Adds a dedicated on-demand Buildkite pipeline
(`bk-kibana-evals-on-demand`) so any Elastic org member can run a single
eval suite + models on any branch without opening a PR or waiting for the
full Kibana PR pipeline. Reuses the existing eval runner
(`run_suite.sh`) and golden-cluster result export.
### Problem
Today, eval CI is Buildkite-only and PR-triggered via labels (`evals:*`
+ `models:*`). Engineers must wait for the full Kibana PR pipeline
before eval results are available, and there is no lightweight way to
run one suite on an arbitrary branch without that overhead.
#### Trigger flow
`kibana-evals-on-demand` → **New build** → pick branch/commit → set env
vars (`EVAL_SUITE_ID`, `EVAL_MODEL_GROUPS`, etc.)
#### Results
Golden cluster evals UI, filter by branch or run id
`bk-<buildkite_build_id>`.
#### Access
Everyone has `BUILD_AND_READ` (any org member can start builds);
`kibana-operations` and `obs-ai-team` retain `MANAGE_BUILD_AND_READ`.
---------
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
`evals init` walks you through EIS (Cloud Connected Mode) connector discovery or validates existing connectors in `kibana.dev.yml`. It outputs an `export KIBANA_TESTING_AI_CONNECTORS="..."` command to paste into your shell.
123
123
124
124
`evals start` orchestrates the full stack in one terminal:
125
+
125
126
1. Starts the EDOT collector (Docker) for trace capture -- exports traces to the configured tracing Elasticsearch cluster (via `TRACING_ES_URL`)
126
127
2. Starts Scout (ES + Kibana with `evals_tracing` config)
127
128
3. Enables EIS CCM on the Scout ES cluster (if using EIS connectors)
-`--datasets-profile <name>` loads `EVALUATIONS_KBN_URL` / `EVALUATIONS_KBN_API_KEY` from `config.<name>.json`
165
167
-`--export-profile <name>` loads `EVALUATIONS_ES_URL`, `TRACING_ES_URL`, and `TRACING_EXPORTERS` from `config.<name>.json`
166
168
@@ -210,6 +212,35 @@ The CLI uses suite metadata from:
210
212
.buildkite/pipelines/evals/evals.suites.json
211
213
```
212
214
215
+
### On-demand evals (Buildkite)
216
+
217
+
Run a single suite and model on any branch without opening a PR or waiting for the full Kibana PR pipeline:
218
+
219
+
1. Open [kibana-evals-on-demand](https://buildkite.com/elastic/kibana-evals-on-demand) on Buildkite
220
+
2. Click **New build**, select the branch (or commit) to evaluate
221
+
3. Under **Environment variables** (in New build options), add the required variables below — one `KEY=value` per line. These are build-level env vars read by `run_suite.sh`.
0 commit comments