
[OPIK-4987] [FE][BE] feat: integrate Ollie Console sidebar into Opik frontend#5680

Draft
Nimrod007 wants to merge 75 commits into main from nimrodlahav/add-ollie-sidebar-to-opik

Conversation


@Nimrod007 Nimrod007 commented Mar 16, 2026

Summary

  • Add Ollie Console chat sidebar to Opik frontend behind OLLIE_CONSOLE_ENABLED feature toggle
  • Uses the comet plugin system — zero ollie code in the open-source codebase
  • Plugin loads @comet-ml/ollie-sidebar library from private comet-ml/ollie-console repo
  • Toggle defaults to false — zero impact on open-source users

What changed

Backend:

  • ollieConsoleEnabled field in ServiceTogglesConfig.java
  • Config entry in config.yml via TOGGLE_OLLIE_CONSOLE_ENABLED env var

Frontend (open-source):

  • OLLIE_CONSOLE_ENABLED feature toggle (enum + default state)
  • OllieSidebar slot added to PluginsStore (null for OSS)
  • PageLayout renders plugin if provided, with lazy-load fallback for dev mode
  • Content area adjusts width via --ollie-sidebar-width CSS variable
  • Vite resolve.alias for React to prevent duplicate instances
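The nullable plugin-slot pattern above can be sketched in plain TypeScript. This is an illustration only — names like resolveSidebar and devFallback are hypothetical, not the actual Opik API:

```typescript
// Stand-in for a React component type, to keep the sketch dependency-free.
type SidebarComponent = () => string;

// The plugin store exposes a nullable slot; OSS builds leave it null.
interface PluginsStore {
  OllieSidebar: SidebarComponent | null;
}

// Hypothetical dev-mode fallback that would normally be lazy-loaded.
const devFallback: SidebarComponent = () => "dev-mode fallback sidebar";

// PageLayout's choice: render the registered plugin if provided, else the fallback.
function resolveSidebar(store: PluginsStore): SidebarComponent {
  return store.OllieSidebar ?? devFallback;
}

const ossStore: PluginsStore = { OllieSidebar: null };
const cometStore: PluginsStore = {
  OllieSidebar: () => "comet plugin sidebar",
};

console.log(resolveSidebar(ossStore)());   // OSS build falls back
console.log(resolveSidebar(cometStore)()); // comet build uses the plugin
```

The key property is that the open-source tree only ever sees the null slot and the fallback; the private plugin fills the slot at registration time.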

Frontend (comet plugin — private):

  • plugins/comet/OllieSidebar.tsx — the actual implementation
    • Lazy-loads ChatSidebar from @comet-ml/ollie-sidebar
    • CSS loaded at runtime via ?raw to bypass Tailwind 3 PostCSS
    • Open/close state persisted in localStorage
    • Feature-toggled via OLLIE_CONSOLE_ENABLED

Architecture

Open-source Opik                      Private (comet plugin)
┌─────────────────────────┐    ┌──────────────────────────────────┐
│ PluginsStore:           │    │ plugins/comet/OllieSidebar.tsx   │
│   OllieSidebar = null   │◄───│   lazy-loads @comet-ml/ollie-   │
│                         │    │   sidebar from npm/CDN           │
│ PageLayout:             │    │   passes bridge API (auth,       │
│   plugin ?? fallback    │    │   workspace, theme)              │
└─────────────────────────┘    └──────────────────────────────────┘
                                           │
                                           ▼
                              comet-ml/ollie-console repo
                              (separate release cycle)
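The bridge API named in the diagram (auth, workspace, theme) might look roughly like the following. The field names and shapes are assumptions for illustration — the real interface lives in the private plugin:

```typescript
// Hypothetical shape of the bridge the comet plugin passes into the sidebar.
interface OllieBridge {
  auth: { getToken: () => string };
  workspace: { name: string };
  theme: "light" | "dark";
}

// A minimal consumer, standing in for what the sidebar library might read on mount.
function describeBridge(bridge: OllieBridge): string {
  return `workspace=${bridge.workspace.name} theme=${bridge.theme}`;
}

const bridge: OllieBridge = {
  auth: { getToken: () => "fake-token" },
  workspace: { name: "demo" },
  theme: "dark",
};

console.log(describeBridge(bridge)); // workspace=demo theme=dark
```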

Local development setup

Prerequisites: Clone both repos as siblings in the same parent directory:

your-code-dir/
├── opik/                    # this repo
└── ollie-console/           # git clone git@github.com:comet-ml/ollie-console.git

Step 1: Build the ollie-sidebar library

cd ollie-console
npm install
cd packages/ollie-sidebar
npm run build    # produces dist/index.mjs, dist/index.js, dist/styles.css

Step 2: Install Opik frontend dependencies

cd opik/apps/opik-frontend
npm install      # resolves @comet-ml/ollie-sidebar via file: reference to ../../../ollie-console/packages/ollie-sidebar

Step 3: Enable the feature toggle

In apps/opik-backend/config.yml, temporarily change the default:

ollieConsoleEnabled: ${TOGGLE_OLLIE_CONSOLE_ENABLED:-"true"}

Step 4: Start everything

cd opik
./scripts/dev-runner.sh --restart

Open http://localhost:5174 — the Ollie sidebar appears on the right.

After changing ollie-sidebar code:

cd ollie-console/packages/ollie-sidebar
npm run build    # rebuild library
# Vite HMR picks up the changes automatically

Test plan

  • Frontend type-checks cleanly (tsc --noEmit)
  • Frontend builds cleanly (vite build)
  • Toggle off: No sidebar rendered, no console errors
  • Toggle on: Sidebar renders at 380px, content area shrinks
  • No duplicate React errors (verified with Playwright)
  • Plugin system: OllieSidebar loaded via plugin in comet mode
  • Dev mode fallback: OllieSidebar lazy-loaded directly when plugin not active
  • Backend builds cleanly

🤖 Generated with Claude Code

…frontend

Add Ollie Console chat sidebar behind OLLIE_CONSOLE_ENABLED feature
toggle. The sidebar renders as a 380px right panel using the
@comet-ml/ollie-sidebar library from the private ollie-console repo.

Backend:
- Add ollieConsoleEnabled to ServiceTogglesConfig (default: false)

Frontend:
- Add OLLIE_CONSOLE_ENABLED feature toggle
- Create OllieSidebar wrapper with lazy loading (React.lazy)
- Integrate into PageLayout with dynamic width via CSS variables
- Load ollie CSS at runtime (?raw) to bypass Tailwind 3 PostCSS
- Alias React in Vite config to prevent duplicate React instances
- Install @comet-ml/ollie-sidebar via file: reference (local dev)
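The React alias mentioned above might look like this in vite.config.ts. This is a sketch under the assumption of a standard Vite setup — exact paths may differ, and Vite's resolve.dedupe is an alternative way to achieve the same single-instance guarantee:

```typescript
// vite.config.ts (sketch): force a single React instance so the
// file:-linked @comet-ml/ollie-sidebar resolves to the app's own copy
// instead of bundling a duplicate React from its node_modules.
import { defineConfig } from "vite";
import path from "node:path";

export default defineConfig({
  resolve: {
    alias: {
      react: path.resolve("./node_modules/react"),
      "react-dom": path.resolve("./node_modules/react-dom"),
    },
  },
});
```

Duplicate React instances are the classic failure mode for file:-linked component libraries (hooks throw "Invalid hook call"), which is why the test plan verifies their absence with Playwright.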

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions bot added dependencies Pull requests that update a dependency file java Pull requests that update Java code Frontend Backend typescript *.ts *.tsx labels Mar 16, 2026

📋 PR Linter Failed

Missing Sections. The description is missing the following sections:

  • ## Details
  • ## Change checklist
  • ## Issues
  • ## Testing
  • ## Documentation


github-actions bot commented Mar 16, 2026

Backend Tests - Unit Tests

1 496 tests   1 494 ✅  54s ⏱️
  180 suites      2 💤
  180 files        0 ❌

Results for commit 5d49549.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 16, 2026

Backend Tests - Integration Group 7

1 137 tests   1 136 ✅  6m 2s ⏱️
    9 suites      1 💤
    9 files        0 ❌

Results for commit 5d49549.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 16, 2026

Backend Tests - Integration Group 15

172 tests   170 ✅  3m 44s ⏱️
 27 suites    2 💤
 27 files      0 ❌

Results for commit 5d49549.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 16, 2026

Backend Tests - Integration Group 5

251 tests   251 ✅  2m 17s ⏱️
 26 suites    0 💤
 26 files      0 ❌

Results for commit 5d49549.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 16, 2026

Backend Tests - Integration Group 11

 23 files   23 suites   3m 47s ⏱️
131 tests 131 ✅ 0 💤 0 ❌
113 runs  113 ✅ 0 💤 0 ❌

Results for commit 5d49549.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 16, 2026

Backend Tests - Integration Group 12

184 tests   182 ✅  6m 0s ⏱️
 36 suites    2 💤
 36 files      0 ❌

Results for commit 5d49549.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 16, 2026

Backend Tests - Integration Group 8

280 tests   280 ✅  4m 46s ⏱️
 22 suites    0 💤
 22 files      0 ❌

Results for commit 5d49549.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 16, 2026

Backend Tests - Integration Group 13

445 tests   442 ✅  3m 45s ⏱️
 21 suites    3 💤
 21 files      0 ❌

Results for commit 5d49549.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 16, 2026

Backend Tests - Integration Group 16

 16 files   16 suites   1m 0s ⏱️
187 tests 187 ✅ 0 💤 0 ❌
165 runs  165 ✅ 0 💤 0 ❌

Results for commit 5d49549.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 16, 2026

Backend Tests - Integration Group 6

105 tests   105 ✅  2m 47s ⏱️
 23 suites    0 💤
 23 files      0 ❌

Results for commit 5d49549.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 16, 2026

Backend Tests - Integration Group 9

322 tests   321 ✅  8m 48s ⏱️
 24 suites    1 💤
 24 files      0 ❌

Results for commit 5d49549.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 16, 2026

Backend Tests - Integration Group 4

1 362 tests   1 362 ✅  8m 49s ⏱️
    5 suites      0 💤
    5 files        0 ❌

Results for commit 5d49549.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 16, 2026

Backend Tests - Integration Group 10

253 tests   251 ✅  7m 7s ⏱️
 21 suites    2 💤
 21 files      0 ❌

Results for commit 5d49549.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 16, 2026

Backend Tests - Integration Group 3

313 tests   313 ✅  9m 49s ⏱️
 29 suites    0 💤
 29 files      0 ❌

Results for commit 5d49549.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 16, 2026

Backend Tests - Integration Group 1

413 tests   413 ✅  13m 13s ⏱️
 24 suites    0 💤
 24 files      0 ❌

Results for commit 5d49549.

♻️ This comment has been updated with latest results.

Comment on lines +775 to +778
agentConfigurationEnabled: ${TOGGLE_AGENT_CONFIGURATION_ENABLED:-"false"}
# Default: false
# Description: Whether or not Ollie Console sidebar is enabled
ollieConsoleEnabled: ${TOGGLE_OLLIE_CONSOLE_ENABLED:-"false"}

ollieConsoleEnabled was added as a feature flag, but its comment doesn't follow the required template and omits explicit metadata: the default (false), the format (boolean), the component it gates (the Ollie Console sidebar in the Opik UI), its global scope, the operational/safety rationale, and a suggested safe range. This omission makes the flag harder to discover, not clearly production-safe, and risks it being enabled before FE and backend are ready. Can we update the comment to explicitly state: default false; boolean format; that it gates the Ollie Console sidebar in the Opik UI; its global scope; the safety rationale (disabling preserves the current sidebar, while enabling requires an FE rollout and a backend feature flag); and a recommendation to default to false until both FE and backend ship?

Finding type: Config defaults and compatibility | Severity: 🟢 Low



Prompt for AI Agents:

In apps/opik-backend/config.yml around lines 775 to 778, the configuration key
`ollieConsoleEnabled` currently has an incomplete comment. Replace the current two-line
comment with a standard template comment that explicitly states: Default: false;
Units/format: boolean toggle; Component/behavior gated: Ollie Console sidebar in the
Opik UI; Scope: global default (backend flag); Operational impact/safety rationale:
disabling preserves the existing sidebar experience, enabling requires coordinated FE
rollout and backend flag activation; Suggested safe range: false until both frontend and
backend changes are deployed. Keep the same comment style/indentation as the neighboring
`agentConfigurationEnabled` entry.


Commit cc50b4f addressed this comment by removing the ollieConsoleEnabled toggle and its incomplete comment from apps/opik-backend/config.yml, eliminating the need for the requested metadata.

Comment on lines +13 to +24
const loadOllieCss = () => {
const id = "ollie-sidebar-styles";
if (document.getElementById(id)) return;

import("@comet-ml/ollie-sidebar/styles.css?raw").then((css) => {
const style = document.createElement("style");
style.id = id;
style.textContent = css.default;
document.head.appendChild(style);
});
};


The component uses magic literals for config: useLocalStorageState("ollie-sidebar-open"…), a style element id = "ollie-sidebar-styles", and the collapsed width 32. Hardcoding these values risks inconsistencies and makes changes error‑prone. Can we extract them into named constants (e.g. OLLIE_SIDEBAR_STORAGE_KEY, OLLIE_SIDEBAR_STYLE_ID, OLLIE_SIDEBAR_COLLAPSED_WIDTH) and reference those instead?

Finding type: Avoid hardcoded configuration values | Severity: 🟢 Low



Prompt for AI Agents:

In apps/opik-frontend/src/components/layout/OllieSidebar/OllieSidebar.tsx around lines
13 to 45, the code currently uses hardcoded literals: the localStorage key
"ollie-sidebar-open", the style element id "ollie-sidebar-styles", and the collapsed
width literal 32. Refactor by adding named constants near the top of the file (for
example OLLIE_SIDEBAR_STORAGE_KEY, OLLIE_SIDEBAR_STYLE_ID, and
OLLIE_SIDEBAR_COLLAPSED_WIDTH), replace the raw string/number occurrences in
loadOllieCss, the useLocalStorageState call, and the onWidthChange call with these
constants, and ensure the constants are exported or documented if they will be reused
elsewhere.


Commit cc50b4f addressed this comment by removing the OllieSidebar plugin file entirely, so the hardcoded storage key, style ID, and collapsed width literals no longer exist in that component, resolving the concern about extracting those values into constants.

Comment on lines +35 to +46
const [isOpen, setIsOpen] = useLocalStorageState("ollie-sidebar-open", {
defaultValue: true,
});

useEffect(() => {
if (!isEnabled) {
onWidthChange(0);
return;
}
onWidthChange(isOpen ? OLLIE_SIDEBAR_WIDTH : 32);
}, [isEnabled, isOpen, onWidthChange]);


isOpen is only set to false in handleClose and never reset to true, so after closing the sidebar onWidthChange keeps emitting 32px and the layout stays collapsed. Can we switch to a controlled open prop or add an onOpen callback that calls setIsOpen(true) so the width toggles back to 380px when the sidebar reopens?

Finding types: prefer direct React patterns Logical Bugs | Severity: 🔴 High
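One way to reason about the reported bug is to isolate the width logic as a pure function driven by a single controlled open state (a sketch using the constant values from this PR; the real component wires this through useLocalStorageState and onWidthChange):

```typescript
// Constants mirroring the PR's values (380px open, 32px collapsed rail).
const OLLIE_SIDEBAR_WIDTH = 380;
const OLLIE_SIDEBAR_COLLAPSED_WIDTH = 32;

// Pure width resolution: disabled -> 0, open -> full width, closed -> rail.
function resolveSidebarWidth(isEnabled: boolean, isOpen: boolean): number {
  if (!isEnabled) return 0;
  return isOpen ? OLLIE_SIDEBAR_WIDTH : OLLIE_SIDEBAR_COLLAPSED_WIDTH;
}

// With a controlled `open` prop, reopening flows through the same state
// variable, so the emitted width returns to 380 instead of sticking at 32.
console.log(resolveSidebarWidth(true, false)); // 32
console.log(resolveSidebarWidth(true, true));  // 380
```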



Prompt for AI Agents:

In apps/opik-frontend/src/components/layout/OllieSidebar/OllieSidebar.tsx around lines
35-61, the local state `isOpen` (from useLocalStorageState("ollie-sidebar-open")) is
only set to false via handleClose and never set back to true when ChatSidebar reopens,
so onWidthChange keeps emitting 32 and the layout never expands. Refactor so the
ChatSidebar is controlled: pass an explicit `open={isOpen}` prop (instead of
defaultOpen) and add an onOpen/onToggle handler that calls `setIsOpen(true)` when the
sidebar opens. Also ensure the useEffect that calls onWidthChange reads the controlled
open state so width is set to OLLIE_SIDEBAR_WIDTH when the user opens the sidebar and
back to 32 when closed. Preserve the localStorage initial state (use the current
defaultValue) and remove reliance on defaultOpen for future toggles.


Commit cc50b4f addressed this comment by deleting apps/opik-frontend/src/plugins/comet/OllieSidebar.tsx, which removed the affected component and eliminated the previous ChatSidebar state management issue.

Comment on lines +57 to +61
return (
<div className="absolute right-0 top-[var(--banner-height)] bottom-0 z-10">
<Suspense>
<ChatSidebar onClose={handleClose} defaultOpen={isOpen} />
</Suspense>

The UI renders empty while ChatSidebar lazy-loads because the surrounding <Suspense> has no fallback. Per .agents/skills/opik-frontend/performance.md this omits the required placeholder; can we add an explicit fallback to <Suspense> (even null or a small skeleton)?

Finding type: AI Coding Guidelines | Severity: 🟢 Low



Prompt for AI Agents:

In apps/opik-frontend/src/components/layout/OllieSidebar/OllieSidebar.tsx around lines
57 to 61, the Suspense wrapper for ChatSidebar is rendered without a fallback, causing
an empty render while the lazy chunk loads. Update the return so Suspense includes an
explicit fallback prop (for example a small skeleton placeholder div or a concise
loading indicator, or at minimum fallback={null}) and ensure the placeholder matches the
sidebar dimensions/positioning so the layout doesn't shift. Keep the change local to the
JSX return (no major refactor) and prefer a small accessible placeholder component if
available.


github-actions bot commented Mar 16, 2026

Backend Tests - Integration Group 14

267 tests   267 ✅  8m 48s ⏱️
 29 suites    0 💤
 29 files      0 ❌

Results for commit 5d49549.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 16, 2026

TS SDK E2E Tests - Node 18

238 tests   236 ✅  18m 15s ⏱️
 25 suites    2 💤
  1 files      0 ❌

Results for commit 5d49549.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 16, 2026

TS SDK E2E Tests - Node 20

238 tests   236 ✅  18m 0s ⏱️
 25 suites    2 💤
  1 files      0 ❌

Results for commit 5d49549.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 16, 2026

Python SDK E2E Tests Results (Python 3.10)

238 tests  ±0   236 ✅ ±0   9m 45s ⏱️ +47s
  1 suites ±0     2 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit 5d49549. ± Comparison against base commit def49d0.

This pull request removes 1 and adds 1 tests. Note that renamed tests count towards both.
tests.e2e.test_tracing ‑ test_opik_client__update_trace__happy_flow[None-None-None-None-019d06fb-aea2-73a8-8a19-20ba83a7ff2b]
tests.e2e.test_tracing ‑ test_opik_client__update_trace__happy_flow[None-None-None-None-019d0b5f-7f4b-739d-8313-606b854df7b4]

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 16, 2026

Python SDK E2E Tests Results (Python 3.13)

238 tests  ±0   236 ✅ ±0   10m 47s ⏱️ + 1m 32s
  1 suites ±0     2 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit 5d49549. ± Comparison against base commit def49d0.

This pull request removes 1 and adds 1 tests. Note that renamed tests count towards both.
tests.e2e.test_tracing ‑ test_opik_client__update_trace__happy_flow[None-None-None-None-019d06f9-82af-7745-a955-52411673549d]
tests.e2e.test_tracing ‑ test_opik_client__update_trace__happy_flow[None-None-None-None-019d0b5e-820a-7c57-99df-49da4a2c8e5e]

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 16, 2026

Python SDK E2E Tests Results (Python 3.11)

238 tests  ±0   236 ✅ ±0   9m 43s ⏱️ +20s
  1 suites ±0     2 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit 5d49549. ± Comparison against base commit def49d0.

This pull request removes 1 and adds 1 tests. Note that renamed tests count towards both.
tests.e2e.test_tracing ‑ test_opik_client__update_trace__happy_flow[None-None-None-None-019d06f8-1eb9-7fab-a908-4b947aa7f0e8]
tests.e2e.test_tracing ‑ test_opik_client__update_trace__happy_flow[None-None-None-None-019d0b5f-8ddb-7ab4-97b2-ca22b6252d85]

♻️ This comment has been updated with latest results.

thiagohora and others added 25 commits March 25, 2026 10:38
…lback lookups (#5760)

* [OPIK-4938] [BE] Add project-scoped endpoint tests for prompts, datasets, experiments, and dashboards

Add FindProjectPrompts, FindProjectDatasets, FindProjectExperiments, and
FindProjectDashboards nested test classes that mirror their workspace-scoped
counterparts, exercising the /v1/private/projects/{projectId}/{resource}
endpoints with filtering, pagination, and sorting coverage.

Also extract shared assertion helpers (assertPromptsPage, assertDashboardPage)
to outer test classes for reuse, and add getProjectDashboards/getProjectPrompts
client methods to the respective resource clients.

* Revision 2: Fix missing @RequiredPermissions on ProjectPromptsResource and rename duplicate "By" in test method names

* Revision 3: Remove @RequiredPermissions from ProjectPromptsResource that caused test compilation issue

* Revision 4: Apply spotless formatting

* [OPIK-4938] [BE] Add X-Opik-Deprecation header for workspace-wide fallback lookups

When a dataset, prompt, or dashboard is found via workspace-wide search
(i.e., the requested projectId did not match but a workspace-wide record
exists), the response now includes X-Opik-Deprecation with a formatted
message warning that explicit project scoping will be required in a
future version.

* Revision 2: Add tests for X-Opik-Deprecation workspace-fallback header

Tests verify:
- Header is returned with full formatted message when fallback to
  workspace-wide search is used (non-matching/non-existent project)
- Header is absent when the entity is found directly in the requested project

* Revision 3: Use WORKSPACE_FALLBACK_MESSAGE_TEMPLATE constant in tests

* Revision 4: Fix fallback message when project name is provided but does not exist

* Revision 5: Remove workspace fallback message from DashboardService (no expose endpoint)

* Revision 6: Extract setWorkspaceFallbackFor helper to centralize fallback message setting

* Revision 7: Fix OutOfScopeException in reactive stream by introducing resolveDatasetByName

Split the blocking findByName(DatasetIdentifier) from the reactive stream
path. Added resolveDatasetByName(DatasetIdentifier, Visibility) to the
DatasetService interface that captures workspaceId on the request thread
and returns a Mono<Dataset>, so the reactive chain in DatasetItemService
no longer accesses the @RequestScoped RequestContext from a non-request
thread.

---------

Co-authored-by: Andres Cruz <andresc@comet.com>
…_config() (#5751)

* [NA] [SDK] feat: require @track context when calling get_agent_config()

Raises RuntimeError if get_agent_config() is called outside a function
decorated with @opik.track, replacing the previous approach of emitting
a warning on attribute access when the mask didn't match.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix tests

* use opik_context

* use context manager in tests

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix breadcrumb to show experiment name instead of static "Compare"
- Rename "Legacy dataset" label to "Dataset" in evaluation suites
- Hide item ID column by default on compare experiments page
- Open global settings dialog on global assertions tag click
… correctness (#5726)

* [OPIK-5050] [BE] fix: replace FINAL with LIMIT 1 BY in trace thread queries and add exponential backoff

Replace FINAL with LIMIT 1 BY subqueries in TraceThreadDAO to avoid
full table scans on the closing job hot path. Push status filter into
subquery for FIND_PENDING_CLOSURE_THREADS_SQL to skip inactive rows
early. Fix correctness issue in FIND_THREADS_BY_PROJECT_SQL where
mutable column filters were applied before deduplication, potentially
returning stale results. Add exponential backoff to
TraceThreadsClosingJob to prevent hammering ClickHouse on consecutive
failures.

* [OPIK-5050] [BE] chore: address PR review feedback

- Simplify LIMIT 1 BY (workspace_id, project_id, thread_id, id) to
  LIMIT 1 BY id in FIND_PENDING_CLOSURE_THREADS_SQL and
  OPEN_CLOSURE_THREADS_SQL for consistency with rest of codebase
- Add SETTINGS log_comment to FIND_THREADS_BY_PROJECT_SQL,
  FIND_PENDING_CLOSURE_THREADS_SQL, and OPEN_CLOSURE_THREADS_SQL
  for query observability in ClickHouse
- Fix log placeholder formatting to use single-quoted '{}' per
  project conventions

* [OPIK-5050] [BE] chore: fix import ordering (spotless)

* [OPIK-5050] [BE] fix: escape angle brackets in SQL to prevent StringTemplate interpolation

The < and > SQL comparison operators in FIND_PENDING_CLOSURE_THREADS_SQL
were being interpreted as StringTemplate delimiters after switching from
raw string to getSTWithLogComment, silently corrupting the query.

* [OPIK-5050] [BE] fix: revert log_comment on queries with SQL angle brackets

FIND_PENDING_CLOSURE_THREADS_SQL and OPEN_CLOSURE_THREADS_SQL contain
SQL < and > operators which StringTemplate interprets as template
delimiters, silently corrupting the rendered query. Revert these two
queries to use raw strings like the original code. Keep log_comment
on FIND_THREADS_BY_PROJECT_SQL which only uses ST template expressions.

* [OPIK-5050] [BE] perf: use time-bounded FINAL with minmax skip index for closing job query

Replace LIMIT 1 BY approach with time-bounded FINAL + minmax skip index
on last_updated_at. The closing job query now only scans recent granules
instead of the entire trace_threads table, reducing granules read from
369/369 to 12/369 (97% reduction) in benchmarks.

- Add cached getMaxTimeoutMarkThreadAsInactive to compute lookback window
- Bind cached_max_inactive_period parameter in DAO
- Add use_skip_indexes_if_final=1 SETTINGS to enable skip index with FINAL
- Add cache config for max_timeout (30min TTL)

* [OPIK-5050] [BE] perf: add cold start lookback, GROUP BY optimization, increase default job interval

- Add 7-day cold start lookback on first run after startup to catch threads
  that became stale during outages
- Normal lookback floor: max(maxTimeout + 1h, 1 day) via minmax skip index
- GROUP BY workspace_id, project_id, status with min(last_updated_at) in
  subquery to reduce rows before workspace_configurations JOIN
- Increase default OPIK_CLOSE_TRACE_THREAD_JOB_INTERVAL from 3s to 15s
- Add minmax skip index migration for last_updated_at (GRANULARITY 1)

* [OPIK-5050] [BE] chore: add cache config docs, bump lock time to match interval

- Add documentation comment for max_timeout_mark_thread_as_inactive cache
- Increase closeTraceThreadJobLockTime from 4s to 14s to match the 15s
  job interval (prevents premature lock release with more accumulated work)

* [OPIK-5050] [BE] chore: update MAX_BACKOFF_EXPONENT comment for 15s interval

* [OPIK-5050] [BE] chore: improve backoff comment, use LEFT ANY JOIN for workspace config

- Clarify MAX_BACKOFF_EXPONENT comment to describe doubling pattern
- Use LEFT ANY JOIN for workspace_configurations (at most one row per
  workspace after FINAL dedup, communicates intent and is slightly more efficient)

* [OPIK-5050] [BE] fix: move success handler to onComplete, add migration comment

- Mono<Void> never emits onNext, so completedFirstRun/backoff reset was
  unreachable. Move success logic to onComplete (3rd subscribe arg).
- Add --comment to migration file per convention.

* [OPIK-5050] [BE] chore: move cold-start lookback and max backoff exponent to config

Move COLD_START_LOOKBACK and MAX_BACKOFF_EXPONENT from hardcoded
constants to TraceThreadConfig, backed by env vars
OPIK_CLOSE_TRACE_THREAD_COLD_START_LOOKBACK (default 7d) and
OPIK_CLOSE_TRACE_THREAD_MAX_BACKOFF_EXPONENT (default 5).

Also fix log message in onComplete handler ("started" -> "completed").

* [OPIK-5050] [BE] fix: rename migration 71 -> 73 to avoid conflict with main

* [OPIK-5050] [BE] chore: remove benchmark script, expand config docs
* [OPIK-4714] add new permission

* [OPIK-4714] add permission checks

* [OPIK-4714] add checks for dataset items

* [OPIK-4714] Add missing imports

* [OPIK-4714] refactor

* [OPIK-4714] fix after merge
Co-authored-by: Andres Cruz <andresc@comet.com>
)

Bumps [com.diffplug.spotless:spotless-maven-plugin](https://github.com/diffplug/spotless) from 3.3.0 to 3.4.0.
- [Release notes](https://github.com/diffplug/spotless/releases)
- [Changelog](https://github.com/diffplug/spotless/blob/main/CHANGES.md)
- [Commits](diffplug/spotless@lib/3.3.0...maven/3.4.0)

---
updated-dependencies:
- dependency-name: com.diffplug.spotless:spotless-maven-plugin
  dependency-version: 3.4.0
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Andres Cruz <andresc@comet.com>
#5780)

Bumps [org.jdbi:jdbi3-stringtemplate4](https://github.com/jdbi/jdbi) from 3.51.0 to 3.52.0.
- [Release notes](https://github.com/jdbi/jdbi/releases)
- [Changelog](https://github.com/jdbi/jdbi/blob/master/RELEASE_NOTES.md)
- [Commits](jdbi/jdbi@v3.51.0...v3.52.0)

---
updated-dependencies:
- dependency-name: org.jdbi:jdbi3-stringtemplate4
  dependency-version: 3.52.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Andres Cruz <andresc@comet.com>
…cs (#5784)

* add[docs]: section for running online evals retrospectively

* Optimised images with calibre/image-actions

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
…istence (#5787)

* [OPIK-4966] [FE] refactor: extract shared deps from v1 for v1/v2 coexistence

Move components, hooks, types, and constants out of v1/ into shared
locations so both v1 and v2 can use the same instances. This prevents
broken contexts and duplicated code when v2 is cloned from v1.

Extracted to shared:
- PageBodyStickyContainer → src/shared/
- PageBodyScrollContainer context → src/contexts/
- BaseTraceDataTypeIcon → src/shared/
- VerticallySplitCellWrapper → src/shared/
- UserComment folder → src/shared/
- GoogleColabCardCore → src/shared/
- ConfigurationType, GoogleColabCardCoreProps → src/types/shared
- ProviderGridOption → src/types/providers
- WorkspacePreference types/constants → src/constants/workspace-preferences
- theme-provider, server-sync-provider, feature-toggles-provider → src/contexts/
- integration-scripts, integration-logs → src/constants/

Moved to v1 (not shared, has v1 deps):
- TraceCountCell → v1/pages-shared/traces/
- PromptImprovementDialog → v1/pages-shared/llm/

Updated dependency-cruiser:
- Broadened no-shared-importing-pages to block all v1/v2 imports
- Added no-shared-infra-importing-versioned rule
- Removed stale v1 provider exceptions from hooks rule

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [OPIK-4966] [FE] chore: remove stale dependency-cruiser violation rules

Cleanup outdated dependency-cruiser exceptions for `no-hooks-importing-components` and `no-shared-importing-pages`, aligning with recent refactors.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…v2 (#5791)

Clone all v1 page components, layout, pages-shared, and router into the
v2 directory structure. Update all imports from @/v1/ to @/v2/ within
cloned files. Update dependency-cruiser known violations baseline to
include v2 copies of pre-existing circular deps.

This establishes the v2 code base that will be independently modified
for project-first navigation.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…optimization write endpoints (#5772)

* [OPIK-4938] [BE] Add project_name support to dataset, experiment and optimization endpoints

- Add project_name field to DatasetItemBatch and propagate it through DatasetItemService to scope dataset items to a project
- Resolve project_name to project_id in DatasetService and ExperimentService when creating datasets/experiments
- Add project_id column to optimizations table via Liquibase migration (000073)
- Expose project_id (read-only) and project_name (write-only) on Optimization model
- Fix NullPointerException in OptimizationService when project_name is null by using AbstractMap.SimpleEntry instead of Map.entry
- Add integration tests for project-scoped dataset creation in DatasetsResourceTest, ExperimentsResourceTest, and OptimizationsResourceTest

* Revision 2: Add ProjectOptimizationsResource for project-scoped optimization listing

* Revision 3: Add project_id filter to OptimizationsResource and integration tests for ProjectOptimizationsResource

* [OPIK-4938] [BE] Fix DatasetItemBatch project resolution and centralize test factory calls

Fix Reactor empty Mono bug in DatasetItemService.getDatasetId() where batches
without projectId/projectName caused the flatMap to never execute (data loss).
Added switchIfEmpty to handle the null-project case properly.
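
The empty-Mono hazard described above has a close stdlib analogy: mapping over an empty container silently skips the downstream step. This sketch is illustrative only (names like `resolveBuggy` are hypothetical, and `Optional.or` stands in for Reactor's `switchIfEmpty`), not the actual service code:

```java
import java.util.Optional;

public class EmptyPipelineSketch {
    // Buggy shape: when no projectName is present, the pipeline is empty
    // and the mapping step never runs -- the batch is silently dropped.
    static Optional<String> resolveBuggy(Optional<String> projectName) {
        return projectName.map(name -> "dataset-for-" + name);
    }

    // Fixed shape: an explicit fallback handles the no-project case,
    // mirroring switchIfEmpty in the Reactor pipeline.
    static Optional<String> resolveFixed(Optional<String> projectName) {
        return projectName
                .map(name -> "dataset-for-" + name)
                .or(() -> Optional.of("default-dataset"));
    }

    public static void main(String[] args) {
        System.out.println(resolveBuggy(Optional.empty())); // Optional.empty
        System.out.println(resolveFixed(Optional.empty())); // Optional[default-dataset]
    }
}
```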

Centralize factory.manufacturePojo(DatasetItemBatch.class) and
factory.manufacturePojo(DatasetItem.class) calls in DatasetResourceClient
to null out server-assigned fields (projectId, projectName, datasetId, etc.),
preventing PODAM-generated random UUIDs from causing 404 errors in tests.

* Revision 2: Address PR review comments

- Rename migration 000073 → 000074 to avoid prefix conflict with main
- Add trailing blank line to migration file per guidelines
- Remove @RequiredPermissions(EXPERIMENT_VIEW) from ProjectOptimizationsResource.find()
  to match unrestricted access pattern of the global endpoint
- Add dataset_name query param to ProjectOptimizationsResource.find()
  for parity with global /v1/private/optimizations endpoint
- Fix @Schema description on DatasetItemBatch.projectId (was "dataset_name must be
  provided", now "project_name must be provided")
- Use DatasetItemBatch builder instead of positional constructor in
  DatasetExportJobSubscriberResourceTest

* Revision 3: Fix insertInvalidDatasetItemWorkspace test failure

Use DatasetResourceClient helpers and null out datasetId to avoid PODAM
generating random UUIDs that cause 404s in DatasetItemService resolution.

* Revision 4: Simplify resolveProjectId in OptimizationService

Inline context accesses inside fromCallable lambda, consistent with
DatasetItemService.resolveProjectId pattern.

* Revision 5: Support projectId on Optimization write + validate on upsert

Remove READ_ONLY from Optimization.projectId so callers can pass it
directly. Add a resolveProjectId branch that validates the provided
projectId exists in the workspace before using it, mirroring the
DatasetItemBatch projectId/projectName duality.

* Revision 6: Clarify projectName/projectId as optional in DatasetItemBatch schema

Both fields are optional (both null = no project scoping). Update @Schema
descriptions to remove misleading "must be provided" language and describe
precedence rules instead.

* Revision 7: Extract resolveProjectIdOrCreate into ProjectService

Both OptimizationService and ExperimentService had identical inline logic
for resolving a project from (projectId, projectName): validate the id if
provided, getOrCreate from the name otherwise, return empty if neither.

The shared helper lives in ProjectService.resolveProjectIdOrCreate and uses
deferContextual so callers no longer need to extract workspaceId/userName
themselves. ExperimentService's logic is also aligned to projectId-first
priority, consistent with OptimizationService and DatasetItemService.
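
The projectId-first precedence described above can be sketched as follows. All names and signatures here are hypothetical stand-ins for the real `ProjectService.resolveProjectIdOrCreate`, which operates reactively and pulls workspace context via `deferContextual`:

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.UUID;

public class ProjectResolutionSketch {
    static Optional<UUID> resolveProjectIdOrCreate(
            UUID projectId, String projectName,
            Set<UUID> workspaceProjects, Map<String, UUID> byName) {
        if (projectId != null) {
            // id wins when provided, but must exist in the workspace
            if (!workspaceProjects.contains(projectId)) {
                throw new IllegalArgumentException("project not in workspace");
            }
            return Optional.of(projectId);
        }
        if (projectName != null) {
            // get-or-create by name
            return Optional.of(byName.computeIfAbsent(projectName, n -> {
                UUID id = UUID.randomUUID();
                workspaceProjects.add(id);
                return id;
            }));
        }
        return Optional.empty(); // neither provided: no project scoping
    }

    public static void main(String[] args) {
        System.out.println(resolveProjectIdOrCreate(
                null, null, new java.util.HashSet<>(), new java.util.HashMap<>()));
    }
}
```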

* Revision 8: Add OpenAPI schema descriptions for Optimization project_id/project_name

Matches the existing descriptions on Experiment, making the auto-create
and precedence semantics visible in the generated API docs.

* [OPIK-4938] [BE] Fix trailing blank line in migration 000074
#5691)

* [OPIK-5019] [BE] feat: add LLM model registry service and API endpoint

Add a YAML-based model registry that loads supported LLM models at
startup from a classpath resource, with optional local override file
for self-hosted customers. Expose via GET /v1/private/llm/models.

- LlmModelDefinition record with id, qualifiedName, structuredOutput, reasoning
- LlmModelRegistryService loads/merges/caches from YAML
- LlmModelsResource REST endpoint
- llm-models-default.yaml with 525 models across 5 providers
- 52 reasoning models tagged (OpenAI o-series, DeepSeek R1, QwQ, :thinking)
- 9 unit tests covering load, merge, override, reload, immutability
- Guice wiring in LlmModule, config in OpikConfiguration + config.yml

No changes to existing routing or frontend — additive only.

Implements OPIK-5019: [BE] Add LLM model registry service and API endpoint
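
The merge semantics for the optional override file can be sketched roughly as below: override entries replace defaults by model id, new ids are appended, and blank ids are skipped. Type and method names are assumptions for illustration, not the registry's actual API:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ModelRegistryMergeSketch {
    record ModelDef(String id, boolean structuredOutput, boolean reasoning) {}

    static List<ModelDef> merge(List<ModelDef> defaults, List<ModelDef> overrides) {
        Map<String, ModelDef> byId = new LinkedHashMap<>();
        for (ModelDef d : defaults) {
            if (d.id() != null && !d.id().isBlank()) byId.put(d.id(), d);
        }
        if (overrides != null) { // guard null/empty override list from malformed YAML
            for (ModelDef o : overrides) {
                if (o.id() != null && !o.id().isBlank()) byId.put(o.id(), o);
            }
        }
        return List.copyOf(byId.values()); // immutable result
    }

    public static void main(String[] args) {
        System.out.println(merge(
                List.of(new ModelDef("gpt-4o", true, false)),
                List.of(new ModelDef("gpt-4o", true, true))).size()); // 1
    }
}
```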

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(registry): narrow reload() catch and guard null lists in merge

- Catch only UncheckedIOException | IllegalStateException in reload()
  instead of broad Exception
- Skip null/empty override lists in merge() to prevent NPE from
  malformed customer YAML

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(registry): make LlmModelRegistryService self-injectable via Guice

Integration tests that disable LlmModule (e.g. AutomationRuleEvaluatorsResourceTest,
ManualEvaluationResourceTest) failed because vyarus auto-config discovered
LlmModelsResource but had no binding for LlmModelRegistryService.

Move from @Provides in LlmModule to @Inject @Singleton on the service itself,

with a package-private constructor for unit tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(registry): address PR review feedback

- Remove @JsonProperty annotations that broke global snake_case convention in HTTP responses
- Move LlmModelDefinition to com.comet.opik.api (response DTO, not infrastructure)
- Add null/blank id guard in merge() for both default and override entries
- Add @NonNull on merge() parameters
- Add // visible for testing comment on package-private constructor
- Add comment on volatile explaining scheduled refresh intent (OPIK-5020)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(registry): deep-copy provider lists for full immutability

Map.copyOf() only makes the outer map immutable; the List values from
Jackson are mutable ArrayLists. Added immutable() helper to wrap each
list with List.copyOf() in the load() fast paths.
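
The pitfall fixed here is easy to reproduce: `Map.copyOf` makes only the outer map immutable, while mutable `List` values remain shared references. A minimal demonstration, with a `deep` helper standing in for the `immutable()` helper mentioned above:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ImmutabilitySketch {
    // Shallow: outer map is immutable, but List values are shared references.
    static Map<String, List<String>> shallow(Map<String, List<String>> m) {
        return Map.copyOf(m);
    }

    // Deep: wrap each value with List.copyOf so later mutation cannot leak in.
    static Map<String, List<String>> deep(Map<String, List<String>> m) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        m.forEach((k, v) -> out.put(k, List.copyOf(v)));
        return Map.copyOf(out);
    }

    public static void main(String[] args) {
        List<String> models = new ArrayList<>(List.of("gpt-4o"));
        Map<String, List<String>> src = Map.of("openai", models);
        Map<String, List<String>> s = shallow(src);
        Map<String, List<String>> d = deep(src);
        models.add("o3"); // mutate the original list after copying
        System.out.println(s.get("openai").size()); // 2 -- mutation leaked through
        System.out.println(d.get("openai").size()); // 1 -- deep copy unaffected
    }
}
```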

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fixed formatting

---------

Co-authored-by: Andrei Căutișanu <andreicautisanu@Andreis-MacBook-Pro.local>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* [OPIK-4891] [BE] Data retention policy enforcement job

- Add retention rule CRUD endpoints and scheduled enforcement job
- Delete only traces and spans (children first); feedback scores and
  comments are left as lightweight orphans (~3% of storage)
- 3-day sliding window [cutoff-3d, cutoff) for incremental processing
- Experiment exclusion: traces/spans linked to experiments are protected
  via NOT IN subquery with allow_nondeterministic_mutations=1
- Cutoff normalized to start-of-day UTC using InstantToUUIDMapper
- Two delete patterns: applyToPast=true (simple IN) and applyToPast=false
  (per-workspace OR conditions with max(minId, cutoff-3d))
- UUID v7 range partitioning splits workspace space across N fractions/day
- Distributed locking via LockService for multi-instance safety
- lightweight_deletes_sync=1 ensures mutations complete before returning
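
The window arithmetic above can be sketched with plain `java.time`: the cutoff is normalized to start-of-day UTC, and deletes scan the sliding window `[cutoff - N days, cutoff)`. Method names here are illustrative, not the job's real API:

```java
import java.time.Duration;
import java.time.Instant;
import java.time.temporal.ChronoUnit;

public class RetentionWindowSketch {
    // Subtract the retention period, then truncate to start-of-day UTC.
    static Instant cutoff(Instant now, int retentionDays) {
        return now.minus(Duration.ofDays(retentionDays))
                  .truncatedTo(ChronoUnit.DAYS);
    }

    // Lower bound of the incremental-processing window [start, cutoff).
    static Instant windowStart(Instant cutoff, int slidingWindowDays) {
        return cutoff.minus(Duration.ofDays(slidingWindowDays));
    }

    public static void main(String[] args) {
        Instant c = cutoff(Instant.parse("2026-03-16T10:30:00Z"), 30);
        System.out.println(c);                  // 2026-02-14T00:00:00Z
        System.out.println(windowStart(c, 3));  // 2026-02-11T00:00:00Z
    }
}
```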

* Make sliding window days configurable via retention.slidingWindowDays

* Bump migration to 000059, fix slidingWindowDays @Min(1)

* Fix test failures and address PR review comments

- Fix RetentionRulesResourceTest: applyToPast default is now true
- Fix RetentionPolicyServiceTest: remove feedback_scores/comments
  assertions (not part of retention deletion), add Awaitility waits
  for ClickHouse async write consistency
- Make organizationLevel write-only in RetentionRule (excluded from
  read responses since it's only used on create)
- Wrap log placeholders in single quotes per codebase convention

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…5788)

* [OPIK-4938] [BE] Add project_name support to dataset, experiment and optimization endpoints

- Add project_name field to DatasetItemBatch and propagate it through DatasetItemService to scope dataset items to a project
- Resolve project_name to project_id in DatasetService and ExperimentService when creating datasets/experiments
- Add project_id column to optimizations table via Liquibase migration (000073)
- Expose project_id (read-only) and project_name (write-only) on Optimization model
- Fix NullPointerException in OptimizationService when project_name is null by using AbstractMap.SimpleEntry instead of Map.entry
- Add integration tests for project-scoped dataset creation in DatasetsResourceTest, ExperimentsResourceTest, and OptimizationsResourceTest

* Revision 2: Add ProjectOptimizationsResource for project-scoped optimization listing

* Revision 3: Add project_id filter to OptimizationsResource and integration tests for ProjectOptimizationsResource

* [OPIK-4938] [BE] Fix DatasetItemBatch project resolution and centralize test factory calls

Fix Reactor empty Mono bug in DatasetItemService.getDatasetId() where batches
without projectId/projectName caused the flatMap to never execute (data loss).
Added switchIfEmpty to handle the null-project case properly.

Centralize factory.manufacturePojo(DatasetItemBatch.class) and
factory.manufacturePojo(DatasetItem.class) calls in DatasetResourceClient
to null out server-assigned fields (projectId, projectName, datasetId, etc.),
preventing PODAM-generated random UUIDs from causing 404 errors in tests.

* Revision 2: Address PR review comments

- Rename migration 000073 → 000074 to avoid prefix conflict with main
- Add trailing blank line to migration file per guidelines
- Remove @RequiredPermissions(EXPERIMENT_VIEW) from ProjectOptimizationsResource.find()
  to match unrestricted access pattern of the global endpoint
- Add dataset_name query param to ProjectOptimizationsResource.find()
  for parity with global /v1/private/optimizations endpoint
- Fix @Schema description on DatasetItemBatch.projectId (was "dataset_name must be
  provided", now "project_name must be provided")
- Use DatasetItemBatch builder instead of positional constructor in
  DatasetExportJobSubscriberResourceTest

* Revision 3: Fix insertInvalidDatasetItemWorkspace test failure

Use DatasetResourceClient helpers and null out datasetId to avoid PODAM
generating random UUIDs that cause 404s in DatasetItemService resolution.

* Revision 4: Simplify resolveProjectId in OptimizationService

Inline context accesses inside fromCallable lambda, consistent with
DatasetItemService.resolveProjectId pattern.

* Revision 5: Support projectId on Optimization write + validate on upsert

Remove READ_ONLY from Optimization.projectId so callers can pass it
directly. Add a resolveProjectId branch that validates the provided
projectId exists in the workspace before using it, mirroring the
DatasetItemBatch projectId/projectName duality.

* Revision 6: Clarify projectName/projectId as optional in DatasetItemBatch schema

Both fields are optional (both null = no project scoping). Update @Schema
descriptions to remove misleading "must be provided" language and describe
precedence rules instead.

* Revision 7: Extract resolveProjectIdOrCreate into ProjectService

Both OptimizationService and ExperimentService had identical inline logic
for resolving a project from (projectId, projectName): validate the id if
provided, getOrCreate from the name otherwise, return empty if neither.

The shared helper lives in ProjectService.resolveProjectIdOrCreate and uses
deferContextual so callers no longer need to extract workspaceId/userName
themselves. ExperimentService's logic is also aligned to projectId-first
priority, consistent with OptimizationService and DatasetItemService.

* Revision 8: Add OpenAPI schema descriptions for Optimization project_id/project_name

Matches the existing descriptions on Experiment, making the auto-create
and precedence semantics visible in the generated API docs.

* [OPIK-4938] [BE] Fix trailing blank line in migration 000074

* [OPIK-4938] [BE] Minor test and code cleanup from PR review feedback

- Merge duplicate test pairs in DatasetsResourceTest: tests that checked
  deprecated behavior and response headers now combined into single tests
- Add callRetrievePromptVersion helper to PromptResourceClient and use it
  in PromptResourceTest instead of raw HTTP calls
- Fix StringUtils.isNotEmpty → isNotBlank in OptimizationDAO ClickHouse
  row mapper for proper null/whitespace handling of project_id field

* [OPIK-4938] [BE] Remove unused import and use factory directly in DatasetsResourceTest

* [OPIK-4938] [BE] Use DatasetResourceClient factory methods instead of direct PODAM calls

* [OPIK-4938] [BE] Restore separate deprecation header tests in DatasetsResourceTest

* [OPIK-4938] [BE] Fix incorrect null assertion in twoSuiteExperiments test

computeRunSummaries populates summaries for any experiment that has
assertion results, regardless of group size. passThreshold defaults to 1
when no executionPolicy is set, so single-run experiments get independent
PASSED/FAILED summaries.

* Refactor project ID string check in OptimizationDAO

* [OPIK-4938] [BE] Restore FindProjectDatasets tests accidentally deleted in refactoring

Restores 6 integration tests for GET /v1/private/projects/{projectId}/datasets
that were accidentally deleted in a prior sed-based refactoring commit.

The tests cover: basic pagination, page size limiting, default sort by created
date, sorting by all valid fields (parameterized), filtering (parameterized),
and case-insensitive name search.

Also promotes getDatasets__whenFetchingAllDatasets__thenReturnDatasetsSortedByByValidFields
and getValidFilters to static in FindDatasets so they can be shared as
external @MethodSource providers.

---------

Co-authored-by: Andres Cruz <andresc@comet.com>
…onfiguration (#5782)

* [OPIK-5189] [FE] fix: render ChatPrompt messages in optimizer trial configuration

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address baz review — extract shared helper, segment-aware key matching, deduplicate isMessagesArray

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: scope optimizer meta filtering to hasStructuredPrompt

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: apply ChatPrompt rendering fixes to V2 components

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…d cleaner structure (#5800)

* OPIK-4897 reset to global execution policy

* [OPIK-5032] [FE] fix: eval suite experiment export with assertions and cleaner structure

* [OPIK-5032] [FE] fix: prettier lint and policy override logic

- Remove extra parentheses in DatasetItemsActionsPanel (prettier)
- Simplify policyChanged to `policy != null` so disabling global
  policy always persists the override, even when values match defaults
…ing (#5802)

Implements the v2 sidebar with project-first navigation as part of the
IA Revamp (OPIK-4617).

Route tree:
- All feature routes moved under /$ws/projects/$projectId/...
- V1 compat splat redirects catch old workspace-level URLs
- SDK redirects (RedirectProjects, RedirectDatasets) unchanged

Active project:
- activeProjectId stored in AppStore (single source of truth)
- useActiveProject hook syncs localStorage/API fallback → store
- ProjectPage syncs URL $projectId → store

Sidebar:
- Project selector dropdown with search, edit/delete actions
- Grouped menu sections: Observability, Evaluation, Prompt engineering,
  Optimization, Production
- Workspace section at bottom with workspace selector + Dashboards +
  Configuration
- SupportHub moved to TopBar
- Insights and Agent configuration shown as disabled (no route yet)

Internal links:
- ~40 navigate()/Link references updated to project-scoped paths
- Shared hooks (useSuiteIdFromURL, usePromptIdFromURL, etc.) duplicated
  in v2 with project-scoped from: paths
- useNavigateToExperiment and useLoadPlayground duplicated in v2
- ResourceLink uses projectUrl field for version-aware URL resolution
- matchRoute/useParams from: patterns updated for project-scoped routes

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…code-review (#5783)

* [OPIK-4688] [INFRA] Auto-trigger FERN update on BE merge and notify #code-review

- Trigger workflow on push to main when BE Java/OpenAPI/pom files change
- Add concurrency group to prevent parallel runs
- Extract merge author and originating PR for traceability
- Add PR body context linking back to the triggering BE merge
- Add notify-slack job that posts to #code-review via SLACK_WEBHOOK_URL_CODE_REVIEW
- Include Slack user mention mapping for author tagging
- Fail the workflow if Slack notification doesn't send successfully

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [OPIK-4688] [INFRA] Add contents:read permission for originating PR lookup

The gh API call to find the originating PR needs contents:read
permission at the job level.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [OPIK-4688] [INFRA] Add pull_request trigger for testing workflow

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [OPIK-4688] [INFRA] Remove job-level permissions that blocked branch creation

The remote-branch-action and create-pull-request steps need
contents:write via the default GITHUB_TOKEN. The explicit
contents:read restriction was blocking the push.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [OPIK-4688] [INFRA] Temporarily allow notify-slack on pull_request for testing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [OPIK-4688] [INFRA] Add auto FERN update on BE merge with Slack notification

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Revision 2: Fix jq syntax — use quoted keys for JSON objects

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Revision 3: Fix jq — wrap blocks array in parentheses for concatenation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Revision 4: Address PR comments — refine path filter and remove pull_request trigger

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Revision 5: Extract trigger context into separate job with scoped permissions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Revision 6: Add explicit permissions per job and skip Slack post for testing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Revision 7: Remove temporary PR testing triggers and Slack skip

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
)

* [OPIK-4518] [BE] Restore stable dataset item IDs in versioned API

After dataset versioning was introduced, the old API endpoints started
returning version-specific row IDs (changing per version) as the `id`
field instead of the stable `dataset_item_id`. This broke user flows
where items would get new IDs with each new version snapshot.

Fix: expose `dataset_item_id AS id` in all versioned item queries so
the `id` field is stable across versions. Update streaming pagination
cursor to use `dataset_item_id` instead of row `id`. Remove the
row-ID-to-dataset-item-ID mapping logic in the service since incoming
`id` values are now already stable `dataset_item_id`s.

* fix(backend): clean up review issues in stable dataset item IDs

- Remove unused workspaceId param from getDatasetItemWorkspace DAO
  (query is intentionally unscoped for cross-workspace validation)
- Fix allMatch validation gap in ExperimentItemService: verify all
  requested item IDs were found before checking workspace ownership
- Remove identity map indirection from editItemsViaSelectInsert now
  that IDs are stable dataset_item_ids

* fix(backend): renumber migration files to avoid conflicts with main

000062 → 000065, 000063 → 000066 (skip indexes and projection for
dataset_item_id). Updated changeset IDs inside files to match.

* fix(backend): clarify DatasetItemEdit.id is the stable dataset_item_id

* Fix warnings on updated code

* feat(backend): resolve experiment dataset_item_id in trace queries

Resolve old physical row IDs to stable dataset_item_ids in the
experiments_agg CTE using a targeted LEFT JOIN to
dataset_item_versions. No dedup needed since id and dataset_item_id
are immutable columns.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore(backend): remove manual data migration for experiment items

The dual-join approach handles both old (physical row ID) and new
(stable dataset_item_id) experiment items at query time, making
the manual migration unnecessary.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(backend): dual-join all experiment-item queries for backward compat

All queries joining experiment_items.dataset_item_id with
dataset_item_versions now match on both physical row id AND stable
dataset_item_id, so old experiment items (storing row IDs) and new
ones (storing stable IDs) both resolve correctly without requiring
a data migration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(backend): renumber migrations 000065/000066 → 000070/000071

Avoid collision with main's 000065-000069 added since last rebase.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(backend): renumber migrations 000070/000071 → 000073/000074

Main now has 000070-000072, so bump to next available.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(backend): use if() instead of COALESCE for FixedString LEFT JOIN

ClickHouse FixedString(36) columns return null bytes (not SQL NULL)
on LEFT JOIN miss, so COALESCE never falls through. Use if(div.id != '')
to check whether the join actually matched.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore(backend): remove ClickHouse projection migration

Defer the dataset_item_id projection to a follow-up. The skip indexes
(000073) provide sufficient coverage for now. The projection can be
added later if large-version performance requires it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* perf(backend): use arrayJoin to avoid duplicate CTE evaluation

Consolidate OR-expanded filter conditions that referenced the same CTE
twice into single arrayJoin([col1, col2]) calls. Each CTE is now
evaluated exactly once per usage site.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(backend): rename skip indexes to follow naming convention

Use idx_{table}_{column} pattern: idx_dataset_item_versions_dataset_item_id_bf
and idx_dataset_item_versions_dataset_item_id_minmax.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(backend): restore optimized SELECT_EXPERIMENT_ITEMS_OUTPUT_COLUMNS

Revert to main's simpler query that resolves trace_ids directly from
experiment_items without joining dataset_item_versions. The dual-join
treatment is unnecessary here — output column discovery only needs
trace_ids, not dataset item mapping.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(backend): add two-level dedup to aggregated CTEs, fix naming/javadoc

- Add LIMIT 1 BY dataset_item_id to dataset_items_agg_resolved,
  dataset_items_aggr_resolved, and ExperimentAggregatesDAO's
  dataset_item_versions_resolved CTEs. Without this, the OR-condition
  joins could match one experiment item to multiple dataset item rows
  from different versions, inflating groupArray results.
- Remove orphaned Javadoc from deleted validateMappingsBelongToSameDataset
- Rename deletedRowIds to deletedIds (now holds stable IDs, not row IDs)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(backend): fix stats query item_agg filter for dual-ID compat

The dataset_item_filters in item_agg selected only physical id,
missing new experiment items that store stable dataset_item_id.
Use arrayJoin([id, dataset_item_id]) and add two-level dedup.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(backend): renumber migration 000073 → 000074

Main added 000073_add_minmax_index_trace_threads_last_updated_at.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(backend): unify CTE column aliases to dataset_item_id AS id, id AS row_id

All aggregated CTEs (dataset_items_agg_resolved, dataset_items_aggr_resolved,
dataset_item_versions_resolved) now use the same convention as
dataset_items_resolved: dataset_item_id AS id (stable), id AS row_id (physical).
Updated all join conditions, arrayJoin filters, and GROUP BY references.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test(backend): add multi-dataset rejection tests for delete and batch update

Verify that delete and batch update requests with item IDs spanning
multiple datasets (without explicit datasetId) return 400.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
… returns 100%) - hotfix (#5805)

* [OPIK-5219] [BE] Fix pass_rate query to read from assertion_results instead of feedback_scores

The GET_PASS_RATE_AGGREGATION query in ExperimentAggregatesDAO was reading
from feedback_scores to determine run pass/fail, but assertion scores with
category_name="suite_assertion" are routed exclusively to the assertion_results
table by FeedbackScoreService. This caused every run to have no scores in
feedback_scores, defaulting to "passed" and producing 100% pass rate.

Replaced feedback_scores_combined/feedback_scores_final CTEs with
assertion_results_final using the same ROW_NUMBER() deduplication pattern
as ExperimentDAO.java.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [OPIK-5219] [BE] Fix flaky test: use shared workspace for pass rate aggregation test

The test was creating a new workspace per run, but populateAggregations
silently returned empty when getExperimentData couldn't find the
experiment in the freshly-created workspace (ClickHouse timing).
Use the static shared workspace and createExperimentItemWithData helper
matching all other passing tests in this class.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions github-actions bot added documentation Improvements or additions to documentation python Pull requests that update Python code tests Including test files, or tests related like configuration. Python SDK TypeScript SDK labels Mar 25, 2026