[OPIK-5242] [OPIK-5248] [FE] fix: evaluation suite save failures and Result column truncation #5844

Open
alexkuzmik wants to merge 3 commits into main from aliaksandrk/OPIK-5242-evaluation-suite-save-bugs
Conversation

@alexkuzmik alexkuzmik commented Mar 25, 2026

Details

Fixes three bugs in the evaluation suite UI:

OPIK-5242 Case 1 — Editing a suite created with only a name: buildPayload threw because suite.latest_version was undefined. Now detects the no-version state and performs a two-step save: creates an initial metadata-only version (evaluators + execution policy) first, then applies item changes using the returned version ID as base.
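A minimal sketch of the two-step save for a suite with no version, not the actual component code. The types, the `saveSuiteChanges` helper, and the synchronous `save` callback are hypothetical stand-ins for the real mutation; only `latest_version`, `base_version`, `tags`, and `changeDescription` come from this PR.

```typescript
// Hypothetical shapes; the real Opik frontend types are not shown in this PR.
type Version = { id: string };
type Suite = { name: string; latest_version?: Version };
type Meta = { tags?: string[]; changeDescription?: string };
type Payload = Meta & { base_version: string | null; items: string[] };

// Synchronous stand-in for the save mutation; returns the created version.
type SaveFn = (payload: Payload) => Version;

function saveSuiteChanges(
  suite: Suite,
  items: string[],
  save: SaveFn,
  meta: Meta = {},
): Version {
  if (suite.latest_version) {
    // A version already exists: single-step save on top of it.
    return save({ ...meta, base_version: suite.latest_version.id, items });
  }
  // Step 1: create a metadata-only initial version (no items, base_version null).
  const initial = save({ ...meta, base_version: null, items: [] });
  // Step 2: apply item changes using the returned version ID as the base.
  return save({ ...meta, base_version: initial.id, items });
}
```

The metadata is passed to both calls so that the version the user ends up with carries the tags and change description they entered.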

OPIK-5242 Case 2 — Creating a suite with items + assertions/policy at once: Items were uploaded before evaluation criteria, so by the time applyEvaluationCriteria ran with base_version: null, the dataset already had a version (created by the item upload), causing a backend 400 rejection. Swapped the order to match the Python SDK: apply criteria first (creates the initial version), then upload items.
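The ordering problem can be reproduced with a toy in-memory model. `FakeDataset` and both `createSuite*` functions are hypothetical and only mimic the behaviour described above: item upload creates a version, and criteria sent with `base_version: null` are rejected once any version exists.

```typescript
// Toy in-memory backend, assumed for illustration only.
class FakeDataset {
  versions: string[] = [];

  // Uploading items implicitly creates a dataset version.
  uploadItems(items: string[]): void {
    if (items.length > 0) {
      this.versions.push(`items-v${this.versions.length + 1}`);
    }
  }

  // base_version null is only accepted while the dataset has no version.
  applyCriteria(baseVersion: string | null): void {
    if (baseVersion === null && this.versions.length > 0) {
      throw new Error("400: base_version is null but a version already exists");
    }
    this.versions.push(`criteria-v${this.versions.length + 1}`);
  }
}

// Old order: items first, so the criteria call is rejected.
function createSuiteOldOrder(ds: FakeDataset, items: string[]): void {
  ds.uploadItems(items);
  ds.applyCriteria(null);
}

// New order: criteria create the initial version, then items are uploaded.
function createSuiteNewOrder(ds: FakeDataset, items: string[]): void {
  ds.applyCriteria(null);
  ds.uploadItems(items);
}
```

With at least one item, `createSuiteOldOrder` throws in this model while `createSuiteNewOrder` completes and leaves two versions.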

OPIK-5248 — Result column truncated when all columns enabled: The pinned "Result" column in the experiment items table had no minSize, allowing it to be compressed to unreadable widths. Added minSize: 140 to match the column's default size.
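To illustrate why a missing `minSize` lets the column collapse, here is a simplified width calculation. It is not how the table library lays out columns; `ColumnSizing` and `fitColumns` are hypothetical, and only the `size` and `minSize: 140` values come from this PR.

```typescript
// Hypothetical column sizing shape, loosely modelled on size/minSize options.
type ColumnSizing = { id: string; size: number; minSize?: number };

// Shrinks columns proportionally to fit the available width,
// but never below a column's minSize (the table may then overflow).
function fitColumns(
  columns: ColumnSizing[],
  available: number,
): Record<string, number> {
  const total = columns.reduce((sum, c) => sum + c.size, 0);
  const scale = Math.min(1, available / total);
  const widths: Record<string, number> = {};
  for (const c of columns) {
    widths[c.id] = Math.max(c.minSize ?? 0, Math.floor(c.size * scale));
  }
  return widths;
}
```

In this model, a "Result" column with `size: 140` and no `minSize` is halved when the available width is half the total, whereas `minSize: 140` keeps it at 140.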

Change checklist

  • User facing
  • Documentation update

Issues

  • OPIK-5242
  • OPIK-5248

AI-WATERMARK

AI-WATERMARK: yes

  • If yes:
    • Tools: Claude Code (VS Code extension)
    • Model(s): Claude Opus 4.6
    • Scope: Bug analysis, fix implementation, PR creation
    • Human verification: Code review of all changes before commit

Testing

  • TypeScript type check passes: npx tsc --noEmit — no errors
  • ESLint passes on all 4 changed files
  • Manual testing scenarios for OPIK-5242:
    1. Create evaluation suite with just a name → edit to add items/assertions/policy → Save changes → should succeed (two-step save)
    2. Create evaluation suite with name + CSV items + assertions + execution policy → should save everything (criteria applied before items)
    3. Create evaluation suite with name + assertions/policy but no CSV → should save criteria (no change from working path)
    4. Edit existing suite with version → Save changes → should work as before (single-step save, no regression)
  • Manual testing for OPIK-5248:
    1. Open experiment items tab for an evaluation suite experiment
    2. Enable all columns via the columns button
    3. Verify the "Result" column remains readable (shows "Passed"/"Failed" fully)

Documentation

N/A — Bug fixes only, no new configuration or features.

…ersion and creation flows

Fix two bugs in evaluation suite saving:

1. Editing a suite created with only a name would fail because
   buildPayload threw when latest_version was missing. Now handles
   no-version suites with a two-step save: create metadata version
   first, then apply item changes.

2. Creating a suite with items + assertions/policy failed because
   items were uploaded before criteria, causing the backend to reject
   base_version=null on a dataset that already had versions. Swapped
   the order to match the Python SDK: apply criteria first, then
   upload items.

Implements OPIK-5242: [Evaluation suite] - issues during creating evaluation suite via UI

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comment on lines 200 to +211
const handleSaveChanges = (tags?: string[], changeDescription?: string) => {
  if (changesMutation.isPending) return;

  if (hasNoVersion) {
    // Two-step save: create initial metadata version first, then apply
    // item changes on top of it. The backend requires base_version=null
    // with no items for the first version.
    changesMutation.mutate(
      buildInitialVersionPayload({ tags, changeDescription }),
      {
        onSuccess: (initialVersion) => {
          const itemPayload = buildPayload({
When hasNoVersion is true, the second buildPayload call omits tags/changeDescription. Should we pass them so the final version preserves user metadata?

Finding type: Logical Bugs | Severity: 🔴 High


Prompt for AI Agents:

Before applying, verify this suggestion against the current code. In
apps/opik-frontend/src/v2/pages/EvaluationSuiteItemsPage/EvaluationSuiteItemsPage.tsx
around lines 200 to 225, the handleSaveChanges function (the hasNoVersion branch) builds
an itemPayload with buildPayload({ baseVersionOverride: initialVersion?.id }) but does
not pass the tags and changeDescription from the Add Version dialog. Refactor this by
calling buildPayload({ baseVersionOverride: initialVersion?.id, tags, changeDescription
}) so the second mutation includes the same metadata, ensuring the final version
preserves tags and changeDescription provided by the user. Keep the existing
conflict/onError handling the same.

Commit 4864e55 addressed this comment by passing the tags and changeDescription into the second buildPayload call in the hasNoVersion flow, so the final version preserves the metadata from the Add Version dialog and the existing conflict handling remains unchanged.

@alexkuzmik alexkuzmik added the test-environment Deploy Opik adhoc environment label Mar 25, 2026
@github-actions
🔄 Test environment deployment process has started

Phase 1: Deploying base version 1.10.49 (from main branch) if environment doesn't exist
Phase 2: Building new images from PR branch aliaksandrk/OPIK-5242-evaluation-suite-save-bugs
Phase 3: Will deploy newly built version after build completes

You can monitor the progress here.

@alexkuzmik alexkuzmik removed the test-environment Deploy Opik adhoc environment label Mar 25, 2026
…dPayload call

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@alexkuzmik
Re: baz-reviewer comment about missing tags/changeDescription in the second buildPayload call —

Fixed in 4864e55 — now passing tags and changeDescription to the second buildPayload call so the final version preserves user metadata.

🤖 Reply posted via /address-github-pr-comments

baz-reviewer bot commented Mar 25, 2026

Got it, and thanks for addressing that—I'll save this to memory after the PR is merged so I can better recognize this pattern in future reviews.

@alexkuzmik alexkuzmik added the test-environment Deploy Opik adhoc environment label Mar 25, 2026
@CometActions
Test environment is now available!

To configure additional environment variables for your environment, run the [Deploy Opik AdHoc Environment workflow](https://github.com/comet-ml/comet-deployment/actions/workflows/deploy_opik_adhoc_env.yaml).

Access Information

The deployment has completed successfully and the version has been verified.

…columns are enabled

Add minSize: 140 to the pinned Result column so it cannot be
compressed below its content width when all columns are visible.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@alexkuzmik alexkuzmik changed the title from "[OPIK-5242] [FE] fix: resolve evaluation suite save failures for no-version and creation flows" to "[OPIK-5242] [OPIK-5248] [FE] fix: evaluation suite save failures and Result column truncation" Mar 25, 2026
@alexkuzmik alexkuzmik added test-environment Deploy Opik adhoc environment and removed test-environment Deploy Opik adhoc environment labels Mar 25, 2026
@comet-ml comet-ml deleted a comment from github-actions bot Mar 25, 2026
@comet-ml comet-ml deleted a comment from github-actions bot Mar 25, 2026
@CometActions
Test environment deployment failed

The deployment encountered an error. Please check the deployment logs for details.

@alexkuzmik alexkuzmik marked this pull request as ready for review March 25, 2026 13:55
@alexkuzmik alexkuzmik requested a review from a team as a code owner March 25, 2026 13:55
@CometActions
🌙 Nightly cleanup: The test environment for this PR (pr-5844) has been cleaned up to free cluster resources. PVCs are preserved — re-deploy to restore the environment.

@CometActions CometActions removed the test-environment Deploy Opik adhoc environment label Mar 26, 2026
@awkoy awkoy left a comment

LGTM!
