
[NA] [SDK] CLI import/export improvements#5616

Draft
dsblank wants to merge 28 commits into main from feature/cli-import-export-improvements

Conversation


@dsblank dsblank commented Mar 11, 2026


This PR improves the opik import and opik export CLI commands across several dimensions: new bulk commands, round-trip fidelity, correctness fixes, a major performance improvement for large re-exports, and OQL trace filtering across all export commands.

New: --filter on opik export experiment and opik export all

The OQL filter flag (previously only on opik export project) now works on all commands that export traces.

# Export project traces after a given date (existing)
opik export my-workspace project "my-project" --filter 'created_at >= "2024-01-01T00:00:00Z"'

# Export experiment traces after a given date (new)
opik export my-workspace experiment "my-experiment" --filter 'created_at >= "2024-01-01T00:00:00Z"'

# Export all traces in a date range across the whole workspace (new)
opik export my-workspace all --filter 'created_at >= "2024-01-01T00:00:00Z" AND created_at < "2024-04-01T00:00:00Z"'

For project exports the filter is evaluated server-side (unchanged). For experiment exports and the experiments phase of all, traces are fetched by pre-known IDs, so the filter is applied client-side after each trace is retrieved, using a new matches_trace_filter() helper in utils.py that evaluates OQL expressions against trace dicts (it handles date_time, number, and string fields, including nested key access).
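A minimal sketch of this kind of client-side evaluation, simplified to take an already-parsed (field, operator, value) triple rather than a raw OQL string; the trace dict shape and helper name mirror the description above, but this is an illustration, not the PR's actual implementation:

```python
from datetime import datetime
from typing import Any, Dict


def matches_trace_filter(trace: Dict[str, Any], field: str, op: str, value: str) -> bool:
    """Evaluate one comparison against a trace dict, with dotted access
    into nested keys and date_time / number / string handling."""
    current: Any = trace
    for part in field.split("."):
        if not isinstance(current, dict) or part not in current:
            return False  # a missing field never matches
        current = current[part]

    if field.endswith("_at"):
        # date_time fields: normalise the ISO-8601 "Z" suffix for fromisoformat
        left: Any = datetime.fromisoformat(str(current).replace("Z", "+00:00"))
        right: Any = datetime.fromisoformat(value.replace("Z", "+00:00"))
    elif isinstance(current, (int, float)):
        left, right = current, type(current)(value)
    else:
        left, right = str(current), value

    return {
        "=": left == right,
        ">": left > right,
        "<": left < right,
        ">=": left >= right,
        "<=": left <= right,
    }[op]
```

For example, `matches_trace_filter(trace, "created_at", ">=", "2024-01-01T00:00:00Z")` keeps only traces created on or after that instant, matching the CLI examples above.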

New: opik export <workspace> all and opik import <workspace> all

Both CLI groups now have an all subcommand that operates on every data type in a workspace (datasets, prompts, projects/traces, and experiments) in a single command.

# Export everything
opik export my-workspace all

# Export only selected types
opik export my-workspace all --include datasets,prompts

# Import everything (resumes automatically if interrupted)
opik import my-workspace all

# Dry-run to preview what would be imported
opik import my-workspace all --dry-run

New: Resumable, incremental project exports (ExportManifest)

Re-running opik export … project on a large project previously required paginating through every trace via the API. A new ExportManifest (SQLite/WAL, stored as export_manifest.db alongside the trace files) tracks download state across runs:

| Scenario | Behaviour |
| --- | --- |
| Second run, nothing new | Adds a `created_at >= last_exported_at − 5 min` filter; fetches 0–1 API pages |
| Aborted / interrupted run | Status remains `in_progress`; next run loads already-downloaded IDs from the DB and skips them |
| No manifest yet | Scans the directory once and seeds the manifest so old files are not re-downloaded |
| `--force` | Resets the manifest and re-downloads everything |
| Format switch (json ↔ csv) | Detects the mismatch, resets the manifest, and re-downloads in the new format |
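The manifest described above needs only a small amount of SQLite plumbing. A rough sketch under stated assumptions (the real schema, status values, and method names in the PR may differ):

```python
import sqlite3
from pathlib import Path


class ExportManifest:
    """Illustrative SQLite/WAL download manifest stored alongside trace files."""

    def __init__(self, project_dir: Path, filename: str = "export_manifest.db"):
        project_dir.mkdir(parents=True, exist_ok=True)
        # isolation_level=None -> autocommit, so each write is durable immediately
        self.conn = sqlite3.connect(str(project_dir / filename), isolation_level=None)
        self.conn.execute("PRAGMA journal_mode=WAL")
        self.conn.execute("CREATE TABLE IF NOT EXISTS traces (trace_id TEXT PRIMARY KEY)")
        self.conn.execute("CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT)")
        self.conn.execute("INSERT OR IGNORE INTO meta VALUES ('status', 'in_progress')")

    def mark_downloaded(self, trace_id: str) -> None:
        self.conn.execute("INSERT OR IGNORE INTO traces VALUES (?)", (trace_id,))

    def load_downloaded_set(self) -> set:
        return {row[0] for row in self.conn.execute("SELECT trace_id FROM traces")}

    def complete(self, last_exported_at: str) -> None:
        self.conn.execute("INSERT OR REPLACE INTO meta VALUES ('status', 'completed')")
        self.conn.execute(
            "INSERT OR REPLACE INTO meta VALUES ('last_exported_at', ?)",
            (last_exported_at,),
        )

    def reset(self) -> None:
        # --force and format-switch paths: forget everything and start over
        self.conn.execute("DELETE FROM traces")
        self.conn.execute("INSERT OR REPLACE INTO meta VALUES ('status', 'not_started')")
```

On the next run, `load_downloaded_set()` replaces per-trace filesystem checks, and `last_exported_at` seeds the incremental `created_at` filter.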

Fix: PromptType case sensitivity during import

Exported prompt types are uppercased ("MUSTACHE", "JINJA2") but the PromptType enum uses lowercase values. The importer now tries the lowercased value first, then the original value, and defaults to mustache for unknown values.
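The fallback chain can be sketched as follows; the enum members here are assumed from the description, not copied from the SDK:

```python
from enum import Enum


class PromptType(str, Enum):
    # values assumed from the description: lowercase, unlike the exported form
    MUSTACHE = "mustache"
    JINJA2 = "jinja2"


def parse_prompt_type(raw: str) -> PromptType:
    """Try the lowercased value first, then the raw value, then default
    to mustache for unknown types."""
    for candidate in (raw.lower(), raw):
        try:
            return PromptType(candidate)
        except ValueError:
            continue
    return PromptType.MUSTACHE
```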

Fix: Thread-safe project name cache in experiment export

A race condition in export_traces_by_ids allowed two threads hitting the same project_id cache miss to each issue a redundant get_project API call. The cache is now guarded with a threading.Lock plus dict.setdefault.
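The pattern is simple enough to sketch in isolation (an illustrative reconstruction; the class name and `fetch` callable are hypothetical stand-ins for the SDK's get_project call):

```python
import threading
from typing import Callable, Dict


class ProjectNameCache:
    """Race-free lazy cache: hold a lock around the check-then-fetch so
    only one thread issues the API call for a given key."""

    def __init__(self, fetch: Callable[[str], str]):
        self._fetch = fetch  # stands in for the get_project API call
        self._names: Dict[str, str] = {}
        self._lock = threading.Lock()

    def get(self, project_id: str) -> str:
        with self._lock:
            if project_id not in self._names:
                # setdefault means the first writer wins even if the
                # locked region is ever narrowed to just the insertion
                self._names.setdefault(project_id, self._fetch(project_id))
            return self._names[project_id]
```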

Fix: Retry on 429 and honour Retry-After header

429 Too Many Requests is now retried; a Retry-After header causes the client to wait exactly that long before retrying.
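A hand-rolled sketch of that wait logic, assuming a response object with `.status_code` and `.headers` (the PR itself ends up delegating this to a shared tenacity-based retry decorator, so treat this as an illustration of the behaviour, not the shipped code):

```python
import time

RETRYABLE_STATUS_CODES = {429, 500, 502, 503}


def request_with_retry(send, max_attempts=5, base_delay=2.0, cap=60.0):
    """Retry wrapper that honours Retry-After (capped at `cap` seconds)
    before falling back to exponential backoff. `send` is any zero-arg
    callable returning an object with .status_code and .headers."""
    for attempt in range(1, max_attempts + 1):
        response = send()
        if response.status_code not in RETRYABLE_STATUS_CODES or attempt == max_attempts:
            return response
        retry_after = response.headers.get("Retry-After")
        delay = (
            min(float(retry_after), cap)  # wait exactly what the server asks, capped
            if retry_after is not None
            else min(base_delay * 2 ** (attempt - 1), cap)
        )
        time.sleep(delay)
```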

Improvement: Preserve read-only fields across import/export round-trips

Fields such as created_at, created_by, last_updated_at, and last_updated_by are stored under _import_* keys in metadata so the information survives a round-trip.
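The stashing step on export can be sketched like this (the helper name is hypothetical; only the `_import_*` key convention and field list come from the description above):

```python
READ_ONLY_FIELDS = ("created_at", "created_by", "last_updated_at", "last_updated_by")


def stash_read_only_fields(record: dict) -> dict:
    """Copy server-managed fields into metadata under _import_* keys so
    the original values survive an export/import round-trip."""
    metadata = dict(record.get("metadata") or {})
    for field in READ_ONLY_FIELDS:
        if field in record:
            metadata[f"_import_{field}"] = record[field]
    return {**record, "metadata": metadata}
```

On import, the corresponding reader would look these keys up in metadata rather than trying to set the server-managed fields directly.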

Issues

N/A

Testing

  • opik export <workspace> all exports datasets, prompts, projects, and experiments to the expected directory layout
  • opik import <workspace> all imports the exported tree back; verify counts match
  • Re-running opik export … project on a project with no new traces completes in seconds and reports all traces skipped
  • Killing an export mid-run and re-running resumes from where it left off without re-downloading completed traces
  • Importing a prompt exported as type "MUSTACHE" succeeds without a warning
  • A 429 response from the API is retried; a Retry-After: 5 header causes a ~5 s wait before the retry
  • opik export … experiment --filter 'created_at >= "2024-01-01T00:00:00Z"' exports only matching traces
  • opik export … all --filter 'created_at >= "2024-01-01T00:00:00Z"' filters traces in both projects and experiments phases

Documentation

  • Updated import_export_commands.mdx: documents --filter on experiment and all, adds opik export all options, adds date-range filter examples

Change checklist

  • I have considered the possibility of backwards incompatible changes and mitigated them if necessary
  • I have added tests where required
  • All pre-existing tests pass

@github-actions github-actions bot added python Pull requests that update Python code Python SDK labels Mar 11, 2026
@dsblank dsblank changed the title [SDK] CLI import/export improvements [NA] [SDK] CLI import/export improvements Mar 11, 2026
@comet-ml comet-ml deleted a comment from github-actions bot Mar 11, 2026
@comet-ml comet-ml deleted a comment from github-actions bot Mar 11, 2026
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Mar 11, 2026

github-actions bot commented Mar 11, 2026

🌿 Preview your docs: https://opik-preview-71792b5e-9ac1-41a2-b203-07856414d62b.docs.buildwithfern.com/docs/opik

No broken links found


📌 Results for commit 11ef58c

dsblank pushed a commit that referenced this pull request Mar 11, 2026
- Wrap _fetch_experiments_page_raw HTTP call in try/except for
  httpx.ConnectError, httpx.TimeoutException, and HTTPStatusError;
  return empty page on transient errors instead of propagating the
  exception as a hard failure.

- Verify trace file exists on disk before treating a manifest entry as
  already-downloaded; stale entries (file deleted after manifest was
  written) now trigger a re-download instead of being silently skipped.

- Track had_errors in export_traces (API error break + write exceptions)
  and only call manifest.complete() when had_errors is False, so
  interrupted exports stay in_progress and resume correctly.

- Gate the export_experiment_by_id fast path on the experiment JSON file
  actually existing on disk; reset manifest and fall through to a full
  re-export when the file was deleted.

- Fall back to _scan_downloaded_trace_ids() in export_traces_by_ids when
  the manifest exists but load_downloaded_set() returns an empty set, so
  pre-existing trace files are detected before any API calls are made.

- Extract duplicate _validate_include logic into cli/utils.py
  validate_include(); both exports/all.py and imports/all.py now
  delegate to the shared helper.

- Extract duplicate "cap → print → export_traces_by_ids" sequence into
  _export_collected_trace_ids() in experiment.py; reuse it in all.py.

- Update test_cli_changes.py for the new (exported, skipped, had_errors)
  return signature of export_traces.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions github-actions bot added the tests Including test files, or tests related like configuration. label Mar 11, 2026
@github-actions

SDK E2E Tests Results

0 tests   0 ✅  0s ⏱️
0 suites  0 💤
0 files    0 ❌

Results for commit 27465ac.

dsblank pushed a commit that referenced this pull request Mar 11, 2026
- Move matches_trace_filter to exports/trace_filter.py (avoid catch-all utils)
- Log warning on ValueError instead of silently returning True in matches_trace_filter
- Fix double-counting filtered traces in export_traces_by_ids progress bar
- Rename _export_collected_trace_ids → export_collected_trace_ids (public API)
- Move validate_include to cli/include_validation.py (avoid catch-all utils.py)
- Add threading.Semaphore backpressure to _export_all_experiments thread pool
- Extract extract_trace_id_from_filename helper; remove duplicated stem logic
- Rename TestBuildImportMetadata tests to test__case__expected convention
- Add tests for matches_trace_filter (trace_filter.py)
- Add tests for project inference from trace filenames in import

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

github-actions bot commented Mar 12, 2026

Python SDK Unit Tests Results (Python 3.11)

2 319 tests   2 319 ✅  46s ⏱️
    1 suites      0 💤
    1 files        0 ❌

Results for commit 6756408.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 12, 2026

Python SDK Unit Tests Results (Python 3.13)

2 319 tests   2 319 ✅  51s ⏱️
    1 suites      0 💤
    1 files        0 ❌

Results for commit 6756408.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 12, 2026

Python SDK Unit Tests Results (Python 3.12)

2 319 tests   2 319 ✅  51s ⏱️
    1 suites      0 💤
    1 files        0 ❌

Results for commit 6756408.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 12, 2026

Python SDK Unit Tests Results (Python 3.10)

2 319 tests   2 319 ✅  51s ⏱️
    1 suites      0 💤
    1 files        0 ❌

Results for commit 6756408.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 12, 2026

Python SDK Unit Tests Results (Python 3.14)

2 319 tests   2 319 ✅  36s ⏱️
    1 suites      0 💤
    1 files        0 ❌

Results for commit 6756408.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 12, 2026

Python SDK E2E Tests Results (Python 3.13)

244 tests   242 ✅  9m 0s ⏱️
  1 suites    2 💤
  1 files      0 ❌

Results for commit 6756408.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 12, 2026

Python SDK E2E Tests Results (Python 3.12)

244 tests   242 ✅  9m 54s ⏱️
  1 suites    2 💤
  1 files      0 ❌

Results for commit 6756408.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 12, 2026

Python SDK E2E Tests Results (Python 3.11)

244 tests   242 ✅  8m 50s ⏱️
  1 suites    2 💤
  1 files      0 ❌

Results for commit 6756408.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 12, 2026

Python SDK E2E Tests Results (Python 3.14)

244 tests   242 ✅  9m 3s ⏱️
  1 suites    2 💤
  1 files      0 ❌

Results for commit 6756408.

♻️ This comment has been updated with latest results.

Douglas Blank and others added 25 commits March 24, 2026 08:02
Exported prompt types are uppercased (e.g. "MUSTACHE") but the PromptType
enum uses lowercase values. Try lowercased value first, fall back to the
original, then default to MUSTACHE on unknown values.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two concurrent threads could race on the same project_id cache miss and
both issue redundant API calls. Adds a threading.Lock with setdefault so
the first writer wins. Also accepts a pre-fetched experiment_obj to skip
the redundant get_experiment_by_id call when the caller already has it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Introduces ExportManifest (export_manifest.db, SQLite/WAL) alongside each
project's trace directory to track download state across runs:

- Completed run: uses a created_at >= last_exported_at-5min filter so
  subsequent runs only fetch traces newer than the last export, reducing
  API pages from O(total traces) to O(new traces).
- Aborted run (status=in_progress): loads the already-downloaded set from
  the DB and resumes without re-downloading completed traces.
- No manifest yet: seeds from existing files on disk so pre-manifest
  downloads are not repeated.
- --force: resets the manifest and re-downloads everything.
- Format change (json<->csv): resets the manifest with a warning.

Also replaces per-trace Path.exists() calls with a single pre-scanned set
loaded once before the pagination loop.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds 429 (Too Many Requests) to the retryable status codes and introduces
a custom wait function that reads the server's Retry-After header (capped
at 60 s) before falling back to exponential backoff.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
prompt_obj was inferred as ChatPrompt | None from the first branch,
causing a type mismatch when the fallback assigned Prompt | None.
Explicitly annotate as Optional[Union[Prompt, ChatPrompt]].

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Phase 1 – filesystem scan: before submitting any work to the thread pool,
scan projects/*/trace_*.{json,csv} to build a set of already-downloaded
trace IDs and pre-filter the pending list.  Traces whose files exist never
trigger a get_trace_content() + search_spans() round-trip.

Phase 2 – per-experiment manifest: ExportManifest (filename param added)
is created per experiment under experiments/manifest_<id>.db.  After the
first successful export the full trace-ID list and downloaded set are
persisted.  On re-run the manifest short-circuits get_items() pagination
entirely and filters the pending list from the DB instead of the filesystem.
manifest.complete() is only written when failed_count == 0 so interrupted
runs resume correctly.

ExportManifest gains store_all_trace_ids() / get_all_trace_ids() and an
optional filename constructor param to support per-experiment manifests
alongside the existing per-project ones.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
export_experiment_by_id returns a 3-tuple (stats, file_written, manifest)
but _export_all_experiments was only unpacking 2 values, causing the
"too many values to unpack" error on opik export all.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extends OQL trace filtering (already available on `opik export project`)
to `opik export experiment` and `opik export all`.

For experiment exports, filtering is applied client-side after each trace
is fetched by ID, using a new `matches_trace_filter()` helper in utils.py
that evaluates OQL expressions against a trace dict (handles date_time,
number, and string fields including nested key access).

For `all` exports, the filter is forwarded to both the projects phase
(server-side, unchanged) and the experiments phase (client-side via the
same helper).

Examples:
  opik export ws experiment "my-exp" --filter 'created_at >= "2024-01-01T00:00:00Z"'
  opik export ws all --filter 'created_at >= "2024-01-01T00:00:00Z"'

Also updates import_export_commands.mdx to document the broader --filter
support and the `opik export all` command options.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Wrap _fetch_experiments_page_raw HTTP call in try/except for
  httpx.ConnectError, httpx.TimeoutException, and HTTPStatusError;
  return empty page on transient errors instead of propagating the
  exception as a hard failure.

- Verify trace file exists on disk before treating a manifest entry as
  already-downloaded; stale entries (file deleted after manifest was
  written) now trigger a re-download instead of being silently skipped.

- Track had_errors in export_traces (API error break + write exceptions)
  and only call manifest.complete() when had_errors is False, so
  interrupted exports stay in_progress and resume correctly.

- Gate the export_experiment_by_id fast path on the experiment JSON file
  actually existing on disk; reset manifest and fall through to a full
  re-export when the file was deleted.

- Fall back to _scan_downloaded_trace_ids() in export_traces_by_ids when
  the manifest exists but load_downloaded_set() returns an empty set, so
  pre-existing trace files are detected before any API calls are made.

- Extract duplicate _validate_include logic into cli/utils.py
  validate_include(); both exports/all.py and imports/all.py now
  delegate to the shared helper.

- Extract duplicate "cap → print → export_traces_by_ids" sequence into
  _export_collected_trace_ids() in experiment.py; reuse it in all.py.

- Update test_cli_changes.py for the new (exported, skipped, had_errors)
  return signature of export_traces.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two improvements to avoid slow full-page scans on resume/re-run:

1. Fast-path early exit in export_traces: after fetching page 1, compare
   the API's total trace count against len(already_downloaded).  If the
   API reports no more traces than we have locally, break immediately
   (1 API call) instead of scanning all remaining pages (O(total/100) calls).

2. After seeding a brand-new manifest from existing filesystem files, mark
   it completed immediately (using the newest file's mtime as the cutoff)
   so the incremental created_at filter kicks in for the current run, not
   just the next one.

Also fix the "No traces found" message to only print when both
exported_count and skipped_count are zero.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…est fast-path

- Replace sequential experiment loop in _export_all_experiments() with
  ThreadPoolExecutor(max_workers=10), matching how projects are exported.
  Trace IDs are collected from manifests in the main thread after each
  future completes, avoiding shared-set race conditions.
- Broaden the manifest fast-path in export_experiment_by_id() to fire
  whenever stored trace IDs exist + experiment file is on disk, not only
  when manifest status=completed. This also skips API calls for in_progress
  manifests (crashed runs that already fetched items).
- Set check_same_thread=False on ExportManifest SQLite connections so the
  main thread can safely read a manifest created by a worker thread after
  the worker has finished.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Parallel span fetches were hitting the server's rate limiter with no
retry logic, causing traces to be silently dropped. Now _fetch_spans
retries up to 5 times with exponential back-off (2s base) on HTTP 429,
and MAX_WORKERS is reduced from 20 to 10 to lower the burst rate.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
After importing 100k+ traces, the importer was silently stalling for
minutes opening and JSON-parsing every trace file just to extract the
trace ID for a project-name lookup. The ID is already encoded in the
filename (trace_<id>.json), so parse it from the stem instead.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
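That filename-stem optimisation is small enough to show directly; this is a hypothetical reconstruction of the helper (a later commit names it `extract_trace_id_from_filename`), under the assumption that trace files follow the `trace_<id>.json` / `trace_<id>.csv` layout described above:

```python
from pathlib import Path
from typing import Optional


def extract_trace_id_from_filename(path: Path) -> Optional[str]:
    """Recover the trace ID from a trace_<id>.json / trace_<id>.csv
    filename without opening and JSON-parsing the file."""
    stem = path.stem  # filename without the extension
    if not stem.startswith("trace_"):
        return None
    return stem[len("trace_"):] or None
```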
- Move matches_trace_filter to exports/trace_filter.py (avoid catch-all utils)
- Log warning on ValueError instead of silently returning True in matches_trace_filter
- Fix double-counting filtered traces in export_traces_by_ids progress bar
- Rename _export_collected_trace_ids → export_collected_trace_ids (public API)
- Move validate_include to cli/include_validation.py (avoid catch-all utils.py)
- Add threading.Semaphore backpressure to _export_all_experiments thread pool
- Extract extract_trace_id_from_filename helper; remove duplicated stem logic
- Rename TestBuildImportMetadata tests to test__case__expected convention
- Add tests for matches_trace_filter (trace_filter.py)
- Add tests for project inference from trace filenames in import

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
57 tests covering ExportManifest/MigrationManifest lifecycle (not_started
→ in_progress → completed → reset), trace/file tracking with batch flush,
import_all manifest resume/incremental/force/dry-run behavior, export_all
phase filtering and stats aggregation, and _paginate pagination logic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…snippets

PR review requested one-line intent/trigger context and either a canonical
source link or a hand-maintenance note for each new snippet block. Split the
monolithic export/import Examples code blocks into grouped sub-sections, each
preceded by a sentence explaining when to reach for that command. Added MDX
comment blocks recording that the snippets are hand-maintained, pointing to
`sdks/python/src/opik/cli/` as the canonical source, and noting to re-verify
against `--help` when CLI options change.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds 16 new tests covering number comparisons, is_empty/is_not_empty,
not_contains/starts_with/ends_with, dotted and key-based field access,
missing field → False, AND semantics across multiple expressions, naive
datetime treated as UTC, and unparseable date_time value → False.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…etry

Remove the bespoke 429-only sleep/retry loop and dead `raise RuntimeError("unreachable")` in favour of the shared `@opik_rest_retry` tenacity decorator, which also handles 5xx and network errors, honours Retry-After headers, and re-raises the original exception on exhaustion.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Clarify that a ValueError from OpikQueryLanguage causes the function to
return True (keep the trace) rather than False, explain why (avoid silent
data loss), and note that a warning is logged so callers can detect it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Deadlock: _export_all_experiments used a semaphore to cap concurrent
submissions, but released it in the as_completed loop (Loop 2) which
only runs after the submission loop (Loop 1) finishes.  With more than
max_workers*2 (20) experiments the main thread blocks on acquire() in
Loop 1 forever, since Loop 2 never starts.  Fixed by attaching a
done_callback to each future so the slot is released as soon as the
worker finishes, decoupling release from the collection loop.

Secondary fast-path: when an experiment JSON file already exists but
the manifest has no stored all_trace_ids (exported before the per-
experiment manifest feature, or an interrupted run), the export now
reads trace IDs directly from the JSON rather than calling
get_items() via the API.  On the next run all previously-seen
experiments will skip get_items() entirely.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
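The done-callback fix above boils down to a reusable bounded-submission pattern. A sketch under stated assumptions (generic task/worker names; the real code caps slots at max_workers*2 per experiment, as described):

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_bounded(tasks, worker, max_workers=10):
    """Submit tasks with backpressure: the semaphore slot is released in
    a done-callback as soon as each worker finishes, so the submission
    loop never deadlocks waiting on the collection loop."""
    slots = threading.Semaphore(max_workers * 2)
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = []
        for task in tasks:
            slots.acquire()  # blocks once 2*max_workers tasks are pending
            future = executor.submit(worker, task)
            future.add_done_callback(lambda _f: slots.release())
            futures.append(future)
        for future in as_completed(futures):
            results.append(future.result())
    return results
```

Releasing in `as_completed` instead of the callback reproduces the deadlock: with more than `max_workers * 2` tasks, `acquire()` in the submission loop blocks forever because the collection loop has not started yet.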
- Use module-style import for retry_decorator in project.py (style guide)
- Narrow bare `except Exception` to `(OSError, json.JSONDecodeError)` in
  the JSON fast-path fallback so KeyboardInterrupt/SystemExit propagate
- Add module-level `import json` and remove redundant inline import
- Add unit tests for the JSON-based fast path in export_experiment_by_id:
  verifies get_items() is skipped and trace IDs are read from disk, and
  that a corrupt JSON file falls through to the full API path
- Clarify the semaphore done-callback comment in all.py (remove "Loop 2"
  reference, name the as_completed loop explicitly)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… test

- experiment.py: upgrade JSON fast-path fallback from silent debug_print
  to always-visible console warning with stack trace in --debug mode
  (thread 2924741570)
- test_cli_changes.py: add TestExportAllExperimentsSemaphore unit test
  that submits N > max_workers*2 experiments to verify the done_callback
  releases the semaphore and prevents deadlock (thread 2924741577)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Extract _fetch_traces_page() with @opik_rest_retry so pagination
  retries on 429/5xx instead of swallowing errors and breaking the loop
- Extract _validate_filter_syntax() + _print_oql_examples() to replace
  duplicate ~15-line validation blocks in export_traces and
  export_project_by_name
- Fix missing had_errors = True in the span-fetch future exception
  handler, which previously let the manifest be marked completed even
  when some traces failed to download

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…lready-imported datasets

When running `opik import all`, Phase 1 imports datasets and the manifest
marks them completed. Phase 4 (experiments) calls the same importer and
the manifest skips them (datasets_skipped=1, datasets=0). The warning
"No datasets were imported" fired on datasets==0 without checking skipped,
producing a misleading message even though the dataset was already present.

Now the warning only fires when both imported==0 and skipped==0. The
status message also distinguishes between fresh imports and skipped ones.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Guard both _validate_filter_syntax calls with filter_string.strip()
so that a whitespace-only --filter argument is treated as a no-op
instead of crashing the OQL parser with an IndexError.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@dsblank dsblank force-pushed the feature/cli-import-export-improvements branch from 6756408 to 9716ab3 Compare March 24, 2026 12:02
Comment on lines +81 to +85
def _open(self) -> sqlite3.Connection:
    self.project_dir.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(
        str(self.manifest_file),
        isolation_level=None,

ExportManifest._open duplicates MigrationManifest._open's WAL/synchronous setup and schema creation, should we extract the shared manifest plumbing into a base class or helper?

Finding type: Code Dedup and Conventions | Severity: 🟢 Low



Comment on lines +222 to +233
# Fast-path: on the first page, compare the API's total trace count
# against how many we already have. If the API reports no more traces
# than we've already downloaded, every trace is present locally and we
# can skip scanning the remaining pages entirely.
if current_page == 1 and not force and already_downloaded:
    api_total = getattr(trace_page, "total", None)
    if api_total is not None and api_total <= len(already_downloaded):
        skipped_count = api_total
        if debug:
            debug_print(
                f"DEBUG: Fast-path exit — API total={api_total}, "
                f"already_downloaded={len(already_downloaded)}; "

Should we make the fast-path guard compare trace_page.total only against downloads matching the current filter or skip the fast-path on filter change instead of comparing to project-wide len(already_downloaded)?

Finding type: Logical Bugs | Severity: 🔴 High



Prompt for AI Agents:

Before applying, verify this suggestion against the current code. In
sdks/python/src/opik/cli/exports/project.py around lines 222 to 237, the fast-path in
export_traces incorrectly compares trace_page.total against a project-wide
already_downloaded set even when a new filter is applied. Change the guard so the
fast-path is only executed when the current parsed_filter is None OR when a manifest is
provided and the manifest explicitly records the same filter (or filter fingerprint) as
the current parsed_filter; otherwise skip the fast-path. Concretely, check manifest is
not None and manifest.get_filter() == parsed_filter before using api_total <=
len(already_downloaded); if that condition fails, do not break early and continue normal
pagination. This avoids skipping work when the filter changed.

Comment on lines +288 to +299
with ThreadPoolExecutor(max_workers=max_workers) as executor:
    future_to_project = {
        executor.submit(
            export_single_project,
            client,
            project,
            projects_dir,
            filter_string,
            max_results,
            force,
            debug,
            format,

ThreadPoolExecutor(max_workers=max_workers) is fed the entire all_projects list at once, should we limit pending submissions by wrapping submit in a bounded semaphore or submitting in small batches?

Finding type: Prefer non-blocking concurrency | Severity: 🔴 High



Prompt for AI Agents:

Before applying, verify this suggestion against the current code. In
sdks/python/src/opik/cli/exports/all.py around lines 288 to 305, the code in
_export_all_projects currently submits every project to ThreadPoolExecutor via a dict
comprehension which queues a future for every project at once (unbounded work queue).
Refactor by replacing the comprehension with an explicit submission loop that uses a
bounded threading.Semaphore (e.g. semaphore = threading.Semaphore(max_workers * 2)),
call semaphore.acquire() before each executor.submit(), add a done callback on the
future to semaphore.release(), and store future->project in the mapping as you go. This
will apply backpressure and prevent unbounded queued tasks; keep the existing
as_completed consumption logic unchanged.

Comment on lines +304 to +315
for future in as_completed(future_to_project):
    project = future_to_project[future]
    try:
        proj_count, t_exported, t_skipped = future.result()
        projects_exported += proj_count
        traces_exported += t_exported
        traces_skipped += t_skipped
    except Exception as e:
        console.print(
            f"[red]Error exporting project '{project.name}': {e}[/red]"
        )
    finally:

_export_all_projects and _export_all_experiments swallow exceptions from future.result() and only print, allowing opik export … all to exit 0 despite failures — should we re-raise or let errors bubble so the CLI fails fast?

Finding type: Centralize error handling | Severity: 🔴 High



Prompt for AI Agents:

Before applying, verify this suggestion against the current code. In
sdks/python/src/opik/cli/exports/all.py around lines 304 to 321, the
_export_all_projects function currently swallows exceptions from future.result() — it
only logs errors and continues, which allows the CLI to exit 0 even when exports failed.
Refactor by introducing a local had_errors boolean (or capture the first exception)
outside the as_completed loop, set it to True (or store the exception) inside the except
Exception block, and after the as_completed loop but before returning, raise a
RuntimeError (or re-raise the captured exception) if had_errors is True so the CLI fails
non‑zero on worker failures. Apply the same pattern to the _export_all_experiments
export loop (the analogous try/except around future.result()), ensuring both functions
propagate fatal worker errors instead of silently logging them.

@dsblank dsblank marked this pull request as draft March 24, 2026 12:13
…es_by_ids

Filtered traces (dropped by client-side matches_trace_filter) were counted
as fetch failures in batch_fetch_failures, inflating failed_count and
preventing manifest.complete() from being called even when there were no
actual errors.

Track batch_filter_skipped separately and exclude those from the failure
count so manifests are correctly completed after filtered exports.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment on lines +324 to +327
batch_filter_skipped = 0
for fetch_future in as_completed(fetch_futures):
    trace_id = fetch_futures[fetch_future]
    try:

batch_fetch_failures treats _fetch_trace_data None results as failures and blocks manifest.complete(), should we treat those None results as skips and exclude them from failed_count?

Finding type: Logical Bugs | Severity: 🔴 High



Prompt for AI Agents:

Before applying, verify this suggestion against the current code. In
sdks/python/src/opik/cli/exports/experiment.py around lines 324-351, inside
export_traces_by_ids, the code treats _fetch_trace_data returning None as a fetch
failure and later includes those in failed_count, preventing manifest.complete() from
running. Change the fetch-collection logic so that when a fetch future returns None it
increments a local 'batch_filter_skipped' (or similar) and is NOT added to
fetched_traces and NOT counted as a fetch failure; compute batch_fetch_failures as
(len(batch_trace_ids) - len(fetched_traces) - batch_filter_skipped) so only true fetch
exceptions are counted, and ensure failed_count is increased only by those real fetch
exceptions. This will allow manifest.complete() to run when only
project-metadata-missing traces were skipped.

Comment on lines +344 to +345
f"[red]Error fetching trace {trace_id}: {e}[/red]"
)

fetch_future.result() is caught and only printed under debug — should we log or rethrow so failures aren't silently swallowed?

Finding type: Centralize error handling | Severity: 🟠 Medium



Prompt for AI Agents:

Before applying, verify this suggestion against the current code. In
sdks/python/src/opik/cli/exports/experiment.py around lines 341 to 345 inside the
export_traces_by_ids function (the fetch-as_completed loop), the current except
Exception block only prints errors when debug is true, causing fetch failures to be
silently swallowed. Change this block to always log the failure at error (or warning)
level including the trace_id and exception details (use the existing console or a module
logger), and increment the outer failed_count to record a fetch failure for centralized
monitoring. Do not re-raise the exception so other fetches continue, but ensure the
message is emitted even when debug is False and includes enough context (trace_id,
exception type/message, and optionally a stacktrace) for diagnostics.

