
[NA] [SDK] CLI import/export improvements#5616

Draft
dsblank wants to merge 28 commits into main from feature/cli-import-export-improvements

Conversation


@dsblank dsblank commented Mar 11, 2026


This PR improves the opik import and opik export CLI commands across several dimensions: new bulk commands, round-trip fidelity, correctness fixes, a major performance improvement for large re-exports, and OQL trace filtering across all export commands.

New: --filter on opik export experiment and opik export all

The OQL filter flag (previously only on opik export project) now works on all commands that export traces.

# Export project traces after a given date (existing)
opik export my-workspace project "my-project" --filter 'created_at >= "2024-01-01T00:00:00Z"'

# Export experiment traces after a given date (new)
opik export my-workspace experiment "my-experiment" --filter 'created_at >= "2024-01-01T00:00:00Z"'

# Export all traces in a date range across the whole workspace (new)
opik export my-workspace all --filter 'created_at >= "2024-01-01T00:00:00Z" AND created_at < "2024-04-01T00:00:00Z"'

For project exports the filter is evaluated server-side (unchanged). For experiment exports and the experiments phase of all, traces are fetched by pre-known IDs, so the filter is applied client-side after each trace is retrieved, using a new matches_trace_filter() helper in utils.py that evaluates OQL expressions against trace dicts (it handles date_time, number, and string fields, including nested key access).
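A minimal sketch of this kind of client-side evaluation, simplified to take an already-parsed (field, operator, value) triple rather than a raw OQL string; the trace dict shape and helper name mirror the description above, but this is an illustration, not the PR's actual implementation:

```python
from datetime import datetime
from typing import Any, Dict


def matches_trace_filter(trace: Dict[str, Any], field: str, op: str, value: str) -> bool:
    """Evaluate one comparison against a trace dict, with dotted access
    into nested keys and date_time / number / string handling."""
    current: Any = trace
    for part in field.split("."):
        if not isinstance(current, dict) or part not in current:
            return False  # a missing field never matches
        current = current[part]

    if field.endswith("_at"):
        # date_time fields: normalise the ISO-8601 "Z" suffix for fromisoformat
        left: Any = datetime.fromisoformat(str(current).replace("Z", "+00:00"))
        right: Any = datetime.fromisoformat(value.replace("Z", "+00:00"))
    elif isinstance(current, (int, float)):
        left, right = current, type(current)(value)
    else:
        left, right = str(current), value

    return {
        "=": left == right,
        ">": left > right,
        "<": left < right,
        ">=": left >= right,
        "<=": left <= right,
    }[op]
```

For example, `matches_trace_filter(trace, "created_at", ">=", "2024-01-01T00:00:00Z")` keeps only traces created on or after that instant, matching the CLI examples above.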

New: opik export <workspace> all and opik import <workspace> all

Both CLI groups now have an all subcommand that operates on every data type in a workspace (datasets, prompts, projects/traces, and experiments) in a single command.

# Export everything
opik export my-workspace all

# Export only selected types
opik export my-workspace all --include datasets,prompts

# Import everything (resumes automatically if interrupted)
opik import my-workspace all

# Dry-run to preview what would be imported
opik import my-workspace all --dry-run

New: Resumable, incremental project exports (ExportManifest)

Re-running opik export … project on a large project previously required paginating through every trace via the API. A new ExportManifest (SQLite/WAL, stored as export_manifest.db alongside the trace files) tracks download state across runs:

| Scenario | Behaviour |
| --- | --- |
| Second run, nothing new | Adds a `created_at >= last_exported_at − 5 min` filter; fetches 0–1 API pages |
| Aborted / interrupted run | Status remains `in_progress`; next run loads already-downloaded IDs from the DB and skips them |
| No manifest yet | Scans the directory once and seeds the manifest so old files are not re-downloaded |
| `--force` | Resets the manifest and re-downloads everything |
| Format switch (json ↔ csv) | Detects the mismatch, resets the manifest, and re-downloads in the new format |
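The manifest described above needs only a small amount of SQLite plumbing. A rough sketch under stated assumptions (the real schema, status values, and method names in the PR may differ):

```python
import sqlite3
from pathlib import Path


class ExportManifest:
    """Illustrative SQLite/WAL download manifest stored alongside trace files."""

    def __init__(self, project_dir: Path, filename: str = "export_manifest.db"):
        project_dir.mkdir(parents=True, exist_ok=True)
        # isolation_level=None -> autocommit, so each write is durable immediately
        self.conn = sqlite3.connect(str(project_dir / filename), isolation_level=None)
        self.conn.execute("PRAGMA journal_mode=WAL")
        self.conn.execute("CREATE TABLE IF NOT EXISTS traces (trace_id TEXT PRIMARY KEY)")
        self.conn.execute("CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT)")
        self.conn.execute("INSERT OR IGNORE INTO meta VALUES ('status', 'in_progress')")

    def mark_downloaded(self, trace_id: str) -> None:
        self.conn.execute("INSERT OR IGNORE INTO traces VALUES (?)", (trace_id,))

    def load_downloaded_set(self) -> set:
        return {row[0] for row in self.conn.execute("SELECT trace_id FROM traces")}

    def complete(self, last_exported_at: str) -> None:
        self.conn.execute("INSERT OR REPLACE INTO meta VALUES ('status', 'completed')")
        self.conn.execute(
            "INSERT OR REPLACE INTO meta VALUES ('last_exported_at', ?)",
            (last_exported_at,),
        )

    def reset(self) -> None:
        # --force and format-switch paths: forget everything and start over
        self.conn.execute("DELETE FROM traces")
        self.conn.execute("INSERT OR REPLACE INTO meta VALUES ('status', 'not_started')")
```

On the next run, `load_downloaded_set()` replaces per-trace filesystem checks, and `last_exported_at` seeds the incremental `created_at` filter.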

Fix: PromptType case sensitivity during import

Exported prompt types are uppercased ("MUSTACHE", "JINJA2") but the PromptType enum uses lowercase values. The importer now tries the lowercased value first, then the original value, and defaults to mustache for unknown values.
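The fallback chain can be sketched as follows; the enum members here are assumed from the description, not copied from the SDK:

```python
from enum import Enum


class PromptType(str, Enum):
    # values assumed from the description: lowercase, unlike the exported form
    MUSTACHE = "mustache"
    JINJA2 = "jinja2"


def parse_prompt_type(raw: str) -> PromptType:
    """Try the lowercased value first, then the raw value, then default
    to mustache for unknown types."""
    for candidate in (raw.lower(), raw):
        try:
            return PromptType(candidate)
        except ValueError:
            continue
    return PromptType.MUSTACHE
```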

Fix: Thread-safe project name cache in experiment export

A race condition in export_traces_by_ids allowed two threads hitting the same project_id cache miss to each issue a redundant get_project API call. The cache is now guarded with a threading.Lock plus dict.setdefault.
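The pattern is simple enough to sketch in isolation (an illustrative reconstruction; the class name and `fetch` callable are hypothetical stand-ins for the SDK's get_project call):

```python
import threading
from typing import Callable, Dict


class ProjectNameCache:
    """Race-free lazy cache: hold a lock around the check-then-fetch so
    only one thread issues the API call for a given key."""

    def __init__(self, fetch: Callable[[str], str]):
        self._fetch = fetch  # stands in for the get_project API call
        self._names: Dict[str, str] = {}
        self._lock = threading.Lock()

    def get(self, project_id: str) -> str:
        with self._lock:
            if project_id not in self._names:
                # setdefault means the first writer wins even if the
                # locked region is ever narrowed to just the insertion
                self._names.setdefault(project_id, self._fetch(project_id))
            return self._names[project_id]
```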

Fix: Retry on 429 and honour Retry-After header

429 Too Many Requests is now retried; a Retry-After header causes the client to wait exactly that long before retrying.
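A hand-rolled sketch of that wait logic, assuming a response object with `.status_code` and `.headers` (the PR itself ends up delegating this to a shared tenacity-based retry decorator, so treat this as an illustration of the behaviour, not the shipped code):

```python
import time

RETRYABLE_STATUS_CODES = {429, 500, 502, 503}


def request_with_retry(send, max_attempts=5, base_delay=2.0, cap=60.0):
    """Retry wrapper that honours Retry-After (capped at `cap` seconds)
    before falling back to exponential backoff. `send` is any zero-arg
    callable returning an object with .status_code and .headers."""
    for attempt in range(1, max_attempts + 1):
        response = send()
        if response.status_code not in RETRYABLE_STATUS_CODES or attempt == max_attempts:
            return response
        retry_after = response.headers.get("Retry-After")
        delay = (
            min(float(retry_after), cap)  # wait exactly what the server asks, capped
            if retry_after is not None
            else min(base_delay * 2 ** (attempt - 1), cap)
        )
        time.sleep(delay)
```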

Improvement: Preserve read-only fields across import/export round-trips

Fields such as created_at, created_by, last_updated_at, and last_updated_by are stored under _import_* keys in metadata so the information survives a round-trip.
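The stashing step on export can be sketched like this (the helper name is hypothetical; only the `_import_*` key convention and field list come from the description above):

```python
READ_ONLY_FIELDS = ("created_at", "created_by", "last_updated_at", "last_updated_by")


def stash_read_only_fields(record: dict) -> dict:
    """Copy server-managed fields into metadata under _import_* keys so
    the original values survive an export/import round-trip."""
    metadata = dict(record.get("metadata") or {})
    for field in READ_ONLY_FIELDS:
        if field in record:
            metadata[f"_import_{field}"] = record[field]
    return {**record, "metadata": metadata}
```

On import, the corresponding reader would look these keys up in metadata rather than trying to set the server-managed fields directly.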

Issues

N/A

Testing

  • opik export <workspace> all exports datasets, prompts, projects, and experiments to the expected directory layout
  • opik import <workspace> all imports the exported tree back; verify counts match
  • Re-running opik export … project on a project with no new traces completes in seconds and reports all traces skipped
  • Killing an export mid-run and re-running resumes from where it left off without re-downloading completed traces
  • Importing a prompt exported as type "MUSTACHE" succeeds without a warning
  • A 429 response from the API is retried; a Retry-After: 5 header causes a ~5 s wait before the retry
  • opik export … experiment --filter 'created_at >= "2024-01-01T00:00:00Z"' exports only matching traces
  • opik export … all --filter 'created_at >= "2024-01-01T00:00:00Z"' filters traces in both projects and experiments phases

Documentation

  • Updated import_export_commands.mdx: documents --filter on experiment and all, adds opik export all options, adds date-range filter examples

Change checklist

  • I have considered the possibility of backwards incompatible changes and mitigated them if necessary
  • I have added tests where required
  • All pre-existing tests pass

@github-actions github-actions bot added python Pull requests that update Python code Python SDK labels Mar 11, 2026
@dsblank dsblank changed the title [SDK] CLI import/export improvements [NA] [SDK] CLI import/export improvements Mar 11, 2026
@comet-ml comet-ml deleted a comment from github-actions bot Mar 11, 2026
@comet-ml comet-ml deleted a comment from github-actions bot Mar 11, 2026
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Mar 11, 2026

github-actions bot commented Mar 11, 2026

🌿 Preview your docs: https://opik-preview-71792b5e-9ac1-41a2-b203-07856414d62b.docs.buildwithfern.com/docs/opik

No broken links found


📌 Results for commit 11ef58c

dsblank pushed a commit that referenced this pull request Mar 11, 2026
- Wrap _fetch_experiments_page_raw HTTP call in try/except for
  httpx.ConnectError, httpx.TimeoutException, and HTTPStatusError;
  return empty page on transient errors instead of propagating the
  exception as a hard failure.

- Verify trace file exists on disk before treating a manifest entry as
  already-downloaded; stale entries (file deleted after manifest was
  written) now trigger a re-download instead of being silently skipped.

- Track had_errors in export_traces (API error break + write exceptions)
  and only call manifest.complete() when had_errors is False, so
  interrupted exports stay in_progress and resume correctly.

- Gate the export_experiment_by_id fast path on the experiment JSON file
  actually existing on disk; reset manifest and fall through to a full
  re-export when the file was deleted.

- Fall back to _scan_downloaded_trace_ids() in export_traces_by_ids when
  the manifest exists but load_downloaded_set() returns an empty set, so
  pre-existing trace files are detected before any API calls are made.

- Extract duplicate _validate_include logic into cli/utils.py
  validate_include(); both exports/all.py and imports/all.py now
  delegate to the shared helper.

- Extract duplicate "cap → print → export_traces_by_ids" sequence into
  _export_collected_trace_ids() in experiment.py; reuse it in all.py.

- Update test_cli_changes.py for the new (exported, skipped, had_errors)
  return signature of export_traces.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions github-actions bot added the tests Including test files, or tests related like configuration. label Mar 11, 2026
@github-actions

SDK E2E Tests Results

0 tests   0 ✅  0s ⏱️
0 suites  0 💤
0 files    0 ❌

Results for commit 27465ac.

dsblank pushed a commit that referenced this pull request Mar 11, 2026
- Move matches_trace_filter to exports/trace_filter.py (avoid catch-all utils)
- Log warning on ValueError instead of silently returning True in matches_trace_filter
- Fix double-counting filtered traces in export_traces_by_ids progress bar
- Rename _export_collected_trace_ids → export_collected_trace_ids (public API)
- Move validate_include to cli/include_validation.py (avoid catch-all utils.py)
- Add threading.Semaphore backpressure to _export_all_experiments thread pool
- Extract extract_trace_id_from_filename helper; remove duplicated stem logic
- Rename TestBuildImportMetadata tests to test__case__expected convention
- Add tests for matches_trace_filter (trace_filter.py)
- Add tests for project inference from trace filenames in import

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

github-actions bot commented Mar 12, 2026

Python SDK Unit Tests Results (Python 3.11)

2 319 tests   2 319 ✅  46s ⏱️
    1 suites      0 💤
    1 files        0 ❌

Results for commit 6756408.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 12, 2026

Python SDK Unit Tests Results (Python 3.13)

2 319 tests   2 319 ✅  51s ⏱️
    1 suites      0 💤
    1 files        0 ❌

Results for commit 6756408.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 12, 2026

Python SDK Unit Tests Results (Python 3.12)

2 319 tests   2 319 ✅  51s ⏱️
    1 suites      0 💤
    1 files        0 ❌

Results for commit 6756408.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 12, 2026

Python SDK Unit Tests Results (Python 3.10)

2 319 tests   2 319 ✅  51s ⏱️
    1 suites      0 💤
    1 files        0 ❌

Results for commit 6756408.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 12, 2026

Python SDK Unit Tests Results (Python 3.14)

2 319 tests   2 319 ✅  36s ⏱️
    1 suites      0 💤
    1 files        0 ❌

Results for commit 6756408.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 12, 2026

Python SDK E2E Tests Results (Python 3.13)

244 tests   242 ✅  9m 0s ⏱️
  1 suites    2 💤
  1 files      0 ❌

Results for commit 6756408.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 12, 2026

Python SDK E2E Tests Results (Python 3.12)

244 tests   242 ✅  9m 54s ⏱️
  1 suites    2 💤
  1 files      0 ❌

Results for commit 6756408.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 12, 2026

Python SDK E2E Tests Results (Python 3.11)

244 tests   242 ✅  8m 50s ⏱️
  1 suites    2 💤
  1 files      0 ❌

Results for commit 6756408.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 12, 2026

Python SDK E2E Tests Results (Python 3.14)

244 tests   242 ✅  9m 3s ⏱️
  1 suites    2 💤
  1 files      0 ❌

Results for commit 6756408.

♻️ This comment has been updated with latest results.

Douglas Blank and others added 25 commits March 24, 2026 08:02
Exported prompt types are uppercased (e.g. "MUSTACHE") but the PromptType
enum uses lowercase values. Try lowercased value first, fall back to the
original, then default to MUSTACHE on unknown values.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two concurrent threads could race on the same project_id cache miss and
both issue redundant API calls. Adds a threading.Lock with setdefault so
the first writer wins. Also accepts a pre-fetched experiment_obj to skip
the redundant get_experiment_by_id call when the caller already has it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Introduces ExportManifest (export_manifest.db, SQLite/WAL) alongside each
project's trace directory to track download state across runs:

- Completed run: uses a created_at >= last_exported_at-5min filter so
  subsequent runs only fetch traces newer than the last export, reducing
  API pages from O(total traces) to O(new traces).
- Aborted run (status=in_progress): loads the already-downloaded set from
  the DB and resumes without re-downloading completed traces.
- No manifest yet: seeds from existing files on disk so pre-manifest
  downloads are not repeated.
- --force: resets the manifest and re-downloads everything.
- Format change (json<->csv): resets the manifest with a warning.

Also replaces per-trace Path.exists() calls with a single pre-scanned set
loaded once before the pagination loop.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds 429 (Too Many Requests) to the retryable status codes and introduces
a custom wait function that reads the server's Retry-After header (capped
at 60 s) before falling back to exponential backoff.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
prompt_obj was inferred as ChatPrompt | None from the first branch,
causing a type mismatch when the fallback assigned Prompt | None.
Explicitly annotate as Optional[Union[Prompt, ChatPrompt]].

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Phase 1 – filesystem scan: before submitting any work to the thread pool,
scan projects/*/trace_*.{json,csv} to build a set of already-downloaded
trace IDs and pre-filter the pending list.  Traces whose files exist never
trigger a get_trace_content() + search_spans() round-trip.

Phase 2 – per-experiment manifest: ExportManifest (filename param added)
is created per experiment under experiments/manifest_<id>.db.  After the
first successful export the full trace-ID list and downloaded set are
persisted.  On re-run the manifest short-circuits get_items() pagination
entirely and filters the pending list from the DB instead of the filesystem.
manifest.complete() is only written when failed_count == 0 so interrupted
runs resume correctly.

ExportManifest gains store_all_trace_ids() / get_all_trace_ids() and an
optional filename constructor param to support per-experiment manifests
alongside the existing per-project ones.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
export_experiment_by_id returns a 3-tuple (stats, file_written, manifest)
but _export_all_experiments was only unpacking 2 values, causing the
"too many values to unpack" error on opik export all.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extends OQL trace filtering (already available on `opik export project`)
to `opik export experiment` and `opik export all`.

For experiment exports, filtering is applied client-side after each trace
is fetched by ID, using a new `matches_trace_filter()` helper in utils.py
that evaluates OQL expressions against a trace dict (handles date_time,
number, and string fields including nested key access).

For `all` exports, the filter is forwarded to both the projects phase
(server-side, unchanged) and the experiments phase (client-side via the
same helper).

Examples:
  opik export ws experiment "my-exp" --filter 'created_at >= "2024-01-01T00:00:00Z"'
  opik export ws all --filter 'created_at >= "2024-01-01T00:00:00Z"'

Also updates import_export_commands.mdx to document the broader --filter
support and the `opik export all` command options.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Wrap _fetch_experiments_page_raw HTTP call in try/except for
  httpx.ConnectError, httpx.TimeoutException, and HTTPStatusError;
  return empty page on transient errors instead of propagating the
  exception as a hard failure.

- Verify trace file exists on disk before treating a manifest entry as
  already-downloaded; stale entries (file deleted after manifest was
  written) now trigger a re-download instead of being silently skipped.

- Track had_errors in export_traces (API error break + write exceptions)
  and only call manifest.complete() when had_errors is False, so
  interrupted exports stay in_progress and resume correctly.

- Gate the export_experiment_by_id fast path on the experiment JSON file
  actually existing on disk; reset manifest and fall through to a full
  re-export when the file was deleted.

- Fall back to _scan_downloaded_trace_ids() in export_traces_by_ids when
  the manifest exists but load_downloaded_set() returns an empty set, so
  pre-existing trace files are detected before any API calls are made.

- Extract duplicate _validate_include logic into cli/utils.py
  validate_include(); both exports/all.py and imports/all.py now
  delegate to the shared helper.

- Extract duplicate "cap → print → export_traces_by_ids" sequence into
  _export_collected_trace_ids() in experiment.py; reuse it in all.py.

- Update test_cli_changes.py for the new (exported, skipped, had_errors)
  return signature of export_traces.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two improvements to avoid slow full-page scans on resume/re-run:

1. Fast-path early exit in export_traces: after fetching page 1, compare
   the API's total trace count against len(already_downloaded).  If the
   API reports no more traces than we have locally, break immediately
   (1 API call) instead of scanning all remaining pages (O(total/100) calls).

2. After seeding a brand-new manifest from existing filesystem files, mark
   it completed immediately (using the newest file's mtime as the cutoff)
   so the incremental created_at filter kicks in for the current run, not
   just the next one.

Also fix the "No traces found" message to only print when both
exported_count and skipped_count are zero.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…est fast-path

- Replace sequential experiment loop in _export_all_experiments() with
  ThreadPoolExecutor(max_workers=10), matching how projects are exported.
  Trace IDs are collected from manifests in the main thread after each
  future completes, avoiding shared-set race conditions.
- Broaden the manifest fast-path in export_experiment_by_id() to fire
  whenever stored trace IDs exist + experiment file is on disk, not only
  when manifest status=completed. This also skips API calls for in_progress
  manifests (crashed runs that already fetched items).
- Set check_same_thread=False on ExportManifest SQLite connections so the
  main thread can safely read a manifest created by a worker thread after
  the worker has finished.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Parallel span fetches were hitting the server's rate limiter with no
retry logic, causing traces to be silently dropped. Now _fetch_spans
retries up to 5 times with exponential back-off (2s base) on HTTP 429,
and MAX_WORKERS is reduced from 20 to 10 to lower the burst rate.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
After importing 100k+ traces, the importer was silently stalling for
minutes opening and JSON-parsing every trace file just to extract the
trace ID for a project-name lookup. The ID is already encoded in the
filename (trace_<id>.json), so parse it from the stem instead.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
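That filename-stem optimisation is small enough to show directly; this is a hypothetical reconstruction of the helper (a later commit names it `extract_trace_id_from_filename`), under the assumption that trace files follow the `trace_<id>.json` / `trace_<id>.csv` layout described above:

```python
from pathlib import Path
from typing import Optional


def extract_trace_id_from_filename(path: Path) -> Optional[str]:
    """Recover the trace ID from a trace_<id>.json / trace_<id>.csv
    filename without opening and JSON-parsing the file."""
    stem = path.stem  # filename without the extension
    if not stem.startswith("trace_"):
        return None
    return stem[len("trace_"):] or None
```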
- Move matches_trace_filter to exports/trace_filter.py (avoid catch-all utils)
- Log warning on ValueError instead of silently returning True in matches_trace_filter
- Fix double-counting filtered traces in export_traces_by_ids progress bar
- Rename _export_collected_trace_ids → export_collected_trace_ids (public API)
- Move validate_include to cli/include_validation.py (avoid catch-all utils.py)
- Add threading.Semaphore backpressure to _export_all_experiments thread pool
- Extract extract_trace_id_from_filename helper; remove duplicated stem logic
- Rename TestBuildImportMetadata tests to test__case__expected convention
- Add tests for matches_trace_filter (trace_filter.py)
- Add tests for project inference from trace filenames in import

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
57 tests covering ExportManifest/MigrationManifest lifecycle (not_started
→ in_progress → completed → reset), trace/file tracking with batch flush,
import_all manifest resume/incremental/force/dry-run behavior, export_all
phase filtering and stats aggregation, and _paginate pagination logic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…snippets

PR review requested one-line intent/trigger context and either a canonical
source link or a hand-maintenance note for each new snippet block. Split the
monolithic export/import Examples code blocks into grouped sub-sections, each
preceded by a sentence explaining when to reach for that command. Added MDX
comment blocks recording that the snippets are hand-maintained, pointing to
`sdks/python/src/opik/cli/` as the canonical source, and noting to re-verify
against `--help` when CLI options change.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds 16 new tests covering number comparisons, is_empty/is_not_empty,
not_contains/starts_with/ends_with, dotted and key-based field access,
missing field → False, AND semantics across multiple expressions, naive
datetime treated as UTC, and unparseable date_time value → False.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…etry

Remove the bespoke 429-only sleep/retry loop and dead `raise RuntimeError("unreachable")` in favour of the shared `@opik_rest_retry` tenacity decorator, which also handles 5xx and network errors, honours Retry-After headers, and re-raises the original exception on exhaustion.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Clarify that a ValueError from OpikQueryLanguage causes the function to
return True (keep the trace) rather than False, explain why (avoid silent
data loss), and note that a warning is logged so callers can detect it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Deadlock: _export_all_experiments used a semaphore to cap concurrent
submissions, but released it in the as_completed loop (Loop 2) which
only runs after the submission loop (Loop 1) finishes.  With more than
max_workers*2 (20) experiments the main thread blocks on acquire() in
Loop 1 forever, since Loop 2 never starts.  Fixed by attaching a
done_callback to each future so the slot is released as soon as the
worker finishes, decoupling release from the collection loop.

Secondary fast-path: when an experiment JSON file already exists but
the manifest has no stored all_trace_ids (exported before the per-
experiment manifest feature, or an interrupted run), the export now
reads trace IDs directly from the JSON rather than calling
get_items() via the API.  On the next run all previously-seen
experiments will skip get_items() entirely.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
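The done-callback fix above boils down to a reusable bounded-submission pattern. A sketch under stated assumptions (generic task/worker names; the real code caps slots at max_workers*2 per experiment, as described):

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_bounded(tasks, worker, max_workers=10):
    """Submit tasks with backpressure: the semaphore slot is released in
    a done-callback as soon as each worker finishes, so the submission
    loop never deadlocks waiting on the collection loop."""
    slots = threading.Semaphore(max_workers * 2)
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = []
        for task in tasks:
            slots.acquire()  # blocks once 2*max_workers tasks are pending
            future = executor.submit(worker, task)
            future.add_done_callback(lambda _f: slots.release())
            futures.append(future)
        for future in as_completed(futures):
            results.append(future.result())
    return results
```

Releasing in `as_completed` instead of the callback reproduces the deadlock: with more than `max_workers * 2` tasks, `acquire()` in the submission loop blocks forever because the collection loop has not started yet.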
- Use module-style import for retry_decorator in project.py (style guide)
- Narrow bare `except Exception` to `(OSError, json.JSONDecodeError)` in
  the JSON fast-path fallback so KeyboardInterrupt/SystemExit propagate
- Add module-level `import json` and remove redundant inline import
- Add unit tests for the JSON-based fast path in export_experiment_by_id:
  verifies get_items() is skipped and trace IDs are read from disk, and
  that a corrupt JSON file falls through to the full API path
- Clarify the semaphore done-callback comment in all.py (remove "Loop 2"
  reference, name the as_completed loop explicitly)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… test

- experiment.py: upgrade JSON fast-path fallback from silent debug_print
  to always-visible console warning with stack trace in --debug mode
  (thread 2924741570)
- test_cli_changes.py: add TestExportAllExperimentsSemaphore unit test
  that submits N > max_workers*2 experiments to verify the done_callback
  releases the semaphore and prevents deadlock (thread 2924741577)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Extract _fetch_traces_page() with @opik_rest_retry so pagination
  retries on 429/5xx instead of swallowing errors and breaking the loop
- Extract _validate_filter_syntax() + _print_oql_examples() to replace
  duplicate ~15-line validation blocks in export_traces and
  export_project_by_name
- Fix missing had_errors = True in the span-fetch future exception
  handler, which previously let the manifest be marked completed even
  when some traces failed to download

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…lready-imported datasets

When running `opik import all`, Phase 1 imports datasets and the manifest
marks them completed. Phase 4 (experiments) calls the same importer and
the manifest skips them (datasets_skipped=1, datasets=0). The warning
"No datasets were imported" fired on datasets==0 without checking skipped,
producing a misleading message even though the dataset was already present.

Now the warning only fires when both imported==0 and skipped==0. The
status message also distinguishes between fresh imports and skipped ones.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Guard both _validate_filter_syntax calls with filter_string.strip()
so that a whitespace-only --filter argument is treated as a no-op
instead of crashing the OQL parser with an IndexError.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@dsblank dsblank force-pushed the feature/cli-import-export-improvements branch from 6756408 to 9716ab3 Compare March 24, 2026 12:02
Comment on lines +81 to +85
def _open(self) -> sqlite3.Connection:
    self.project_dir.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(
        str(self.manifest_file),
        isolation_level=None,

ExportManifest._open duplicates MigrationManifest._open's WAL/synchronous setup and schema creation, should we extract the shared manifest plumbing into a base class or helper?

Finding type: Code Dedup and Conventions | Severity: 🟢 Low



Comment on lines +222 to +233
# Fast-path: on the first page, compare the API's total trace count
# against how many we already have. If the API reports no more traces
# than we've already downloaded, every trace is present locally and we
# can skip scanning the remaining pages entirely.
if current_page == 1 and not force and already_downloaded:
    api_total = getattr(trace_page, "total", None)
    if api_total is not None and api_total <= len(already_downloaded):
        skipped_count = api_total
        if debug:
            debug_print(
                f"DEBUG: Fast-path exit — API total={api_total}, "
                f"already_downloaded={len(already_downloaded)}; "

Should we make the fast-path guard compare trace_page.total only against downloads matching the current filter or skip the fast-path on filter change instead of comparing to project-wide len(already_downloaded)?

Finding type: Logical Bugs | Severity: 🔴 High



Prompt for AI Agents:

Before applying, verify this suggestion against the current code. In
sdks/python/src/opik/cli/exports/project.py around lines 222 to 237, the fast-path in
export_traces incorrectly compares trace_page.total against a project-wide
already_downloaded set even when a new filter is applied. Change the guard so the
fast-path is only executed when the current parsed_filter is None OR when a manifest is
provided and the manifest explicitly records the same filter (or filter fingerprint) as
the current parsed_filter; otherwise skip the fast-path. Concretely, check manifest is
not None and manifest.get_filter() == parsed_filter before using api_total <=
len(already_downloaded); if that condition fails, do not break early and continue normal
pagination. This avoids skipping work when the filter changed.

Comment on lines +288 to +299
with ThreadPoolExecutor(max_workers=max_workers) as executor:
    future_to_project = {
        executor.submit(
            export_single_project,
            client,
            project,
            projects_dir,
            filter_string,
            max_results,
            force,
            debug,
            format,

ThreadPoolExecutor(max_workers=max_workers) is fed the entire all_projects list at once, should we limit pending submissions by wrapping submit in a bounded semaphore or submitting in small batches?

Finding type: Prefer non-blocking concurrency | Severity: 🔴 High



Prompt for AI Agents:

Before applying, verify this suggestion against the current code. In
sdks/python/src/opik/cli/exports/all.py around lines 288 to 305, the code in
_export_all_projects currently submits every project to ThreadPoolExecutor via a dict
comprehension which queues a future for every project at once (unbounded work queue).
Refactor by replacing the comprehension with an explicit submission loop that uses a
bounded threading.Semaphore (e.g. semaphore = threading.Semaphore(max_workers * 2)),
call semaphore.acquire() before each executor.submit(), add a done callback on the
future to semaphore.release(), and store future->project in the mapping as you go. This
will apply backpressure and prevent unbounded queued tasks; keep the existing
as_completed consumption logic unchanged.

Comment on lines +304 to +315
for future in as_completed(future_to_project):
    project = future_to_project[future]
    try:
        proj_count, t_exported, t_skipped = future.result()
        projects_exported += proj_count
        traces_exported += t_exported
        traces_skipped += t_skipped
    except Exception as e:
        console.print(
            f"[red]Error exporting project '{project.name}': {e}[/red]"
        )
    finally:

_export_all_projects and _export_all_experiments swallow exceptions from future.result() and only print, allowing opik export … all to exit 0 despite failures — should we re-raise or let errors bubble so the CLI fails fast?

Finding type: Centralize error handling | Severity: 🔴 High



Prompt for AI Agents:

Before applying, verify this suggestion against the current code. In
sdks/python/src/opik/cli/exports/all.py around lines 304 to 321, the
_export_all_projects function currently swallows exceptions from future.result() — it
only logs errors and continues, which allows the CLI to exit 0 even when exports failed.
Refactor by introducing a local had_errors boolean (or capture the first exception)
outside the as_completed loop, set it to True (or store the exception) inside the except
Exception block, and after the as_completed loop but before returning, raise a
RuntimeError (or re-raise the captured exception) if had_errors is True so the CLI fails
non‑zero on worker failures. Apply the same pattern to the _export_all_experiments
export loop (the analogous try/except around future.result()), ensuring both functions
propagate fatal worker errors instead of silently logging them.

@dsblank dsblank marked this pull request as draft March 24, 2026 12:13
…es_by_ids

Filtered traces (dropped by client-side matches_trace_filter) were counted
as fetch failures in batch_fetch_failures, inflating failed_count and
preventing manifest.complete() from being called even when there were no
actual errors.

Track batch_filter_skipped separately and exclude those from the failure
count so manifests are correctly completed after filtered exports.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment on lines +324 to +327
batch_filter_skipped = 0
for fetch_future in as_completed(fetch_futures):
    trace_id = fetch_futures[fetch_future]
    try:

batch_fetch_failures treats _fetch_trace_data None results as failures and blocks manifest.complete(), should we treat those None results as skips and exclude them from failed_count?

Finding type: Logical Bugs | Severity: 🔴 High



Prompt for AI Agents:

Before applying, verify this suggestion against the current code. In
sdks/python/src/opik/cli/exports/experiment.py around lines 324-351, inside
export_traces_by_ids, the code treats _fetch_trace_data returning None as a fetch
failure and later includes those in failed_count, preventing manifest.complete() from
running. Change the fetch-collection logic so that when a fetch future returns None it
increments a local 'batch_filter_skipped' (or similar) and is NOT added to
fetched_traces and NOT counted as a fetch failure; compute batch_fetch_failures as
(len(batch_trace_ids) - len(fetched_traces) - batch_filter_skipped) so only true fetch
exceptions are counted, and ensure failed_count is increased only by those real fetch
exceptions. This will allow manifest.complete() to run when only
project-metadata-missing traces were skipped.

Comment on lines +344 to +345
f"[red]Error fetching trace {trace_id}: {e}[/red]"
)

fetch_future.result() is caught and only printed under debug — should we log or rethrow so failures aren't silently swallowed?

Finding type: Centralize error handling | Severity: 🟠 Medium



Prompt for AI Agents:

Before applying, verify this suggestion against the current code. In
sdks/python/src/opik/cli/exports/experiment.py around lines 341 to 345 inside the
export_traces_by_ids function (the fetch-as_completed loop), the current except
Exception block only prints errors when debug is true, causing fetch failures to be
silently swallowed. Change this block to always log the failure at error (or warning)
level including the trace_id and exception details (use the existing console or a module
logger), and increment the outer failed_count to record a fetch failure for centralized
monitoring. Do not re-raise the exception so other fetches continue, but ensure the
message is emitted even when debug is False and includes enough context (trace_id,
exception type/message, and optionally a stacktrace) for diagnostics.

