feat: route prompt-template experiments through batched create [SC-65568] #622
quinn-galileo wants to merge 2 commits into
…568]

Prompt-template experiments now create and trigger in a single createExperiment(trigger=true, scorers, promptTemplateVersionId, promptSettings) call instead of the legacy createExperiment + run-scorer-settings + explicit createPromptRunJob (POST /jobs) flow.

The explicit job-submission route always used the single-job path, so JS-created experiments never entered the batched playground path even with the backend flag enabled; routing through trigger=true fixes that and matches the Python SDK.

createMetricConfigs gains a resolve-only mode (nullish runId) so scorers can be resolved without server-side registration and passed in the create body. createPromptRunJob is kept but deprecated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
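To make the single-call shape concrete, here is a minimal sketch of a request-body builder that adds the new fields only when they are provided. `buildCreateExperimentBody`, the `CreateExperimentBody` interface, and every field name below are hypothetical stand-ins chosen for illustration; the SDK's actual types and wire format may differ.

```typescript
// Hypothetical shapes for illustration only; not the SDK's real types.
interface ScorerConfig {
  name: string;
}

interface PromptSettings {
  temperature?: number;
}

interface CreateExperimentBody {
  name: string;
  datasetId?: string;
  trigger?: boolean;
  scorers?: ScorerConfig[];
  promptTemplateVersionId?: string;
  promptSettings?: PromptSettings;
}

// Positional-append signature: existing two-argument callers keep working,
// and each new field is written to the body only when the caller supplies it.
function buildCreateExperimentBody(
  name: string,
  datasetId?: string,
  trigger?: boolean,
  scorers?: ScorerConfig[],
  promptTemplateVersionId?: string,
  promptSettings?: PromptSettings,
): CreateExperimentBody {
  const body: CreateExperimentBody = { name };
  if (datasetId !== undefined) body.datasetId = datasetId;
  if (trigger !== undefined) body.trigger = trigger;
  if (scorers !== undefined) body.scorers = scorers;
  if (promptTemplateVersionId !== undefined) {
    body.promptTemplateVersionId = promptTemplateVersionId;
  }
  if (promptSettings !== undefined) body.promptSettings = promptSettings;
  return body;
}

// Function path: untriggered create, no new fields in the body.
const legacyBody = buildCreateExperimentBody("exp-a", "ds-1");

// Prompt-template path: create and trigger in one call.
const triggeredBody = buildCreateExperimentBody(
  "exp-b",
  "ds-1",
  true,
  [{ name: "correctness" }],
  "ptv-1",
  { temperature: 0 },
);
```

Under these assumptions, the legacy call produces a body with only `name` and `datasetId`, which is why the widening can stay backward-compatible.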
Codecov Report
❌ Patch coverage is

Additional details and impacted files

Coverage Diff

| | main | #622 | +/- |
|---|---|---|---|
| Coverage | 80.36% | 80.45% | +0.08% |
| Files | 85 | 85 | |
| Lines | 7502 | 7521 | +19 |
| Branches | 2250 | 2293 | +43 |
| Hits | 6029 | 6051 | +22 |
| Misses | 1462 | 1459 | -3 |
| Partials | 11 | 11 | |
/astra review
⚠️ This review was generated by an AI agent (Astra) and may contain mistakes. Please verify all suggestions independently.
Verdict: approve — Backward-compatible, type-safe refactor with sound logic and good test coverage of the behavior changes; only minor testing/maintainability nits remain.
Follow-ups
Suggested follow-up work that could be tracked as Shortcut stories:
`src/api-client/services/experiment-service.ts:69-76`: `createExperiment` now takes 6 positional parameters, including a positional boolean `trigger` (a classic boolean trap; call sites read `..., true, scorerConfigs, ...`). Consider migrating the API-client/service `createExperiment` to an options object (matching the entity-level `Experiments.createExperiment({...})` shape) for readability and to avoid future positional-append churn. Non-blocking and out of scope for this PR.
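The boolean-trap concern can be illustrated with a small sketch that compares the two call styles. Everything here is an assumption made for illustration: `CreateExperimentOptions`, `sendWithOptions`, `sendPositional`, and the field names are hypothetical, and both functions return the serialized body instead of issuing a real request.

```typescript
// Hypothetical request shape; field names are illustrative, not the SDK's.
interface CreateExperimentOptions {
  name: string;
  datasetId?: string;
  trigger?: boolean;
  scorers?: string[];
  promptTemplateVersionId?: string;
}

// Options-object form: every argument is named at the call site, and new
// fields can be added later without touching existing callers.
function sendWithOptions(options: CreateExperimentOptions): string {
  // JSON.stringify omits properties whose value is undefined,
  // so fields the caller left out never appear in the body.
  return JSON.stringify(options);
}

// Positional form: a bare `true` at the call site gives no hint of its meaning.
function sendPositional(
  name: string,
  datasetId?: string,
  trigger?: boolean,
  scorers?: string[],
  promptTemplateVersionId?: string,
): string {
  return sendWithOptions({
    name,
    datasetId,
    trigger,
    scorers,
    promptTemplateVersionId,
  });
}

// Same request, two call styles.
const positionalRequest = sendPositional(
  "exp-b",
  "ds-1",
  true,
  ["correctness"],
  "ptv-1",
);

const optionsRequest = sendWithOptions({
  name: "exp-b",
  datasetId: "ds-1",
  trigger: true,
  scorers: ["correctness"],
  promptTemplateVersionId: "ptv-1",
});

// A caller that needs only the name does not have to pad with undefined.
const minimalRequest = sendWithOptions({ name: "exp-c" });
```

Both styles serialize to the same body in this sketch; the difference is only in how readable the call site is and how easily the signature can grow.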
…ion path [SC-65568]

Adds the two regression tests Astra flagged as uncovered:
- prompt template + LocalMetricConfig rejects with the function-only guard message
- prompt-template run accepts a PromptTemplateVersion passed directly

Also adds .prettierignore for the semantic-release-generated CHANGELOG.md so `pre-commit run --all-files` stops failing on a file that is never hand-formatted.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Thanks for the review. Addressed the testing nits in 514c24f (added the local-metric-guard and … On the … Also added …
Summary
Brings the JS SDK to parity with the Python SDK for the playground-batching work: prompt-template experiments now create and trigger in a single `createExperiment(trigger=true, …)` call, so they enter the batched playground path when the backend `playground_batching` flag is enabled.

Previously, `runExperiment`'s prompt-template path used a 3-call flow ending in an explicit `createPromptRunJob` (`POST /jobs`). That job-submission route is untouched by the batching backend, so a JS-created experiment never entered the batched path even with the flag on. This was not a regression, since the experiment fell to the legacy single-job path, but it gained no batching benefit either.

Changes
- `experiment-service.ts` / `galileo-client.ts`: `createExperiment` widened by positional append `(name, dataset, trigger?, scorers?, promptTemplateVersionId?, promptSettings?)`. Backward-compatible for the published API; the new fields are sent only when provided. `createPromptRunJob` kept but marked `@deprecated`.
- `utils/metrics.ts`: `createMetricConfigs` accepts a nullish `runId` → resolve-only mode (resolves scorer configs without registering them server-side), mirroring Python's `create_metric_configs(project_id, None, …)`.
- `entities/experiments.ts`: `runExperiment` branches experiment creation by mode. Function path unchanged (untriggered create + scorer registration + local run). Prompt-template path resolves scorers (resolve-only) + dataset + prompt-version, then creates with `trigger=true` in one call, applies tags after, builds the link/message client-side, and no longer calls `createPromptRunJob`.
- … `AGENTS.md`, `experiments-reference.md`).

Behavior note
Return shape `{experiment, link, message}` is unchanged. The prompt-path `message` text now comes from a client-built string (was the server job message). This is non-breaking, but flagged for anyone string-matching it.

Testing
Full Jest suite green (1750 tests); tsc + eslint clean. Updated the 8 prompt-path assertions and added resolve-only + create-body-shape tests, plus a local-metric-guard test and a `PromptTemplateVersion`-path test.

Local e2e: validated against a local stack in both flag states (the change is unconditional `trigger=true`, so it must be correct regardless of the flag):

| `playground_batching` | Result |
|---|---|
| on | `playground_run` job, `trace_id_count=5`, `batch_id` set, `completed` |
| off | `playground_run` job, `trace_id_count=0`, `batch_id=None`, `completed` |

With the flag off, the API routes the `trigger=true` create to the legacy single-job path and the experiment completes normally, so there is no regression for tenants without the flag. With it on, the run enters the batched path. Matches the Python SDK's behavior.

🤖 Generated with Claude Code
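The resolve-only mode of `createMetricConfigs` described under Changes can be sketched as follows. This is a simplified, synchronous stand-in and not the SDK's implementation: `createMetricConfigsSketch`, the in-memory `knownScorers` map, and the `registrations` array are all hypothetical, and the real function presumably resolves and registers scorers through API calls.

```typescript
// Hypothetical stand-ins for illustration only.
interface ScorerConfig {
  id: string;
  name: string;
}

// In-memory substitute for the server-side scorer catalogue.
const knownScorers = new Map<string, ScorerConfig>([
  ["correctness", { id: "scorer-1", name: "correctness" }],
  ["toxicity", { id: "scorer-2", name: "toxicity" }],
]);

// Records what would have been registered server-side for a run.
const registrations: Array<{ runId: string; scorers: ScorerConfig[] }> = [];

function createMetricConfigsSketch(
  runId: string | null | undefined,
  metricNames: string[],
): ScorerConfig[] {
  // Resolution always happens, in both modes.
  const resolved = metricNames.map((name) => {
    const scorer = knownScorers.get(name);
    if (scorer === undefined) {
      throw new Error(`Unknown metric: ${name}`);
    }
    return scorer;
  });

  // `!= null` is false for both null and undefined, so a nullish runId
  // selects resolve-only mode and skips registration.
  if (runId != null) {
    registrations.push({ runId, scorers: resolved });
  }
  return resolved;
}

// Prompt-template path: resolve only, then pass the scorers in the create body.
const resolveOnly = createMetricConfigsSketch(null, ["correctness", "toxicity"]);
const registrationsAfterResolveOnly = registrations.length;

// Function path: resolve and register against an existing run.
const withRun = createMetricConfigsSketch("run-123", ["correctness"]);
```

In this sketch the resolved configs are identical in both modes; only the registration side effect depends on whether `runId` is present.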