Phase 26: Twelve Labs Async Embedding by tazarov · Pull Request #509 · amikos-tech/chroma-go

tazarov · 2026-04-14T12:47:48Z

Summary

Phase 26: Twelve Labs Async Embedding
Goal: Twelve Labs provider handles async task responses for long-running audio and video embeddings
Status: Verified on 2026-04-14 (26-VERIFICATION.md status: passed)

This phase adds an explicit async path for Twelve Labs long-running media embeddings. Audio and video callers can now opt in with WithAsyncPolling(maxWait) to create a task, poll until it reaches a terminal state, and return the final embedding or a sanitized failure. Text/image callers and all non-opt-in traffic remain on the existing synchronous path.

Changes

Plan 26-01: Async HTTP foundation

Added dedicated async request/response types for the tasks API, decoded the _id alias correctly, preserved raw task failure bodies for later sanitization, and introduced internal polling configuration fields/defaults plus task POST/GET helpers.

Key files:

pkg/embeddings/twelvelabs/twelvelabs.go

Plan 26-02: Polling loop and modality routing

Implemented the async polling loop with capped backoff, distinct cancellation/deadline/maxWait behavior, failure-body sanitization, fused-audio rejection on the async path, and audio/video routing into the tasks flow while leaving text/image unchanged.

Key files:

pkg/embeddings/twelvelabs/twelvelabs_async.go
pkg/embeddings/twelvelabs/content.go

Plan 26-03: Public option and config round-trip

Added the public WithAsyncPolling(maxWait) option and made async config persist/rebuild through async_polling and async_max_wait_ms without changing default behavior for callers who do not opt in.

Key files:

pkg/embeddings/twelvelabs/option.go
pkg/embeddings/twelvelabs/twelvelabs.go

Plan 26-04: Async test coverage

Added 12 async-focused tests covering task creation, ready/failed/unexpected task states, context cancellation, SDK maxWait behavior, sync-path preservation for text/image, config round-trip, raw failure sanitization, blocked HTTP maxWait enforcement, and fused-mode rejection.

Key files:

pkg/embeddings/twelvelabs/twelvelabs_test.go

Requirements Addressed

TLA-01: Twelve Labs provider detects async task responses and enters a polling loop
TLA-02: Async polling respects caller context for cancellation and timeout
TLA-03: Async polling handles terminal states with appropriate result delivery or error messages
TLA-04: Tests cover async task creation, polling, completion, failure, and context cancellation

Verification

Automated verification passed (26-VERIFICATION.md, 4/4 must-haves verified on 2026-04-14)
go test -tags=ef -count=1 ./pkg/embeddings/twelvelabs/...
make lint
Human verification not required for this library change

Key Decisions

Async stays opt-in through WithAsyncPolling(maxWait); polling internals remain hidden.
Audio/video route through the tasks API; text/image always stay on sync /embed-v2.
Async audio request payloads use embedding_option as []string, and task responses decode Twelve Labs’ _id field.
Failed-task errors sanitize the authentic raw response body instead of a re-marshaled subset.
fused audio is rejected on the async path instead of being silently remapped.

Research F-01 collapsed the task API to two endpoints (POST /tasks and GET /tasks/{id}); the in-scope bullet still listed a third /status sub-path that does not exist. Aligns CONTEXT with RESEARCH and with the plans to prevent executor drift.

…ayload Adds a `FailureDetail json.RawMessage` field to TaskResponse (json:"-") and has doTaskGet populate it with the raw response bytes on each GET /tasks/{id}. This lets Plan 02's failed-status branch sanitize the authentic server-provided reason instead of re-marshaling the parsed struct subset, which was dropping undeclared reason fields (codex HIGH, D-17 compliance).

…used rejection - Per-HTTP-call deadline (codex HIGH): every doTaskGet is now wrapped in context.WithDeadline(ctx, min(parentDL, sdkMaxWaitDeadline)) so a blocked/hung HTTP request cannot outlive maxWait (D-09 hard bound). Translates deadline-source back into distinct errors — SDK maxWait vs parent ctx.DeadlineExceeded (D-20). - Fused-on-async rejection (codex/gemini HIGH): contentToAsyncRequest now rejects audioOpt == "fused" with a deterministic validation error; async endpoint accepts only "audio" and "transcription" (RESEARCH F-02 / A5). - Failed-reason authenticity (codex HIGH): failed-status branch uses resp.FailureDetail (raw server body from Plan 01) rather than re-marshaling the parsed TaskResponse (D-17). - Distinct ctx-cancel vs deadline wording (codex MEDIUM): context.Canceled wraps as "async polling canceled"; context.DeadlineExceeded wraps as "async polling deadline exceeded".

… malformed - Use the existing embeddings.ConfigInt helper (pkg/embeddings/embedding.go:674) instead of reimplementing int/int64/float64 switching (codex LOW). - Distinguish missing key from malformed value: async_polling=true AND malformed async_max_wait_ms no longer silently falls back to WithAsyncPolling(0) (which would enable async with the 30-minute default on broken config). The flag stays off rather than enabling on a broken round-trip (codex MEDIUM).

…tests - Fix compile blocker: replace all 5 embeddings.ContentPart references with embeddings.Part (codex HIGH/compile-blocker — multimodal.go:61 defines Part, there is no ContentPart in the embeddings package). - Add TestTwelveLabsAsyncFailedReasonSanitized: long-reason failed-task fixture asserts the authentic server reason reaches the error and is sanitized (ratchets D-17 via a runtime test). - Add TestTwelveLabsAsyncBlockedHTTPMaxWait: blocking httptest handler proves the per-HTTP-call deadline added in Plan 02 actually unblocks an in-flight GET when maxWait fires. - Add TestTwelveLabsAsyncFusedRejected: WithAudioEmbeddingOption("fused") + WithAsyncPolling + audio content must return a deterministic validation error before any HTTP call. - Assert APIKeyEnvVar survives the config round-trip (gemini review).

- Add AsyncEmbedV2Request with AsyncAudioInput (embedding_option as []string) and AsyncVideoInput - Add TaskCreateResponse and TaskResponse with Mongo-style `_id` JSON alias - TaskResponse.FailureDetail (json.RawMessage, excluded from JSON) preserves raw body for D-17 sanitization - Extend TwelveLabsClient with unexported async polling fields - applyDefaults sets initial=2s, multiplier=1.5, cap=60s per D-11

- doTaskPost: POST {BaseAPI}/tasks with JSON body; mirrors doPost headers and error sanitization - doTaskGet: GET {BaseAPI}/tasks/{id} with URL-escaped task ID; empty-ID guard rejects the Mongo-alias footgun - Both use chttp.ReadLimitedBody + chttp.SanitizeErrorBody (Phase 25 convention) - doTaskGet preserves raw response bytes in TaskResponse.FailureDetail (D-17) so Plan 02 can sanitize the authentic server failure reason - Lint-silenced with //nolint:unused until Plans 26-02/03 wire the polling loop and option

- expose single public async trigger (D-03, D-04) - maxWait=0 selects 30-minute default - reject negative maxWait with explicit error - sets asyncPollingEnabled=true and asyncMaxWait on client

- GetConfig emits async_polling and async_max_wait_ms only when enabled (D-21, D-22) - FromConfig reads async_polling (bool) and async_max_wait_ms via embeddings.ConfigInt - malformed async_max_wait_ms leaves async disabled (no silent 30-min fallback) - missing keys = async off (opt-in only via WithAsyncPolling, D-23)

- pollTask: capped exponential backoff polling loop using time.NewTimer (no time.After leak per Pitfall 2) with per-HTTP-call deadline bounded by min(parent ctx deadline, SDK maxWait deadline) so a blocked doTaskGet cannot outlive maxWait (D-09 hard bound). - Distinct error wording for ctx.Canceled, ctx.DeadlineExceeded, and SDK-side maxWait expiry (D-20). errors.Is still unwraps to stdlib sentinels via pkg/errors. - Terminal status=failed uses raw server body from TaskResponse.FailureDetail through chttp.SanitizeErrorBody — no re-marshaled struct subset (D-17). - D-16 default branch rejects unknown status values. - contentToAsyncRequest rejects audio embedding option 'fused' on the async path per RESEARCH F-02 / Assumption A5 (async accepts only 'audio' and 'transcription'). - createTaskAndPoll orchestrates build -> POST -> optional early-ready short-circuit -> poll -> extract embedding. - //nolint:unused annotations on the symbols until Task 2 wires routing in content.go (pattern from Plan 26-01).

- document WithAsyncPolling public surface and round-trip keys - note malformed-value → async-off decision (no silent 30-min fallback) - record reuse of embeddings.ConfigInt helper across numeric JSON types

…abled - embedContent now dispatches to createTaskAndPoll when asyncPollingEnabled is true AND the single part is audio or video (CONTEXT.md D-07). Text/image always take the sync path (D-07). - When asyncPollingEnabled is false, all four modalities continue on the sync doPost path — zero behavior change for non-opt-in callers (D-08). - Removed //nolint:unused annotations from twelvelabs_async.go — all symbols are now reachable from embedContent via the routing switch.

…ed-status tests - Add newTestAsyncEF helper with ms-scale polling intervals - Add audioContent/videoContent + taskCreateJSON/taskGetJSON fixtures - Fixtures emit _id (not id) to guard Pitfall 1 alias gotcha - Cover 4 of 6 D-26 flows: create, poll-to-ready, poll-to-failed, unexpected-status

…rip, and review-fix tests - TestTwelveLabsAsyncCtxCancel: proves errors.Is(err, context.Canceled) (D-10/D-19) - TestTwelveLabsAsyncMaxWait: proves 'async polling maxWait' distinct from DeadlineExceeded (D-20) - TestTwelveLabsAsyncSkipsTextImage: text/image never hit /tasks even when async is on (D-07) - TestTwelveLabsAsyncConfigRoundTrip: async_polling/async_max_wait_ms + APIKeyEnvVar survive round-trip (D-23) - TestTwelveLabsAsyncConfigOmitWhenDisabled: async keys omitted when disabled (D-22) - TestTwelveLabsAsyncFailedReasonSanitized: raw server reason survives sanitization (D-17) - TestTwelveLabsAsyncBlockedHTTPMaxWait: per-call deadline unblocks hanging GET within maxWait - TestTwelveLabsAsyncFusedRejected: fused+async rejected before POST /tasks (F-02)

claude · 2026-04-14T13:45:08Z

Code Review: Phase 26 — Twelve Labs Async Embedding

Bug: `context.Canceled` handling discards original error and risks nil return

In both pollTask (twelvelabs_async.go:88) and createTaskAndPoll (twelvelabs_async.go:173), the context.Canceled branch wraps ctx.Err() instead of the original transport error err:

if errors.Is(err, context.Canceled) {
    return nil, errors.Wrap(ctx.Err(), "Twelve Labs async polling canceled")
}

This is inconsistent with the context.DeadlineExceeded branch directly above, which correctly preserves the original error:

return nil, errors.Wrapf(err, "Twelve Labs async polling deadline exceeded")

Two problems:

Lost debug context: The original err contains the full transport error chain (e.g., "failed to send task retrieve request to .../tasks/abc: context canceled"). Wrapping only ctx.Err() discards this, making cancel-related failures harder to diagnose.
Potential nil error / nil-pointer panic: pkg/errors.Wrap(nil, msg) returns nil. If errors.Is(err, context.Canceled) is true but ctx.Err() is nil (possible with custom transports or derived sub-contexts), this returns (nil, nil). The caller then dereferences the nil *TaskResponse at resp.Status (pollTask) or nil *TaskCreateResponse (createTaskAndPoll), causing a panic — violating the library's no-panic contract from CLAUDE.md.

Fix: Use errors.Wrap(err, ...) instead of errors.Wrap(ctx.Err(), ...) in both locations, matching the DeadlineExceeded branch pattern.

Minor: Config round-trip truncation on 32-bit platforms

GetConfig() stores asyncMaxWait.Milliseconds() as int64, but NewTwelveLabsEmbeddingFunctionFromConfig reads it via embeddings.ConfigInt() which casts int64 → int. On 32-bit systems, maxWait values exceeding ~24.8 days would silently truncate. Not a practical concern for the 30-minute default, but worth noting if the API evolves.

Note: PR description outdated

The PR description lists a known issue: "Initial doTaskPost is not yet bounded by asyncMaxWait." However, the implementation in createTaskAndPoll (lines 149–156) already bounds it with a derived deadline context, and TestTwelveLabsAsyncMaxWaitHardBound verifies this. The description should be updated to reflect that this was resolved.

Positive observations

Consistent use of url.PathEscape(taskID) prevents path injection from server-controlled task IDs.
All response bodies read through chttp.ReadLimitedBody — no unbounded reads.
Error bodies consistently sanitized via chttp.SanitizeErrorBody.
Backoff timer correctly stopped on context cancellation (no timer leak).
FailureDetail correctly copies response bytes to avoid buffer reuse issues.
Good test coverage including edge cases (blocked HTTP, fused rejection, empty ID, early-ready).

… transport errors - .github/workflows/go.yml: pass GITHUB_TOKEN to both crosslang test steps so pure-onnx's bootstrap can use authenticated GitHub API requests. The cross-language tests auto-wire the default EF from server collection config, which triggers ONNX runtime bootstrap; anonymous GitHub API quota (60 req/hr per runner IP) was causing intermittent failures surfacing as "embedding function is required" when bootstrap failed silently during auto-wire. - twelvelabs_async.go: mirror the context.DeadlineExceeded branch and wrap the original transport error (err) in the context.Canceled branches of pollTask and createTaskAndPoll. Previous code wrapped ctx.Err() which both discarded the transport chain and risked (nil, nil) returns from pkg/errors.Wrap(nil, ...), violating the library's no-panic contract.

claude · 2026-04-14T17:53:08Z

PR Review: Phase 26 — Twelve Labs Async Embedding

Overall this is well-structured with thorough test coverage and careful deadline/cancellation handling. A few issues found:

Bug: `createTaskAndPoll` does not handle `status=failed` from task creation

In twelvelabs_async.go:181, only taskStatusReady is handled as an early return from the create response:

if created.Status == taskStatusReady && len(created.Data) > 0 {
    return buildEmbeddingFromData(created.Data)
}

If the server returns a task with status: "failed" and a non-empty _id in the create response, the code falls through to pollTask, which fires an unnecessary GET request. Worse, the failure details from the create response body are lost — doTaskPost does not capture FailureDetail like doTaskGet does, so the error message from the poll GET may differ from the original failure reason.

Suggested fix: Add a taskStatusFailed check after the ready check:

if created.Status == taskStatusFailed {
    return nil, errors.Errorf("Twelve Labs task [%s] failed at creation", created.ID)
}

Security (defense-in-depth): No URL scheme allowlist in `buildMediaSource`

content.go:128 validates that the URL has a scheme and host, which correctly rejects file://, data:, and javascript: URLs (empty host). However, schemes like ftp://, gopher://, or dict:// with a valid host would pass validation and be sent to the Twelve Labs API. If the API fetches these URLs server-side, this could enable SSRF via the client library.

Suggested fix: Add a scheme allowlist:

if !strings.EqualFold(parsed.Scheme, "http") && !strings.EqualFold(parsed.Scheme, "https") {
    return MediaSource{}, errors.Errorf("URL source scheme %q not supported; use http or https", parsed.Scheme)
}

Minor: Early-ready path silently falls through to polling when data is missing

twelvelabs_async.go:181 requires len(created.Data) > 0 alongside taskStatusReady. If the server returns status: "ready" with an empty data array, the code falls through to polling instead of returning an immediate error. This masks a malformed server response behind an extra network request that will likely produce the same result.

Test coverage is solid — 25+ async tests with good edge case coverage (blocked HTTP, parent deadline, fused rejection, config round-trip). The per-HTTP-call deadline wrapping in pollTask is particularly well done, with correct discrimination between SDK maxWait, parent deadline, and client timeout errors.

🤖 Generated with Claude Code

- createTaskAndPoll: treat status=failed on the POST /tasks response as a terminal error with a distinct message, and treat status=ready as terminal regardless of data shape (let buildEmbeddingFromData surface malformed responses). Avoids a wasteful GET round-trip in both cases. - buildMediaSource: reject URL sources whose scheme is not http or https. Defense-in-depth against sending non-web schemes to the Twelve Labs API via the client. - Add tests: - TestTwelveLabsAsyncCreateReturnsFailed pins the no-GET path for failed-on-create. - TestTwelveLabsAsyncCreateReturnsReadyEmptyData pins the no-GET path for malformed ready-with-no-data. - TestBuildMediaSourceURLValidation extends with ftp/gopher/file scheme rejection and plain http acceptance.

claude · 2026-04-14T18:03:02Z

Code Review: Phase 26 — Twelve Labs Async Embedding

Overall this is a well-structured implementation. The polling loop, deadline propagation, and error classification are thorough. A few findings below.

Bugs / Issues

1. No minimum validation on asyncMaxWait (option.go:112)

WithAsyncPolling(1 * time.Nanosecond) is accepted but will immediately time out before any HTTP request completes. Consider enforcing a minimum (e.g., 1 second) to prevent confusing instant-timeout behavior.

2. SDK maxWait timeout errors are not programmatically distinguishable (twelvelabs_async.go:79)

errors.Errorf with %v intentionally breaks the error chain so errors.Is(err, context.DeadlineExceeded) returns false (D-20). But this means callers have no way to programmatically detect an SDK maxWait timeout other than string matching. Consider defining a sentinel error (e.g., var ErrAsyncMaxWaitExceeded) that callers can check with errors.Is, while still keeping it distinct from context.DeadlineExceeded.

3. Stale PR review note

The PR body states: "Initial doTaskPost is not yet bounded by asyncMaxWait". But the code does bound it -- createTaskAndPoll wraps the create call with context.WithDeadline derived from sdkMaxWaitDeadline (lines 149-158 of twelvelabs_async.go). This note should be removed to avoid confusing future reviewers.

Minor / Informational

4. Empty failure detail produces slightly odd error message

If a server returns {"status":"failed"} with no error/message/reason fields, sanitizeTaskFailureDetail falls through to dumping the raw JSON body. The error becomes Twelve Labs task [abc] terminal status=failed: {"_id":"abc","status":"failed"}. Consider returning a generic fallback like "(no failure detail provided)" when extractTaskFailureDetail finds no structured reason and body is non-nil.

5. Config round-trip test does not cover JSON serialization

TestTwelveLabsAsyncConfigRoundTrip passes the config map directly in-memory. If the config is persisted via JSON (e.g., a registry), int64 values decode as float64. ConfigInt handles this, but the test does not exercise that path. Adding a JSON marshal/unmarshal step to the round-trip test would lock in this guarantee.

What looks good

Deadline propagation through createTaskAndPoll -> pollTask correctly shares a single maxWait budget (D-09 hard bound)
Per-HTTP-call deadline (callCtx) prevents a hung GET from outliving maxWait
Error classification (D-20) correctly distinguishes parent context deadline, SDK maxWait, HTTP client timeout, and cancellation -- with good test coverage
url.PathEscape on task IDs prevents path traversal
FailureDetail uses json:"-" with manual population to preserve the authentic server response
SanitizeErrorBody on all user-visible error paths prevents log flooding
Fused-audio rejection on the async path is deterministic and tested

No critical bugs or security issues found.

- Reject sub-floor asyncMaxWait at option-application time (< 2s can't complete one poll cycle); extract defaultAsyncPollInitial constant so applyDefaults and WithAsyncPolling share a single floor source of truth. - Introduce ErrAsyncMaxWaitExceeded sentinel wrapped via fmt.Errorf %w at all 4 maxWait-exceeded sites so callers can detect SDK-enforced timeouts via errors.Is without string matching. Chain-break from context.DeadlineExceeded preserved (D-20). - sanitizeTaskFailureDetail returns "(no failure detail provided)" when extractTaskFailureDetail yields nothing and the body is a JSON object with no diagnostic fields, avoiding leakage of housekeeping fields (_id, status) into user-facing errors. - Extend TestTwelveLabsAsyncConfigRoundTrip with json.Marshal/Unmarshal so the float64-coercion path used by registry-persisted configs is locked in.

claude · 2026-04-14T19:18:36Z

PR Review — Phase 26: Twelve Labs Async Embedding

Overall this is a well-structured PR with solid timeout/cancellation handling and good test coverage. Two issues found:

Bug: Task-creation failure discards server diagnostic detail

twelvelabs_async.go:196 — When doTaskPost returns a response with status=failed, the error message includes only the task ID:

case taskStatusFailed:
    return nil, errors.Errorf("Twelve Labs async task [%s] terminal status=failed at creation", created.ID)

Compare with the polling path (twelvelabs_async.go:107) which preserves and sanitizes the server's failure reason:

return nil, errors.Errorf("Twelve Labs task [%s] terminal status=failed: %s", taskID, sanitizeTaskFailureDetail(resp.FailureDetail))

The TaskCreateResponse struct has no FailureDetail field and doTaskPost does not capture the raw response body for failed-status responses (unlike doTaskGet at twelvelabs.go:354-356). If a task creation ever returns status=failed (acknowledged as rare in the comment at line 188-193), the caller gets no server-side context about why it failed, making debugging harder.

Suggested fix: Either add raw-body capture to doTaskPost (mirroring doTaskGet), or pass the structured TaskCreateResponse fields through the error so the caller gets whatever the server returned.

Latent bug: `WithAudioEmbeddingOption("fused")` + `WithAsyncPolling(...)` accepted at construction, fails at embed time

option.go:89-98 accepts "fused" as a valid audio option. twelvelabs_async.go:31-34 rejects it on the async path. This means a client constructed with both options will successfully create without error, then fail at the first async audio embed call:

// Succeeds at construction
ef, _ := NewTwelveLabsEmbeddingFunction(
    WithAPIKey("..."),
    WithAudioEmbeddingOption("fused"),
    WithAsyncPolling(10 * time.Minute),
)

// Fails here at runtime, potentially minutes into a pipeline
emb, err := ef.EmbedContent(ctx, audioContent)

"fused" is legitimately valid on the sync path, so the option itself isn't wrong — but the combination with async is a misconfiguration that could be caught at NewTwelveLabsClient validation time (in validate()), surfacing it before any work is done.

Everything else looks solid — the per-call deadline derivation with distinct error classification (SDK maxWait vs. parent context), url.PathEscape on the task ID, bounded response reads via ReadLimitedBody, and the SanitizeErrorBody usage on all error paths are all good patterns.

…ime failure reason - Add FailureDetail field to TaskCreateResponse and populate it in doTaskPost when a 2xx response carries status=failed. The async.go error path now surfaces the sanitized server reason via sanitizeTaskFailureDetail, matching the GET /tasks symmetry (D-17) so rare create-time terminal failures are debuggable instead of returning just the task ID. - Reject WithAudioEmbeddingOption("fused") + WithAsyncPolling(...) in validate() at NewTwelveLabsEmbeddingFunction time. The async /tasks endpoint only accepts "audio" and "transcription" (RESEARCH F-02); previously this combination constructed successfully and failed at first EmbedContent — now it fails loudly at construction. Runtime defense-in-depth check in contentToAsyncRequest is retained.

claude · 2026-04-14T19:30:12Z

Code Review: Phase 26 — Twelve Labs Async Embedding

Overall this is a well-structured addition with good test coverage (24+ async tests), proper context propagation, and careful error differentiation. A few issues worth addressing:

Bug: Inconsistent HTTP status code acceptance between sync and async paths

doPost (twelvelabs.go:264) only accepts 200 OK:

if resp.StatusCode != http.StatusOK {

While doTaskPost (line 307) and doTaskGet (line 353) accept any 2xx:

if resp.StatusCode < 200 || resp.StatusCode >= 300 {

If the Twelve Labs API ever returns 201 Created for task creation (common REST convention), doPost would reject it but the async endpoints would accept it. This inconsistency should be unified — either all should accept 2xx or all should accept only specific status codes.

Bug: Default HTTP client has no timeout

applyDefaults (twelvelabs.go:60) creates a bare &http.Client{} with no timeout. While the async path mitigates this with per-call context deadlines derived from maxWait, the sync doPost path has no such protection — a hung Twelve Labs server would block the caller indefinitely. Consider adding a default timeout (e.g., 30s or 60s) to the fallback client:

c.Client = &http.Client{Timeout: 30 * time.Second}

Potential issue: `nextBackoff` overflow for extreme multiplier values

nextBackoff (twelvelabs_async.go:140) converts time.Duration to float64 and back:

next := time.Duration(float64(cur) * mul)

While the multiplier is validated to be >= 1.0, there's no upper bound validation. An extremely large multiplier (e.g., math.MaxFloat64) could overflow time.Duration (int64) producing a negative value, which would bypass the cap check and create a negative sleep. The cap comparison saves this in practice since the cap is validated, but adding if next <= 0 { next = backoffCap } would be more defensive.

Minor: `sanitizeTaskFailureDetail` — JSON-object guard logic

In twelvelabs.go:578:

var probe map[string]any
if len(body) > 0 && json.Unmarshal(body, &probe) != nil {
    // non-JSON bodies — surface them
}

This only surfaces non-JSON bodies. If the body is valid JSON but has no known diagnostic fields (e.g., {"_id":"x","status":"failed","some_unknown_field":"useful_info"}), the raw body is discarded and the generic "(no failure detail provided)" is returned. This is documented as intentional (avoiding leaking housekeeping fields), but genuinely useful server-side fields not in the known-key list (message, failure_reason, reason, detail, error) will be silently lost.

Positive observations

URL validation in buildMediaSource correctly blocks file://, javascript:, and other non-HTTP schemes — good SSRF defense-in-depth
url.PathEscape(taskID) in doTaskGet prevents path traversal
chttp.SanitizeErrorBody is used consistently to prevent credential/token leakage in error messages
FailureDetail using json:"-" + manual byte copy avoids both re-marshaling data loss and buffer aliasing bugs
The dual-deadline design (parent ctx deadline vs SDK maxWait) with distinct error messages is well-implemented
ErrAsyncMaxWaitExceeded is correctly kept separate from context.DeadlineExceeded for caller introspection
Test coverage is thorough — including edge cases like empty task IDs, ready-on-create, blocked HTTP calls, and config round-trips through JSON

tazarov added 30 commits April 14, 2026 09:01

docs(26): capture phase context

50b5ef3

docs(state): record phase 26 context session

66bee2b

docs(26): research phase twelvelabs async embedding

38b4a76

docs(26): add validation strategy

860627e

docs(26): create phase plan

3cefba9

docs(26): cross-AI review

03f9bd4

docs(26-01): complete async foundation plan

9da25cb

feat(26-03): add WithAsyncPolling option

4f24085

- expose single public async trigger (D-03, D-04) - maxWait=0 selects 30-minute default - reject negative maxWait with explicit error - sets asyncPollingEnabled=true and asyncMaxWait on client

docs(26-03): complete async option + config round-trip plan

f94359f

- document WithAsyncPolling public surface and round-trip keys - note malformed-value → async-off decision (no silent 30-min fallback) - record reuse of embeddings.ConfigInt helper across numeric JSON types

docs(26-02): complete async polling loop and modality routing plan

608d007

chore: merge executor worktree (26-02)

cd59768

docs(26-04): complete async test coverage plan

454ad3d

docs(26): add code review report

ee78e21

docs(phase-26): complete phase execution

ef336c2

fix(26): WR-01 bound async task create call by maxWait

598f823

test(26): WR-02 add doTaskPost non-2xx error regression tests

7322514

docs(26): add code review fix report

e642f51

test(26): complete UAT - 8 passed, 0 issues

5461aa8

tazarov temporarily deployed to Test April 14, 2026 13:42 — with GitHub Actions Inactive

tazarov temporarily deployed to Test April 14, 2026 17:48 — with GitHub Actions Inactive

tazarov temporarily deployed to Test April 14, 2026 17:49 — with GitHub Actions Inactive

tazarov had a problem deploying to Test April 14, 2026 17:49 — with GitHub Actions Failure

tazarov temporarily deployed to Test April 14, 2026 17:49 — with GitHub Actions Inactive

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Phase 26: Twelve Labs Async Embedding#509

Phase 26: Twelve Labs Async Embedding#509
tazarov wants to merge 42 commits into
mainfrom
feat/phase-26-twelve-labs-async-embedding

tazarov commented Apr 14, 2026 •

edited

Loading

Uh oh!

claude Bot commented Apr 14, 2026

Uh oh!

claude Bot commented Apr 14, 2026

Uh oh!

claude Bot commented Apr 14, 2026

Uh oh!

claude Bot commented Apr 14, 2026

Uh oh!

claude Bot commented Apr 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

tazarov commented Apr 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Plan 26-01: Async HTTP foundation

Plan 26-02: Polling loop and modality routing

Plan 26-03: Public option and config round-trip

Plan 26-04: Async test coverage

Requirements Addressed

Verification

Key Decisions

Uh oh!

claude Bot commented Apr 14, 2026

Code Review: Phase 26 — Twelve Labs Async Embedding

Bug: context.Canceled handling discards original error and risks nil return

Minor: Config round-trip truncation on 32-bit platforms

Note: PR description outdated

Positive observations

Uh oh!

claude Bot commented Apr 14, 2026

PR Review: Phase 26 — Twelve Labs Async Embedding

Bug: createTaskAndPoll does not handle status=failed from task creation

Security (defense-in-depth): No URL scheme allowlist in buildMediaSource

Minor: Early-ready path silently falls through to polling when data is missing

Uh oh!

claude Bot commented Apr 14, 2026

Code Review: Phase 26 — Twelve Labs Async Embedding

Bugs / Issues

Minor / Informational

What looks good

Uh oh!

claude Bot commented Apr 14, 2026

PR Review — Phase 26: Twelve Labs Async Embedding

Bug: Task-creation failure discards server diagnostic detail

Latent bug: WithAudioEmbeddingOption("fused") + WithAsyncPolling(...) accepted at construction, fails at embed time

Uh oh!

claude Bot commented Apr 14, 2026

Code Review: Phase 26 — Twelve Labs Async Embedding

Bug: Inconsistent HTTP status code acceptance between sync and async paths

Bug: Default HTTP client has no timeout

Potential issue: nextBackoff overflow for extreme multiplier values

Minor: sanitizeTaskFailureDetail — JSON-object guard logic

Positive observations

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

tazarov commented Apr 14, 2026 •

edited

Loading

Bug: `context.Canceled` handling discards original error and risks nil return

Bug: `createTaskAndPoll` does not handle `status=failed` from task creation

Security (defense-in-depth): No URL scheme allowlist in `buildMediaSource`

Latent bug: `WithAudioEmbeddingOption("fused")` + `WithAsyncPolling(...)` accepted at construction, fails at embed time

Potential issue: `nextBackoff` overflow for extreme multiplier values

Minor: `sanitizeTaskFailureDetail` — JSON-object guard logic