Skip to content

Phase 26: Twelve Labs Async Embedding#509

Open
tazarov wants to merge 42 commits into
mainfrom
feat/phase-26-twelve-labs-async-embedding
Open

Phase 26: Twelve Labs Async Embedding#509
tazarov wants to merge 42 commits into
mainfrom
feat/phase-26-twelve-labs-async-embedding

Conversation

@tazarov

@tazarov tazarov commented Apr 14, 2026

Copy link
Copy Markdown
Contributor

Summary

Phase 26: Twelve Labs Async Embedding
Goal: Twelve Labs provider handles async task responses for long-running audio and video embeddings
Status: Verified on 2026-04-14 (26-VERIFICATION.md status: passed)

This phase adds an explicit async path for Twelve Labs long-running media embeddings. Audio and video callers can now opt in with WithAsyncPolling(maxWait) to create a task, poll until it reaches a terminal state, and return the final embedding or a sanitized failure. Text/image callers and all non-opt-in traffic remain on the existing synchronous path.

Changes

Plan 26-01: Async HTTP foundation

Added dedicated async request/response types for the tasks API, decoded the _id alias correctly, preserved raw task failure bodies for later sanitization, and introduced internal polling configuration fields/defaults plus task POST/GET helpers.

Key files:

  • pkg/embeddings/twelvelabs/twelvelabs.go

Plan 26-02: Polling loop and modality routing

Implemented the async polling loop with capped backoff, distinct cancellation/deadline/maxWait behavior, failure-body sanitization, fused-audio rejection on the async path, and audio/video routing into the tasks flow while leaving text/image unchanged.

Key files:

  • pkg/embeddings/twelvelabs/twelvelabs_async.go
  • pkg/embeddings/twelvelabs/content.go

Plan 26-03: Public option and config round-trip

Added the public WithAsyncPolling(maxWait) option and made async config persist/rebuild through async_polling and async_max_wait_ms without changing default behavior for callers who do not opt in.

Key files:

  • pkg/embeddings/twelvelabs/option.go
  • pkg/embeddings/twelvelabs/twelvelabs.go

Plan 26-04: Async test coverage

Added 12 async-focused tests covering task creation, ready/failed/unexpected task states, context cancellation, SDK maxWait behavior, sync-path preservation for text/image, config round-trip, raw failure sanitization, blocked HTTP maxWait enforcement, and fused-mode rejection.

Key files:

  • pkg/embeddings/twelvelabs/twelvelabs_test.go

Requirements Addressed

  • TLA-01: Twelve Labs provider detects async task responses and enters a polling loop
  • TLA-02: Async polling respects caller context for cancellation and timeout
  • TLA-03: Async polling handles terminal states with appropriate result delivery or error messages
  • TLA-04: Tests cover async task creation, polling, completion, failure, and context cancellation

Verification

  • Automated verification passed (26-VERIFICATION.md, 4/4 must-haves verified on 2026-04-14)
  • go test -tags=ef -count=1 ./pkg/embeddings/twelvelabs/...
  • make lint
  • Human verification not required for this library change

Key Decisions

  • Async stays opt-in through WithAsyncPolling(maxWait); polling internals remain hidden.
  • Audio/video route through the tasks API; text/image always stay on sync /embed-v2.
  • Async audio request payloads use embedding_option as []string, and task responses decode Twelve Labs’ _id field.
  • Failed-task errors sanitize the authentic raw response body instead of a re-marshaled subset.
  • fused audio is rejected on the async path instead of being silently remapped.

tazarov added 30 commits April 14, 2026 09:01
Research F-01 collapsed the task API to two endpoints (POST /tasks and
GET /tasks/{id}); the in-scope bullet still listed a third /status
sub-path that does not exist. Aligns CONTEXT with RESEARCH and with the
plans to prevent executor drift.
…ayload

Adds a `FailureDetail json.RawMessage` field to TaskResponse (json:"-")
and has doTaskGet populate it with the raw response bytes on each GET
/tasks/{id}. This lets Plan 02's failed-status branch sanitize the
authentic server-provided reason instead of re-marshaling the parsed
struct subset, which was dropping undeclared reason fields (codex HIGH,
D-17 compliance).
…used rejection

- Per-HTTP-call deadline (codex HIGH): every doTaskGet is now wrapped in
  context.WithDeadline(ctx, min(parentDL, sdkMaxWaitDeadline)) so a
  blocked/hung HTTP request cannot outlive maxWait (D-09 hard bound).
  Translates deadline-source back into distinct errors — SDK maxWait
  vs parent ctx.DeadlineExceeded (D-20).
- Fused-on-async rejection (codex/gemini HIGH): contentToAsyncRequest
  now rejects audioOpt == "fused" with a deterministic validation
  error; async endpoint accepts only "audio" and "transcription"
  (RESEARCH F-02 / A5).
- Failed-reason authenticity (codex HIGH): failed-status branch uses
  resp.FailureDetail (raw server body from Plan 01) rather than
  re-marshaling the parsed TaskResponse (D-17).
- Distinct ctx-cancel vs deadline wording (codex MEDIUM): context.Canceled
  wraps as "async polling canceled"; context.DeadlineExceeded wraps as
  "async polling deadline exceeded".
… malformed

- Use the existing embeddings.ConfigInt helper (pkg/embeddings/embedding.go:674)
  instead of reimplementing int/int64/float64 switching (codex LOW).
- Distinguish missing key from malformed value: async_polling=true AND
  malformed async_max_wait_ms no longer silently falls back to
  WithAsyncPolling(0) (which would enable async with the 30-minute
  default on broken config). The flag stays off rather than enabling
  on a broken round-trip (codex MEDIUM).
…tests

- Fix compile blocker: replace all 5 embeddings.ContentPart references
  with embeddings.Part (codex HIGH/compile-blocker — multimodal.go:61
  defines Part, there is no ContentPart in the embeddings package).
- Add TestTwelveLabsAsyncFailedReasonSanitized: long-reason failed-task
  fixture asserts the authentic server reason reaches the error and is
  sanitized (ratchets D-17 via a runtime test).
- Add TestTwelveLabsAsyncBlockedHTTPMaxWait: blocking httptest handler
  proves the per-HTTP-call deadline added in Plan 02 actually unblocks
  an in-flight GET when maxWait fires.
- Add TestTwelveLabsAsyncFusedRejected: WithAudioEmbeddingOption("fused")
  + WithAsyncPolling + audio content must return a deterministic
  validation error before any HTTP call.
- Assert APIKeyEnvVar survives the config round-trip (gemini review).
- Add AsyncEmbedV2Request with AsyncAudioInput (embedding_option as []string) and AsyncVideoInput
- Add TaskCreateResponse and TaskResponse with Mongo-style `_id` JSON alias
- TaskResponse.FailureDetail (json.RawMessage, excluded from JSON) preserves raw body for D-17 sanitization
- Extend TwelveLabsClient with unexported async polling fields
- applyDefaults sets initial=2s, multiplier=1.5, cap=60s per D-11
- doTaskPost: POST {BaseAPI}/tasks with JSON body; mirrors doPost headers and error sanitization
- doTaskGet: GET {BaseAPI}/tasks/{id} with URL-escaped task ID; empty-ID guard rejects the Mongo-alias footgun
- Both use chttp.ReadLimitedBody + chttp.SanitizeErrorBody (Phase 25 convention)
- doTaskGet preserves raw response bytes in TaskResponse.FailureDetail (D-17) so Plan 02 can sanitize the authentic server failure reason
- Lint-silenced with //nolint:unused until Plans 26-02/03 wire the polling loop and option
- expose single public async trigger (D-03, D-04)
- maxWait=0 selects 30-minute default
- reject negative maxWait with explicit error
- sets asyncPollingEnabled=true and asyncMaxWait on client
- GetConfig emits async_polling and async_max_wait_ms only when enabled (D-21, D-22)
- FromConfig reads async_polling (bool) and async_max_wait_ms via embeddings.ConfigInt
- malformed async_max_wait_ms leaves async disabled (no silent 30-min fallback)
- missing keys = async off (opt-in only via WithAsyncPolling, D-23)
- pollTask: capped exponential backoff polling loop using time.NewTimer
  (no time.After leak per Pitfall 2) with per-HTTP-call deadline bounded
  by min(parent ctx deadline, SDK maxWait deadline) so a blocked
  doTaskGet cannot outlive maxWait (D-09 hard bound).
- Distinct error wording for ctx.Canceled, ctx.DeadlineExceeded, and
  SDK-side maxWait expiry (D-20). errors.Is still unwraps to stdlib
  sentinels via pkg/errors.
- Terminal status=failed uses raw server body from TaskResponse.FailureDetail
  through chttp.SanitizeErrorBody — no re-marshaled struct subset (D-17).
- D-16 default branch rejects unknown status values.
- contentToAsyncRequest rejects audio embedding option 'fused' on the
  async path per RESEARCH F-02 / Assumption A5 (async accepts only
  'audio' and 'transcription').
- createTaskAndPoll orchestrates build -> POST -> optional early-ready
  short-circuit -> poll -> extract embedding.
- //nolint:unused annotations on the symbols until Task 2 wires routing
  in content.go (pattern from Plan 26-01).
- document WithAsyncPolling public surface and round-trip keys
- note malformed-value → async-off decision (no silent 30-min fallback)
- record reuse of embeddings.ConfigInt helper across numeric JSON types
…abled

- embedContent now dispatches to createTaskAndPoll when
  asyncPollingEnabled is true AND the single part is audio or video
  (CONTEXT.md D-07). Text/image always take the sync path (D-07).
- When asyncPollingEnabled is false, all four modalities continue on
  the sync doPost path — zero behavior change for non-opt-in callers
  (D-08).
- Removed //nolint:unused annotations from twelvelabs_async.go — all
  symbols are now reachable from embedContent via the routing switch.
…ed-status tests

- Add newTestAsyncEF helper with ms-scale polling intervals
- Add audioContent/videoContent + taskCreateJSON/taskGetJSON fixtures
- Fixtures emit _id (not id) to guard Pitfall 1 alias gotcha
- Cover 4 of 6 D-26 flows: create, poll-to-ready, poll-to-failed, unexpected-status
…rip, and review-fix tests

- TestTwelveLabsAsyncCtxCancel: proves errors.Is(err, context.Canceled) (D-10/D-19)
- TestTwelveLabsAsyncMaxWait: proves 'async polling maxWait' distinct from DeadlineExceeded (D-20)
- TestTwelveLabsAsyncSkipsTextImage: text/image never hit /tasks even when async is on (D-07)
- TestTwelveLabsAsyncConfigRoundTrip: async_polling/async_max_wait_ms + APIKeyEnvVar survive round-trip (D-23)
- TestTwelveLabsAsyncConfigOmitWhenDisabled: async keys omitted when disabled (D-22)
- TestTwelveLabsAsyncFailedReasonSanitized: raw server reason survives sanitization (D-17)
- TestTwelveLabsAsyncBlockedHTTPMaxWait: per-call deadline unblocks hanging GET within maxWait
- TestTwelveLabsAsyncFusedRejected: fused+async rejected before POST /tasks (F-02)
@claude

claude Bot commented Apr 14, 2026

Copy link
Copy Markdown

Code Review: Phase 26 — Twelve Labs Async Embedding

Bug: context.Canceled handling discards original error and risks nil return

In both pollTask (twelvelabs_async.go:88) and createTaskAndPoll (twelvelabs_async.go:173), the context.Canceled branch wraps ctx.Err() instead of the original transport error err:

if errors.Is(err, context.Canceled) {
    return nil, errors.Wrap(ctx.Err(), "Twelve Labs async polling canceled")
}

This is inconsistent with the context.DeadlineExceeded branch directly above, which correctly preserves the original error:

return nil, errors.Wrapf(err, "Twelve Labs async polling deadline exceeded")

Two problems:

  1. Lost debug context: The original err contains the full transport error chain (e.g., "failed to send task retrieve request to .../tasks/abc: context canceled"). Wrapping only ctx.Err() discards this, making cancel-related failures harder to diagnose.

  2. Potential nil error / nil-pointer panic: pkg/errors.Wrap(nil, msg) returns nil. If errors.Is(err, context.Canceled) is true but ctx.Err() is nil (possible with custom transports or derived sub-contexts), this returns (nil, nil). The caller then dereferences the nil *TaskResponse at resp.Status (pollTask) or nil *TaskCreateResponse (createTaskAndPoll), causing a panic — violating the library's no-panic contract from CLAUDE.md.

Fix: Use errors.Wrap(err, ...) instead of errors.Wrap(ctx.Err(), ...) in both locations, matching the DeadlineExceeded branch pattern.


Minor: Config round-trip truncation on 32-bit platforms

GetConfig() stores asyncMaxWait.Milliseconds() as int64, but NewTwelveLabsEmbeddingFunctionFromConfig reads it via embeddings.ConfigInt() which casts int64int. On 32-bit systems, maxWait values exceeding ~24.8 days would silently truncate. Not a practical concern for the 30-minute default, but worth noting if the API evolves.


Note: PR description outdated

The PR description lists a known issue: "Initial doTaskPost is not yet bounded by asyncMaxWait." However, the implementation in createTaskAndPoll (lines 149–156) already bounds it with a derived deadline context, and TestTwelveLabsAsyncMaxWaitHardBound verifies this. The description should be updated to reflect that this was resolved.


Positive observations

  • Consistent use of url.PathEscape(taskID) prevents path injection from server-controlled task IDs.
  • All response bodies read through chttp.ReadLimitedBody — no unbounded reads.
  • Error bodies consistently sanitized via chttp.SanitizeErrorBody.
  • Backoff timer correctly stopped on context cancellation (no timer leak).
  • FailureDetail correctly copies response bytes to avoid buffer reuse issues.
  • Good test coverage including edge cases (blocked HTTP, fused rejection, empty ID, early-ready).

… transport errors

- .github/workflows/go.yml: pass GITHUB_TOKEN to both crosslang test
  steps so pure-onnx's bootstrap can use authenticated GitHub API
  requests. The cross-language tests auto-wire the default EF from
  server collection config, which triggers ONNX runtime bootstrap;
  anonymous GitHub API quota (60 req/hr per runner IP) was causing
  intermittent failures surfacing as "embedding function is required"
  when bootstrap failed silently during auto-wire.
- twelvelabs_async.go: mirror the context.DeadlineExceeded branch and
  wrap the original transport error (err) in the context.Canceled
  branches of pollTask and createTaskAndPoll. Previous code wrapped
  ctx.Err() which both discarded the transport chain and risked
  (nil, nil) returns from pkg/errors.Wrap(nil, ...), violating the
  library's no-panic contract.
@claude

claude Bot commented Apr 14, 2026

Copy link
Copy Markdown

PR Review: Phase 26 — Twelve Labs Async Embedding

Overall this is well-structured with thorough test coverage and careful deadline/cancellation handling. A few issues found:

Bug: createTaskAndPoll does not handle status=failed from task creation

In twelvelabs_async.go:181, only taskStatusReady is handled as an early return from the create response:

if created.Status == taskStatusReady && len(created.Data) > 0 {
    return buildEmbeddingFromData(created.Data)
}

If the server returns a task with status: "failed" and a non-empty _id in the create response, the code falls through to pollTask, which fires an unnecessary GET request. Worse, the failure details from the create response body are lost — doTaskPost does not capture FailureDetail like doTaskGet does, so the error message from the poll GET may differ from the original failure reason.

Suggested fix: Add a taskStatusFailed check after the ready check:

if created.Status == taskStatusFailed {
    return nil, errors.Errorf("Twelve Labs task [%s] failed at creation", created.ID)
}

Security (defense-in-depth): No URL scheme allowlist in buildMediaSource

content.go:128 validates that the URL has a scheme and host, which correctly rejects file://, data:, and javascript: URLs (empty host). However, schemes like ftp://, gopher://, or dict:// with a valid host would pass validation and be sent to the Twelve Labs API. If the API fetches these URLs server-side, this could enable SSRF via the client library.

Suggested fix: Add a scheme allowlist:

if !strings.EqualFold(parsed.Scheme, "http") && !strings.EqualFold(parsed.Scheme, "https") {
    return MediaSource{}, errors.Errorf("URL source scheme %q not supported; use http or https", parsed.Scheme)
}

Minor: Early-ready path silently falls through to polling when data is missing

twelvelabs_async.go:181 requires len(created.Data) > 0 alongside taskStatusReady. If the server returns status: "ready" with an empty data array, the code falls through to polling instead of returning an immediate error. This masks a malformed server response behind an extra network request that will likely produce the same result.


Test coverage is solid — 25+ async tests with good edge case coverage (blocked HTTP, parent deadline, fused rejection, config round-trip). The per-HTTP-call deadline wrapping in pollTask is particularly well done, with correct discrimination between SDK maxWait, parent deadline, and client timeout errors.

🤖 Generated with Claude Code

- createTaskAndPoll: treat status=failed on the POST /tasks response as
  a terminal error with a distinct message, and treat status=ready as
  terminal regardless of data shape (let buildEmbeddingFromData surface
  malformed responses). Avoids a wasteful GET round-trip in both cases.
- buildMediaSource: reject URL sources whose scheme is not http or
  https. Defense-in-depth against sending non-web schemes to the
  Twelve Labs API via the client.
- Add tests:
  - TestTwelveLabsAsyncCreateReturnsFailed pins the no-GET path for
    failed-on-create.
  - TestTwelveLabsAsyncCreateReturnsReadyEmptyData pins the no-GET
    path for malformed ready-with-no-data.
  - TestBuildMediaSourceURLValidation extends with ftp/gopher/file
    scheme rejection and plain http acceptance.
@claude

claude Bot commented Apr 14, 2026

Copy link
Copy Markdown

Code Review: Phase 26 — Twelve Labs Async Embedding

Overall this is a well-structured implementation. The polling loop, deadline propagation, and error classification are thorough. A few findings below.

Bugs / Issues

1. No minimum validation on asyncMaxWait (option.go:112)

WithAsyncPolling(1 * time.Nanosecond) is accepted but will immediately time out before any HTTP request completes. Consider enforcing a minimum (e.g., 1 second) to prevent confusing instant-timeout behavior.

2. SDK maxWait timeout errors are not programmatically distinguishable (twelvelabs_async.go:79)

errors.Errorf with %v intentionally breaks the error chain so errors.Is(err, context.DeadlineExceeded) returns false (D-20). But this means callers have no way to programmatically detect an SDK maxWait timeout other than string matching. Consider defining a sentinel error (e.g., var ErrAsyncMaxWaitExceeded) that callers can check with errors.Is, while still keeping it distinct from context.DeadlineExceeded.

3. Stale PR review note

The PR body states: "Initial doTaskPost is not yet bounded by asyncMaxWait". But the code does bound it -- createTaskAndPoll wraps the create call with context.WithDeadline derived from sdkMaxWaitDeadline (lines 149-158 of twelvelabs_async.go). This note should be removed to avoid confusing future reviewers.

Minor / Informational

4. Empty failure detail produces slightly odd error message

If a server returns {"status":"failed"} with no error/message/reason fields, sanitizeTaskFailureDetail falls through to dumping the raw JSON body. The error becomes Twelve Labs task [abc] terminal status=failed: {"_id":"abc","status":"failed"}. Consider returning a generic fallback like "(no failure detail provided)" when extractTaskFailureDetail finds no structured reason and body is non-nil.

5. Config round-trip test does not cover JSON serialization

TestTwelveLabsAsyncConfigRoundTrip passes the config map directly in-memory. If the config is persisted via JSON (e.g., a registry), int64 values decode as float64. ConfigInt handles this, but the test does not exercise that path. Adding a JSON marshal/unmarshal step to the round-trip test would lock in this guarantee.

What looks good

  • Deadline propagation through createTaskAndPoll -> pollTask correctly shares a single maxWait budget (D-09 hard bound)
  • Per-HTTP-call deadline (callCtx) prevents a hung GET from outliving maxWait
  • Error classification (D-20) correctly distinguishes parent context deadline, SDK maxWait, HTTP client timeout, and cancellation -- with good test coverage
  • url.PathEscape on task IDs prevents path traversal
  • FailureDetail uses json:"-" with manual population to preserve the authentic server response
  • SanitizeErrorBody on all user-visible error paths prevents log flooding
  • Fused-audio rejection on the async path is deterministic and tested

No critical bugs or security issues found.

- Reject sub-floor asyncMaxWait at option-application time (< 2s can't
  complete one poll cycle); extract defaultAsyncPollInitial constant so
  applyDefaults and WithAsyncPolling share a single floor source of truth.
- Introduce ErrAsyncMaxWaitExceeded sentinel wrapped via fmt.Errorf %w at
  all 4 maxWait-exceeded sites so callers can detect SDK-enforced timeouts
  via errors.Is without string matching. Chain-break from
  context.DeadlineExceeded preserved (D-20).
- sanitizeTaskFailureDetail returns "(no failure detail provided)" when
  extractTaskFailureDetail yields nothing and the body is a JSON object
  with no diagnostic fields, avoiding leakage of housekeeping fields
  (_id, status) into user-facing errors.
- Extend TestTwelveLabsAsyncConfigRoundTrip with json.Marshal/Unmarshal
  so the float64-coercion path used by registry-persisted configs is
  locked in.
@claude

claude Bot commented Apr 14, 2026

Copy link
Copy Markdown

PR Review — Phase 26: Twelve Labs Async Embedding

Overall this is a well-structured PR with solid timeout/cancellation handling and good test coverage. Two issues found:

Bug: Task-creation failure discards server diagnostic detail

twelvelabs_async.go:196 — When doTaskPost returns a response with status=failed, the error message includes only the task ID:

case taskStatusFailed:
    return nil, errors.Errorf("Twelve Labs async task [%s] terminal status=failed at creation", created.ID)

Compare with the polling path (twelvelabs_async.go:107) which preserves and sanitizes the server's failure reason:

return nil, errors.Errorf("Twelve Labs task [%s] terminal status=failed: %s", taskID, sanitizeTaskFailureDetail(resp.FailureDetail))

The TaskCreateResponse struct has no FailureDetail field and doTaskPost does not capture the raw response body for failed-status responses (unlike doTaskGet at twelvelabs.go:354-356). If a task creation ever returns status=failed (acknowledged as rare in the comment at line 188-193), the caller gets no server-side context about why it failed, making debugging harder.

Suggested fix: Either add raw-body capture to doTaskPost (mirroring doTaskGet), or pass the structured TaskCreateResponse fields through the error so the caller gets whatever the server returned.


Latent bug: WithAudioEmbeddingOption("fused") + WithAsyncPolling(...) accepted at construction, fails at embed time

option.go:89-98 accepts "fused" as a valid audio option. twelvelabs_async.go:31-34 rejects it on the async path. This means a client constructed with both options will successfully create without error, then fail at the first async audio embed call:

// Succeeds at construction
ef, _ := NewTwelveLabsEmbeddingFunction(
    WithAPIKey("..."),
    WithAudioEmbeddingOption("fused"),
    WithAsyncPolling(10 * time.Minute),
)

// Fails here at runtime, potentially minutes into a pipeline
emb, err := ef.EmbedContent(ctx, audioContent)

"fused" is legitimately valid on the sync path, so the option itself isn't wrong — but the combination with async is a misconfiguration that could be caught at NewTwelveLabsClient validation time (in validate()), surfacing it before any work is done.


Everything else looks solid — the per-call deadline derivation with distinct error classification (SDK maxWait vs. parent context), url.PathEscape on the task ID, bounded response reads via ReadLimitedBody, and the SanitizeErrorBody usage on all error paths are all good patterns.

…ime failure reason

- Add FailureDetail field to TaskCreateResponse and populate it in doTaskPost
  when a 2xx response carries status=failed. The async.go error path now
  surfaces the sanitized server reason via sanitizeTaskFailureDetail, matching
  the GET /tasks symmetry (D-17) so rare create-time terminal failures are
  debuggable instead of returning just the task ID.
- Reject WithAudioEmbeddingOption("fused") + WithAsyncPolling(...) in
  validate() at NewTwelveLabsEmbeddingFunction time. The async /tasks endpoint
  only accepts "audio" and "transcription" (RESEARCH F-02); previously this
  combination constructed successfully and failed at first EmbedContent —
  now it fails loudly at construction. Runtime defense-in-depth check in
  contentToAsyncRequest is retained.
@claude

claude Bot commented Apr 14, 2026

Copy link
Copy Markdown

Code Review: Phase 26 — Twelve Labs Async Embedding

Overall this is a well-structured addition with good test coverage (24+ async tests), proper context propagation, and careful error differentiation. A few issues worth addressing:

Bug: Inconsistent HTTP status code acceptance between sync and async paths

doPost (twelvelabs.go:264) only accepts 200 OK:

if resp.StatusCode != http.StatusOK {

While doTaskPost (line 307) and doTaskGet (line 353) accept any 2xx:

if resp.StatusCode < 200 || resp.StatusCode >= 300 {

If the Twelve Labs API ever returns 201 Created for task creation (common REST convention), doPost would reject it but the async endpoints would accept it. This inconsistency should be unified — either all should accept 2xx or all should accept only specific status codes.

Bug: Default HTTP client has no timeout

applyDefaults (twelvelabs.go:60) creates a bare &http.Client{} with no timeout. While the async path mitigates this with per-call context deadlines derived from maxWait, the sync doPost path has no such protection — a hung Twelve Labs server would block the caller indefinitely. Consider adding a default timeout (e.g., 30s or 60s) to the fallback client:

c.Client = &http.Client{Timeout: 30 * time.Second}

Potential issue: nextBackoff overflow for extreme multiplier values

nextBackoff (twelvelabs_async.go:140) converts time.Duration to float64 and back:

next := time.Duration(float64(cur) * mul)

While the multiplier is validated to be >= 1.0, there's no upper bound validation. An extremely large multiplier (e.g., math.MaxFloat64) could overflow time.Duration (int64) producing a negative value, which would bypass the cap check and create a negative sleep. The cap comparison saves this in practice since the cap is validated, but adding if next <= 0 { next = backoffCap } would be more defensive.

Minor: sanitizeTaskFailureDetail — JSON-object guard logic

In twelvelabs.go:578:

var probe map[string]any
if len(body) > 0 && json.Unmarshal(body, &probe) != nil {
    // non-JSON bodies — surface them
}

This only surfaces non-JSON bodies. If the body is valid JSON but has no known diagnostic fields (e.g., {"_id":"x","status":"failed","some_unknown_field":"useful_info"}), the raw body is discarded and the generic "(no failure detail provided)" is returned. This is documented as intentional (avoiding leaking housekeeping fields), but genuinely useful server-side fields not in the known-key list (message, failure_reason, reason, detail, error) will be silently lost.

Positive observations

  • URL validation in buildMediaSource correctly blocks file://, javascript:, and other non-HTTP schemes — good SSRF defense-in-depth
  • url.PathEscape(taskID) in doTaskGet prevents path traversal
  • chttp.SanitizeErrorBody is used consistently to prevent credential/token leakage in error messages
  • FailureDetail using json:"-" + manual byte copy avoids both re-marshaling data loss and buffer aliasing bugs
  • The dual-deadline design (parent ctx deadline vs SDK maxWait) with distinct error messages is well-implemented
  • ErrAsyncMaxWaitExceeded is correctly kept separate from context.DeadlineExceeded for caller introspection
  • Test coverage is thorough — including edge cases like empty task IDs, ready-on-create, blocked HTTP calls, and config round-trips through JSON

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant