fal: typed image/video helpers + queue polling realism#171
Conversation
2d5bd72 to
506fef2
Compare
commit: |
jpr5
left a comment
There was a problem hiding this comment.
CR Review — 7-agent round, 4 findings fixed
Nice contribution @tombeckenham — the queue polling realism is a solid addition. We ran a full 7-agent CR pass and found 4 bugs that need fixing before merge. I've prepared the fixes but couldn't push to your fork (permissions). Here's what needs to change, all in src/fal.ts:
1. IN_PROGRESS state skipped when only pollsBeforeInProgress is set
resolveProgression defaults pollsBeforeCompleted to Math.max(pollsBeforeInProgress, 0), making both thresholds equal. Then advanceJob checks COMPLETED before IN_PROGRESS, so the job jumps IN_QUEUE → COMPLETED, never passing through IN_PROGRESS.
Fix: Default pollsBeforeCompleted to pollsBeforeInProgress + 1 when not explicitly set. Reverse the check order in advanceJob — check IN_PROGRESS first, COMPLETED second.
2. imageItemToFalImage / videoResponseToFalJson produce invalid MIME types
url.split(".").pop() splits on ALL dots including hostname dots. For https://example.com/image, produces "com/image" → content_type: "image/com/image". Also doesn't strip URL fragments.
Fix: Extract the last path segment first (url.split("?")[0].split("#")[0].split("/").pop()), then find the extension via lastIndexOf(".").
3. Double-cancel pushes duplicate log entry
No guard for already-CANCELLED state — repeated cancel calls accumulate spurious "Job cancelled." logs.
Fix: Add if (job.status === "CANCELLED") guard returning 200 with { status: "CANCELLED" } without another log push.
4. parseBody silently returns null on malformed JSON
Fal queue submit/sync-run silently degrades to empty prompt instead of returning 400. Inconsistent with every other handler in the codebase.
Fix: Have parseBody throw on parse failure, catch in the submit/run handler and return 400 with error detail.
I have a commit with all 4 fixes ready — if you enable maintainer edits or want me to push it a different way, let me know. Otherwise you can apply the changes yourself from the descriptions above. All 2849 tests still pass after the fixes.
Note on maintainer editsWe tried to push our fix commits directly to your PR branch but hit a known GitHub platform limitation: "Allow edits from maintainers" is silently ignored for organization-owned forks (community discussion #5634, #9921). Your fork lives under For future PRs: if you fork from your personal account ( For this PR, you can apply the 4 fixes from the review above, or let us know and we'll cherry-pick your commit onto a CopilotKit-owned branch and open a replacement PR with the fixes included (crediting your original commit). |
Closes CopilotKit#170. The general fal handler (CopilotKit#152) accepts opaque `RawJSONResponse` payloads and always reports `COMPLETED` on submit. Two follow-up gaps this PR closes: 1. Typed helpers for image and video. Callers building fal mocks against the real wire shape currently hand-write the envelope every time (`{ images: [{ url, width, height, content_type }], timings, seed, has_nsfw_concepts, prompt }` for image; `{ video: { url, content_type, file_name, file_size }, seed }` for video). Add `LLMock.onFalImage(prompt, ImageResponse, opts?)` and `LLMock.onFalVideo(prompt, VideoResponse, opts?)` that accept the same response shapes used by `onImage` / `onVideo` and translate them into the fal envelope under the hood. Both delegate to `onFalQueue` so recording, matching, and lifecycle behavior stay identical. The converters live in `src/fal.ts` next to the queue logic. 2. Queue polling realism. Opt in via `MockServerOptions.falQueue: { pollsBeforeInProgress, pollsBeforeCompleted }`. When set, jobs advance `IN_QUEUE → IN_PROGRESS → COMPLETED` over the configured number of `/status` (or `/{id}`) polls. Status responses include `logs: [{ timestamp, level, message }]` (one entry per state transition) and `metrics.inference_time` (wall-clock elapsed since submit) once `COMPLETED`. Cancel before completion returns `200 { status: "CANCELLED" }`; after still returns `400 { status: "ALREADY_COMPLETED" }`. Result fetch before completion returns `202` with the current status body (was previously unreachable). Defaults preserve the existing instant-complete behavior so existing tests and fixtures remain green without change. Tests: 8 new cases (image/video lifecycles, sync run with image envelope, polling progression with logs+metrics, result-before-complete, cancel-before and cancel-after, ImageItem URL fallback). All 2849 tests pass. Docs: new "Typed Helpers" and "Polling Realism" sections in docs/fal-ai/. README "Multimedia APIs" line includes fal.ai.
tombeckenham
left a comment
There was a problem hiding this comment.
Good point. Let me move this. I was half way through this. Thanks for being proactive
The recorder shortcut wrote the IN_QUEUE submit envelope to fixtures and
never seeded falQueueStates, so replay's GET /requests/<id> returned the
envelope instead of the model output and broke fal.subscribe() consumers.
The recorder now walks submit → status → result upstream and persists
the final body. sync-run keeps the single-call generic recorder. Ported
the same fix to the legacy /fal/queue/submit/{model} audio path.
Adds RecordConfig.fal.{pollIntervalMs, timeoutMs} (defaults 1s / 120s),
extracts persistFixture from proxyAndRecord, and exports
buildFixtureMatch so the new walker stays consistent with the generic
recording path.
506fef2 to
dbae35c
Compare
Video generations (kling, veo, runway, etc.) routinely take 5-10 minutes on the upstream queue. The previous 2 min default tripped legit recording flows before the job could complete.
- Default pollsBeforeCompleted to pollsBeforeInProgress + 1 when only the in-progress threshold is set, and reverse advanceJob check order so equal thresholds still pass through IN_PROGRESS. - Extract URL extension from the last path segment with query string and fragment stripped — fixes invalid content_type for URLs like https://example.com/image (was image/com/image). - Guard cancel handler against double-cancel so a repeated PUT returns 200 CANCELLED without appending a duplicate log entry. - parseBody now throws on malformed JSON; submit/sync-run handler catches and returns 400 invalid_json, matching every other handler in the codebase.
|
Thanks for the thorough review! Pushed 37dd3fb addressing all four issues: 1. IN_PROGRESS skipped when only
2. Invalid MIME types from naive URL splitting (
3. Double-cancel duplicate log entry (
4.
Also added 5 regression tests in |
Addresses follow-up review feedback on the queue handler fixes: - Fragment test now uses a fragment-only URL — the previous `?q=#frag` variant was identically handled by the old buggy code (query truncation also killed the fragment), so the test wasn't actually exercising the bug it claimed to. - Add video URL extension coverage (missing-extension + fragment cases) — the `videoResponseToFalJson` refactor was previously only happy-pathed. - Add equal-thresholds test pinning the advanceJob if/else reorder. - Restore the `Math.max` clamp in resolveProgression: an explicit pollsBeforeCompleted lower than pollsBeforeInProgress now clamps up, matching the pre-existing constraint documented in types.ts. - Update FalQueueConfig JSDoc to reflect the new default-derivation rules and clamp behavior.
…alid JSON - has_nsfw_concepts is an array (one per image), not a scalar - Document pollsBeforeCompleted auto-default behavior - Document clamping when pollsBeforeCompleted < pollsBeforeInProgress - Add invalid JSON error row to Queue Lifecycle table
Closes #170.
This branch was originally written against the pre-#152 fal handler. Upstream landed the general handler with host-aware routing and
RawJSONResponsesupport before I opened the PR, so I've rebased and trimmed the scope to what's still missing: typed helpers and queue polling realism.What
1. Typed helpers —
onFalImage/onFalVideoThe current
onFalQueue(modelOrPrompt, response: unknown, opts?)takes opaque JSON. Callers building fal mocks against the real wire shape hand-write the envelope every time:This PR adds two helpers that accept the same
ImageResponse/VideoResponseshapes used byonImage/onVideoand translate to fal's wire shape under the hood:Both delegate to
onFalQueueso recording, matching, and lifecycle behavior stay identical.2. Queue polling realism
falQueueStatesalways stores jobs asCOMPLETEDon submit (src/fal.tspre-PR line 421). Production code that polls/statusand reacts toIN_QUEUE/IN_PROGRESS(e.g. queue position decay, log streaming, latency metrics) can't be exercised end-to-end against aimock today.Opt in with
MockServerOptions.falQueue: { pollsBeforeInProgress, pollsBeforeCompleted }:IN_QUEUE(when thresholds > 0) withqueue_position./status(or/{id}) poll advancesIN_QUEUE → IN_PROGRESS → COMPLETED.logs: [{ timestamp, level, message }](one entry per transition) andmetrics.inference_time(wall-clock elapsed since submit) onceCOMPLETED.200 { status: "CANCELLED" }; cancel after →400 { status: "ALREADY_COMPLETED" }(unchanged).202with current status body.Defaults preserve existing behavior: both thresholds default to
0so unconfigured mocks still reportCOMPLETEDon submit, no test churn.Out of scope (already shipped or follow-up)
x-fal-target-host) — already in upstreamfal.ts.rest.alpha.fal.ai/storage/upload/initiate— already in upstreamfal.ts.Test plan
src/__tests__/fal.test.ts: image lifecycle, video lifecycle, sync run with image envelope, pollingIN_QUEUE → IN_PROGRESS → COMPLETEDwithlogs[]+metrics.inference_time, result-before-complete (202), cancel-before-complete (CANCELLED), cancel-after-complete (ALREADY_COMPLETED),ImageItemURL fallback.pnpm test— 2849 passed, 37 skipped, no regressions.pnpm run format:checkclean.pnpm run lintclean.Files
src/fal.ts— typed-response converters (imageResponseToFalJson,videoResponseToFalJson); queue progression state machine (advanceJob,queuePosition,statusResponseBody);FalQueueJobextended withpollCount,pollsBeforeInProgress,pollsBeforeCompleted,submittedAt,completedAt,logs.src/llmock.ts—onFalImage,onFalVideo.src/types.ts—FalQueueConfig;MockServerOptions.falQueue;HandlerDefaults.falQueue.src/server.ts— propagatefalQueueto handler defaults.src/__tests__/fal.test.ts— new describe block for typed helpers + polling.docs/fal-ai/index.html— new sections "Typed Helpers" and "Polling Realism"; queue lifecycle table now mentions logs/metrics/CANCELLED.README.md— fal.ai added to the Multimedia APIs line.Follow-up: 15 min queue-walk default (commit 723fa5c)
Bumped
DEFAULT_FAL_TIMEOUT_MSfrom 2 min → 15 min. Video generations (kling, veo, runway, etc.) routinely take 5–10 minutes on the upstream queue; the prior default was tripping legitimate recording flows mid-walk. Callers needing a tighter budget can still override viarecord.fal.timeoutMs. Existing recorder tests pin their owntimeoutMs(2000/5000ms) so the bumped default is exercised only via callers that omit it.pnpm testnow: 3013 passed.Verification
Verified end-to-end against the OpenStory app — recording a kling-video/v3/pro image-to-video job now completes the queue walk and persists the final job body as expected.