Google Flow Agent — Gemini CLI Instructions

Base URL: http://127.0.0.1:8100

Pre-flight

Before ANY workflow:

curl -s http://127.0.0.1:8100/health
# Must return: {"extension_connected": true}
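
As a rough sketch, the check above can be wrapped in a small shell function so a workflow stops early when the extension is not connected. The function name `extension_connected` is hypothetical, and the pattern match assumes the response contains the key exactly as shown above; a real JSON parser would be more robust.

```shell
# Hypothetical helper (not part of the API): succeeds only if the JSON on
# stdin reports "extension_connected": true. Whitespace around the colon
# is tolerated.
extension_connected() {
  grep -Eq '"extension_connected"[[:space:]]*:[[:space:]]*true'
}

# Usage (requires the local server to be running):
#   curl -s http://127.0.0.1:8100/health | extension_connected \
#     || echo "extension not connected"
```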

Critical Rules (MUST follow)

Media ID is always UUID — format xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx. Never use CAMS... / base64 strings.
Scene prompts = ACTION only — never describe character appearance. Reference images handle visual consistency via imageInputs.
All reference images must exist before scene images — verify every entity has media_id before generating scene images.
No throwaway scripts — NEVER write Python, shell, or any script file to loop over API requests. Use POST /api/requests/batch to submit all requests at once, then poll GET /api/requests/batch-status. The server throttles automatically.
Locations use landscape, characters use portrait — reference image orientation depends on entity type.
UUID extraction: if a response gives a CAMS... value instead of a UUID, take the UUID from the fifeUrl in the same response, whose path has the form /image/{UUID}?... (the query string is not part of the ID).
Cascade on regen — regenerating an image auto-clears downstream video + upscale.
REGENERATE vs GENERATE — GENERATE_* skips if already COMPLETED. REGENERATE_* always runs (clears + regenerates).
Image Material required — every project needs a material field (e.g. realistic, 3d_pixar, anime). List available: GET /api/materials.
Server handles throttling — worker enforces max 5 concurrent requests + 10s cooldown. Submit ALL requests via /batch; do NOT manually stagger or loop.
Video prompts use sub-clip timing — structure 8s video as time segments: 0-3s: [action]. 3-6s: [action]. 6-8s: [action].
Character dialogue in sub-clips — embed speech in quotes: Luna says "Goodnight." Max 10-15 words per character per 2-3s segment.
Scenes are mutable — use PATCH /api/scenes/{sid} to update prompt, video_prompt, narrator_text, character_names after creation. Don't delete and recreate — patch instead.
Fact-check before scripting — ALWAYS research events via web search before writing project stories, scene prompts, or narrator text. Facts (events, dates, names, operations, outcomes) MUST match real sources. Editorial opinion and analysis are allowed but must be framed as such. Never invent events, operation names, or statistics.
Real-people bypass: when characters are based on real famous people (politicians, celebrities, military leaders), NEVER use their real name as the entity name or anywhere in description, image_prompt, prompt, or video_prompt. AI image generators reject known public figures. Instead: (a) use a role-based alias as the entity name (e.g. "Tổng Tư Lệnh", Vietnamese for "Commander-in-Chief", not "Trump"; "Thủ Tướng Sắt", "Iron Prime Minister", not "Netanyahu"), (b) describe physical appearance only (distinctive hair, face shape, build, clothing style) without naming who it is, (c) narrator_text may use real titles/roles for storytelling, but real names must never flow into image/video generation. Keep a real_reference mapping in the project plan file (.omc/research/) for internal tracking.
Review before upscale — ALWAYS run /fk-review-video (light mode) after video generation, before upscaling. Scenes scoring < 7.5 get video_prompt updated from review errors, then regen video. Max 2 review-regen cycles.
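
The two UUID rules above are mechanical enough to sketch as shell helpers. This is a minimal illustration, not the project's own tooling: the function names `is_uuid` and `uuid_from_fife_url` are hypothetical, and the example URL host in the usage comment is a placeholder.

```shell
# Hypothetical helpers; the names are illustrative, not part of the API.

# Succeeds if $1 has the 8-4-4-4-12 hexadecimal UUID shape.
is_uuid() {
  printf '%s\n' "$1" | grep -Eq \
    '^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$'
}

# Prints the UUID that follows /image/ in a fifeUrl, or nothing if the
# URL does not contain one.
uuid_from_fife_url() {
  printf '%s\n' "$1" | sed -n -E \
    's#.*/image/([0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}).*#\1#p'
}

# Usage (placeholder URL):
#   uuid_from_fife_url 'https://example.invalid/image/123e4567-e89b-12d3-a456-426614174000?sz=512'
```

A value that fails `is_uuid` (for example a CAMS... string) should not be stored as a media ID; the second helper recovers the usable ID when a fifeUrl is available.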

Pipeline Order

0. Research          /fk-research "topic" (fact-check via web search, save to .omc/research/)
1. Health check      GET  /health → extension_connected: true
2. Create project    POST /api/projects (with entities + material, story from research)
3. Create video      POST /api/videos
4. Create scenes     POST /api/scenes (with character_names, chain_type)
5. Gen ref images    POST /api/requests/batch → poll /batch-status?project_id=<PID>
                     Wait for done=true, verify all entities have media_id
6. Gen scene images  POST /api/requests/batch → poll /batch-status?video_id=<VID>
                     Wait for done=true, verify image_media_id = UUID
7. Gen videos        POST /api/requests/batch → poll /batch-status?video_id=<VID>
                     Wait for done=true (videos take 2-5 min each)
7.5 Review videos    POST /api/videos/{vid}/review?mode=light (Claude Vision quality check)
                     Pass: score >= 7.5 | Fail: update video_prompt → regen → re-review (max 2 cycles)
8. (Optional) 4K     POST /api/requests/batch (TIER_TWO only)
9. (Optional) TTS    Create voice template → POST /api/videos/{vid}/narrate
10. Concat           ffmpeg normalize + concat
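
The pass/fail gate in step 7.5 can be sketched as follows. The helper names and the returned labels (`pass`, `regen`, `stop`) are hypothetical; in particular, what happens after the second failed cycle is not specified above, so `stop` is only an assumption meaning "hand the scene back for a manual decision".

```shell
# Hypothetical helper: succeeds when a review score meets the 7.5 pass mark.
# awk does the decimal comparison, which [ -ge ] cannot do.
review_passes() {
  awk -v score="$1" 'BEGIN { exit !(score + 0 >= 7.5) }'
}

# Prints the next step for a scene, given its score ($1) and the number of
# review-regen cycles already used ($2, max 2).
next_step() {
  if review_passes "$1"; then
    echo "pass"
  elif [ "$2" -lt 2 ]; then
    echo "regen"   # update video_prompt from review errors, then regenerate
  else
    echo "stop"    # assumption: cycle budget exhausted, needs a manual call
  fi
}
```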

Batch API

Submit N requests at once (server throttles automatically — max 5 concurrent, 10s cooldown):

curl -X POST http://127.0.0.1:8100/api/requests/batch \
  -H "Content-Type: application/json" \
  -d '{"requests": [{"type": "...", "scene_id": "...", "project_id": "...", "video_id": "...", "orientation": "VERTICAL"}, ...]}'
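
To show what "all requests at once" looks like in practice, here is a minimal sketch that assembles a single batch body for several scenes. The function `batch_payload` is hypothetical; it only uses the field names from the example above, and it assumes the IDs contain no characters that need JSON escaping. It builds one request body locally and does not call the API itself.

```shell
# Hypothetical helper: prints one batch body covering several scenes.
# Usage: batch_payload TYPE PROJECT_ID VIDEO_ID ORIENTATION SCENE_ID...
# Assumes the values need no JSON escaping.
batch_payload() {
  req_type=$1
  project_id=$2
  video_id=$3
  orientation=$4
  shift 4
  sep=""
  printf '{"requests": ['
  for scene_id in "$@"; do
    printf '%s{"type": "%s", "scene_id": "%s", "project_id": "%s", "video_id": "%s", "orientation": "%s"}' \
      "$sep" "$req_type" "$scene_id" "$project_id" "$video_id" "$orientation"
    sep=", "
  done
  printf ']}\n'
}

# Usage: one POST for the whole set (curl reads the body from stdin via -d @-):
#   batch_payload GENERATE_IMAGE "$PID" "$VID" VERTICAL "$SID1" "$SID2" |
#     curl -X POST http://127.0.0.1:8100/api/requests/batch \
#       -H "Content-Type: application/json" -d @-
```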

Poll aggregate status:

curl -s "http://127.0.0.1:8100/api/requests/batch-status?video_id=<VID>&type=GENERATE_IMAGE"
# Returns: {"total": 40, "pending": 30, "processing": 5, "completed": 5, "failed": 0, "done": false}
# When "done": true → all requests have left the queue (completed or failed)
# When "all_succeeded": true → every request completed successfully
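
A status response can be read with two small helpers, sketched below under the assumption that the JSON arrives on one line in the shape shown above. The names `batch_count` and `batch_done` are hypothetical, and the pattern matching is a stand-in for a proper JSON parser.

```shell
# Hypothetical helpers for reading the aggregate status JSON from stdin.

# Prints the integer value of a counter such as total, pending, or failed.
batch_count() {
  sed -n -E 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*([0-9]+).*/\1/p'
}

# Succeeds when the response reports "done": true.
batch_done() {
  grep -Eq '"done"[[:space:]]*:[[:space:]]*true'
}

# Usage (requires the local server to be running):
#   curl -s "http://127.0.0.1:8100/api/requests/batch-status?video_id=<VID>" | batch_count failed
```

Because "done" only means the queue is empty, checking the failed counter as well tells you whether anything needs to be resubmitted.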

For full API reference, workflow recipes, and video prompt guidelines, see CLAUDE.md.

Skills

This project has reusable skills in skills/. When the user says /fk-<name>, read skills/fk-<name>.md and follow the instructions inside.

Skill	Purpose
`/fk-add-material`	Image material system.
`/fk-brand-logo`	Apply the channel brand logo to video and thumbnails.
`/fk-camera-guide`	Camera guide for cinematic video prompts.
`/fk-concat-fit-narrator`	Trim each scene video to fit its TTS narrator duration, then concatenate into a final video.
`/fk-concat`	Download and concatenate all scene videos into a single video with optional TTS narration.
`/fk-create-project`	Create a new Google Flow video project.
`/fk-creative-mix`	Creative video mixing: combine techniques for cinematic results.
`/fk-dashboard`	Show live GLA status in the Claude Code statusline.
`/fk-fix-uuids`	Find and fix any non-UUID media_ids (CAMS... format) across all scenes and entities.
`/fk-gen-chain-videos`	Generate videos with automatic scene chaining (start+end frame transitions).
`/fk-gen-images`	Generate scene images for all scenes in a video.
`/fk-gen-music`	Generate music via Suno.
`/fk-gen-narrator`	Generate narrator text and TTS for all scenes.
`/fk-gen-refs`	Generate reference images for all entities in a project.
`/fk-gen-tts-template`	Generate a voice template.
`/fk-gen-tts`	Generate TTS narration.
`/fk-gen-videos`	Generate videos for all scenes in a video.
`/fk-insert-scene`	Insert new scene(s) into an existing video chain for multi-angle shots, cutaways, or close-ups.
`/fk-research`	Fact-check & research real events via web search before scripting documentary content.
`/fk-review-video`	Review AI-generated scene videos for quality using Claude Vision.
`/fk-status`	Show full status dashboard for a project.
`/fk-thumbnail-guide`	YouTube thumbnail guide: hook-worthy design rules.
`/fk-thumbnail`	Generate 4 YouTube-optimized thumbnail variants for a project video.
`/fk-youtube-seo`	Generate SEO-optimized YouTube metadata.
`/fk-youtube-upload`	Upload a video to YouTube (Shorts and long-form).
