Google Flow Agent — Gemini CLI Instructions

Base URL: http://127.0.0.1:8100

Pre-flight

Before ANY workflow:

curl -s http://127.0.0.1:8100/health
# Must return: {"extension_connected": true}
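
As a minimal sketch of gating a workflow on this check, the helper below reads the /health response on stdin and succeeds only when extension_connected is true. The function name extension_ready is hypothetical, and the pattern assumes the flag appears as a plain JSON boolean, as in the sample response above.

```shell
# Hypothetical helper (not part of the project): reads the /health JSON
# on stdin and exits 0 only if it contains "extension_connected": true.
extension_ready() {
  grep -Eq '"extension_connected"[[:space:]]*:[[:space:]]*true'
}

# Intended use (requires the local server to be running):
#   curl -s http://127.0.0.1:8100/health | extension_ready \
#     || echo "extension not connected, stop here" >&2
```

Any other response, including an empty one when the server is down, makes the helper fail, so the workflow stops before any request is submitted.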

Critical Rules (MUST follow)

  1. Media ID is always UUID — format xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx. Never use CAMS... / base64 strings.
  2. Scene prompts = ACTION only — never describe character appearance. Reference images handle visual consistency via imageInputs.
  3. All reference images must exist before scene images — verify every entity has media_id before generating scene images.
  4. No throwaway scripts — NEVER write Python, shell, or any script file to loop over API requests. Use POST /api/requests/batch to submit all requests at once, then poll GET /api/requests/batch-status. The server throttles automatically.
  5. Locations use landscape, characters use portrait — reference image orientation depends on entity type.
  6. UUID extraction — if a response gives CAMS... instead of UUID, extract UUID from the fifeUrl in the response URL: /image/{UUID}?....
  7. Cascade on regen — regenerating an image auto-clears downstream video + upscale.
  8. REGENERATE vs GENERATE: GENERATE_* skips if already COMPLETED. REGENERATE_* always runs (clears + regenerates).
  9. Image Material required — every project needs a material field (e.g. realistic, 3d_pixar, anime). List available: GET /api/materials.
  10. Server handles throttling — worker enforces max 5 concurrent requests + 10s cooldown. Submit ALL requests via /batch; do NOT manually stagger or loop.
  11. Video prompts use sub-clip timing — structure 8s video as time segments: 0-3s: [action]. 3-6s: [action]. 6-8s: [action].
  12. Character dialogue in sub-clips — embed speech in quotes: Luna says "Goodnight." Max 10-15 words per character per 2-3s segment.
  13. Scenes are mutable — use PATCH /api/scenes/{sid} to update prompt, video_prompt, narrator_text, character_names after creation. Don't delete and recreate — patch instead.
  14. Fact-check before scripting — ALWAYS research events via web search before writing project stories, scene prompts, or narrator text. Facts (events, dates, names, operations, outcomes) MUST match real sources. Editorial opinion and analysis are allowed but must be framed as such. Never invent events, operation names, or statistics.
  15. Real-people bypass — when characters are based on real famous people (politicians, celebrities, military leaders), NEVER use their real name as entity name or anywhere in description, image_prompt, prompt, or video_prompt. AI image generators reject known public figures. Instead: (a) use a role-based alias as entity name (e.g. "Tổng Tư Lệnh" not "Trump", "Thủ Tướng Sắt" not "Netanyahu"), (b) describe physical appearance only — distinctive hair, face shape, build, clothing style — without naming who it is, (c) narrator_text may use real titles/roles for storytelling but real names never flow into image/video generation. Keep a real_reference mapping in the project plan file (.omc/research/) for internal tracking.
  16. Review before upscale — ALWAYS run /fk-review-video (light mode) after video generation, before upscaling. Scenes scoring < 7.5 get video_prompt updated from review errors, then regen video. Max 2 review-regen cycles.
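
Rules 1 and 6 can be checked mechanically. The sketch below validates a media ID against the xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx shape and pulls a UUID out of a fifeUrl of the form /image/{UUID}?.... The function names are hypothetical, and treating each x as a hexadecimal digit is an assumption; the source only gives the 8-4-4-4-12 layout. Neither helper calls the API.

```shell
# Assumption: every "x" in the documented format is a hex digit.
UUID_RE='[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}'

# is_uuid VALUE: exits 0 only if VALUE is exactly one UUID.
is_uuid() {
  printf '%s\n' "$1" | grep -Eq "^${UUID_RE}\$"
}

# uuid_from_fife_url URL: prints the UUID that follows /image/ in URL,
# or prints nothing and exits non-zero if there is none.
uuid_from_fife_url() {
  uuid=$(printf '%s\n' "$1" | sed -nE "s#.*/image/(${UUID_RE}).*#\1#p")
  [ -n "$uuid" ] && printf '%s\n' "$uuid"
}
```

For example, is_uuid "CAMSabc" fails, which is the cue to fall back to uuid_from_fife_url on the response's fifeUrl.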

Pipeline Order

0. Research          /fk-research "topic" (fact-check via web search, save to .omc/research/)
1. Health check      GET  /health → extension_connected: true
2. Create project    POST /api/projects (with entities + material, story from research)
3. Create video      POST /api/videos
4. Create scenes     POST /api/scenes (with character_names, chain_type)
5. Gen ref images    POST /api/requests/batch → poll /batch-status?project_id=<PID>
                     Wait for done=true, verify all entities have media_id
6. Gen scene images  POST /api/requests/batch → poll /batch-status?video_id=<VID>
                     Wait for done=true, verify image_media_id = UUID
7. Gen videos        POST /api/requests/batch → poll /batch-status?video_id=<VID>
                     Wait for done=true (videos take 2-5 min each)
7.5 Review videos    POST /api/videos/{vid}/review?mode=light (Claude Vision quality check)
                     Pass: score >= 7.5 | Fail: update video_prompt → regen → re-review (max 2 cycles)
8. (Optional) 4K     POST /api/requests/batch (TIER_TWO only)
9. (Optional) TTS    Create voice template → POST /api/videos/{vid}/narrate
10. Concat           ffmpeg normalize + concat
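
The review gate in step 7.5 reduces to a small decision: pass at a score of 7.5 or higher, otherwise regenerate until two review-regen cycles have been used. The sketch below encodes only that decision. The function name and the "stop" outcome are assumptions, since the source does not say what happens once both cycles are spent.

```shell
# review_decision SCORE CYCLES_DONE
# Prints: pass  (score >= 7.5)
#         regen (score < 7.5 and fewer than 2 review-regen cycles used)
#         stop  (score < 7.5 and both cycles used; this outcome is an assumption)
review_decision() {
  score=$1
  cycles_done=$2
  # awk handles the floating-point comparison that plain sh cannot.
  if awk -v s="$score" 'BEGIN { exit !(s >= 7.5) }'; then
    echo pass
  elif [ "$cycles_done" -lt 2 ]; then
    echo regen
  else
    echo stop
  fi
}
```

A scene that scores 6.9 on its first review yields regen; the same score after two cycles yields stop, leaving the next step to the operator.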

Batch API

Submit N requests at once (server throttles automatically — max 5 concurrent, 10s cooldown):

curl -X POST http://127.0.0.1:8100/api/requests/batch \
  -H "Content-Type: application/json" \
  -d '{"requests": [{"type": "...", "scene_id": "...", "project_id": "...", "video_id": "...", "orientation": "VERTICAL"}, ...]}'
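
To keep every request in one submission rather than looping over API calls, the body can be assembled first and posted once. This is a sketch only: batch_payload is a hypothetical helper, it uses just the five fields shown in the example above, and it assumes the IDs contain no characters that need JSON escaping (true for UUIDs).

```shell
# batch_payload TYPE PROJECT_ID VIDEO_ID ORIENTATION SCENE_ID...
# Prints one JSON body with a request object per scene ID.
batch_payload() {
  req_type=$1
  project=$2
  video=$3
  orientation=$4
  shift 4
  sep=''
  printf '{"requests": ['
  for scene in "$@"; do
    printf '%s{"type": "%s", "scene_id": "%s", "project_id": "%s", "video_id": "%s", "orientation": "%s"}' \
      "$sep" "$req_type" "$scene" "$project" "$video" "$orientation"
    sep=', '
  done
  printf ']}\n'
}

# Intended use (single POST, requires the local server):
#   batch_payload GENERATE_IMAGE "<PID>" "<VID>" VERTICAL "<SID1>" "<SID2>" \
#     | curl -X POST http://127.0.0.1:8100/api/requests/batch \
#         -H "Content-Type: application/json" -d @-
```

The loop here only builds text locally; the server still receives a single batch and applies its own throttling.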

Poll aggregate status:

curl -s "http://127.0.0.1:8100/api/requests/batch-status?video_id=<VID>&type=GENERATE_IMAGE"
# Returns: {"total": 40, "pending": 30, "processing": 5, "completed": 5, "failed": 0, "done": false}
# When "done": true → all requests have left the queue (completed or failed)
# When "all_succeeded": true → every request completed successfully
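
The two flags combine into three states, which the following sketch makes explicit for a single poll response. batch_state is a hypothetical name, and the pattern matching assumes done and all_succeeded are top-level JSON booleans in the same response; the sample above shows only done.

```shell
# batch_state: reads one batch-status JSON response on stdin and prints
#   succeeded              all_succeeded is true
#   finished_with_failures done is true but all_succeeded is not
#   running                requests are still pending or processing
batch_state() {
  body=$(cat)
  if printf '%s' "$body" | grep -Eq '"all_succeeded"[[:space:]]*:[[:space:]]*true'; then
    echo succeeded
  elif printf '%s' "$body" | grep -Eq '"done"[[:space:]]*:[[:space:]]*true'; then
    echo finished_with_failures
  else
    echo running
  fi
}

# Intended use (requires the local server):
#   curl -s "http://127.0.0.1:8100/api/requests/batch-status?video_id=<VID>" | batch_state
```

Only the succeeded state should advance the pipeline; finished_with_failures means at least one request needs attention before the next step.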

For full API reference, workflow recipes, and video prompt guidelines, see CLAUDE.md.

Skills

This project has reusable skills in skills/. When the user says /fk-<name>, read skills/fk-<name>.md and follow the instructions inside.

| Skill | Purpose |
|---|---|
| /fk-add-material | Image material system. |
| /fk-brand-logo | Apply the channel brand logo to video and thumbnails. |
| /fk-camera-guide | Camera guide for cinematic video prompts. |
| /fk-concat-fit-narrator | Trim each scene video to fit its TTS narrator duration, then concatenate into a final video. |
| /fk-concat | Download and concatenate all scene videos into a single video with optional TTS narration. |
| /fk-create-project | Create a new Google Flow video project. |
| /fk-creative-mix | Creative video mixing: combine techniques for cinematic results. |
| /fk-dashboard | Show live GLA status in Claude Code statusline. |
| /fk-fix-uuids | Find and fix any non-UUID media_ids (CAMS... format) across all scenes and entities. |
| /fk-gen-chain-videos | Generate videos with automatic scene chaining (start+end frame transitions). |
| /fk-gen-images | Generate scene images for all scenes in a video. |
| /fk-gen-music | Generate music via Suno. |
| /fk-gen-narrator | Generate narrator text and TTS for all scenes. |
| /fk-gen-refs | Generate reference images for all entities in a project. |
| /fk-gen-tts-template | Generate a voice template. |
| /fk-gen-tts | Generate TTS narration. |
| /fk-gen-videos | Generate videos for all scenes in a video. |
| /fk-insert-scene | Insert new scene(s) into an existing video chain for multi-angle shots, cutaways, or close-ups. |
| /fk-research | Fact-check and research real events via web search before scripting documentary content. |
| /fk-review-video | Review AI-generated scene videos for quality using Claude Vision. |
| /fk-status | Show full status dashboard for a project. |
| /fk-thumbnail-guide | YouTube thumbnail guide: hook-worthy design rules. |
| /fk-thumbnail | Generate 4 YouTube-optimized thumbnail variants for a project video. |
| /fk-youtube-seo | Generate SEO-optimized YouTube metadata. |
| /fk-youtube-upload | Upload video to YouTube (Shorts and long-form). |