mflux-debugger #275

filipstrand · 2025-11-11T15:14:49Z

No description provided.

filipstrand · 2025-11-20T10:34:48Z

@anthonywu This PR is obviously huge and I'm not expecting a full review here. However, I was thinking about one thing you added before, which was the prompt file support. That feels like a really natural way to work with this model especially.

Some background in case you haven't played around with this model yet:

The FIBO mode model works with structured JSON prompts that can be quite large. For example, one of our tests uses the following:

{
  "short_description": "A hyper-detailed, ultra-fluffy owl sitting in the trees at night, looking directly at the camera with wide, adorable, expressive eyes. Its feathers are soft and voluminous, catching the cool moonlight with subtle silver highlights. The owl's gaze is curious and full of charm, giving it a whimsical, storybook-like personality.",
  "objects": [
    {
      "description": "An adorable, fluffy owl with large, expressive eyes and soft, voluminous feathers. Its plumage is a mix of warm browns, grays, and subtle silver highlights from the moonlight.",
      "location": "center",
      "relationship": "The owl is the sole subject, perched comfortably within its environment.",
      "relative_size": "large within frame",
      "shape_and_color": "Round head, large eyes, bulky body, predominantly brown and grey with silver accents.",
      "texture": "Extremely soft, fluffy, and detailed feathers, giving a plush toy-like appearance.",
      "appearance_details": "The eyes are wide, dark, and reflective, conveying a sense of wonder and curiosity. The beak is small and light-colored, almost hidden by the feathers. Subtle silver highlights catch the moonlight on its feathers.",
      "orientation": "upright, facing forward"
    }
  ],
  "background_setting": "A dark, nocturnal forest setting with blurred trees and foliage, illuminated by a soft, cool moonlight. The background is out of focus, emphasizing the owl.",
  "lighting": {
    "conditions": "moonlight",
    "direction": "backlit and side-lit from the left",
    "shadows": "soft, diffused shadows on the right side of the owl and within the background foliage, indicating a single light source."
  },
  "aesthetics": {
    "composition": "centered, portrait composition",
    "color_scheme": "cool blues and silvers from the moonlight contrasting with warm browns and grays of the owl and forest.",
    "mood_atmosphere": "mysterious, enchanting, whimsical, and serene.",
    "aesthetic_score": "very high",
    "preference_score": "very high"
  },
  "photographic_characteristics": {
    "depth_of_field": "shallow",
    "focus": "sharp focus on the owl's face and eyes, with a soft blur in the background.",
    "camera_angle": "eye-level",
    "lens_focal_length": "portrait lens (e.g., 50mm-85mm)"
  },
  "style_medium": "digital illustration",
  "text_render": [],
  "context": "A whimsical character illustration, possibly for a children's book, animated film, or fantasy art collection.",
  "artistic_style": "fantasy, illustrative, detailed"
}

When the user wants to generate an image they currently have 3 options:

a) Fully manual edit: {complex user JSON prompt} --> {JSON prompt for model}
b) Go through the fine-tuned FIBO VLM (GENERATE): {simple user prompt} --> {VLM expands to structured JSON} --> {JSON prompt for model}
c) Go through the fine-tuned FIBO VLM for edits (REFINE): {complex JSON + simple user edit prompt} --> {VLM expands to structured JSON} --> {JSON prompt for model}

(see VLM tests for examples)

I can imagine the typical workflow being mostly to start with b) and then, once you are semi-happy with the image, make smaller tweaks using a) directly, or c), and iterate. In these cases, I think it is very nice to have the full JSON saved to disk. Maybe for this model, we should always output the JSON prompt that was used?

This was not really a question, but just came to think of this as I'm wrapping up the core implementation and started to think about the UI/UX for this model... If you have other ideas/opinions, let me know :)

filipstrand · 2025-11-20T11:01:11Z

Current design proposal: always have mflux-generate-fibo save a prompt.json during generation. It can then be refined later, either manually or "VLM-style", like so:

# Step 1: Generate initial image
mflux-generate-fibo --prompt "dragon" --output dragon.png
# Creates: dragon.png + dragon.prompt.json

# Step 2: Refine the JSON prompt  (manually, or as here, with the VLM-based refine command)
mflux-refine-fibo \
  --prompt-file dragon.prompt.json \
  --instructions "make the dragon blue and add wings" \
  --output dragon_refined.prompt.json

# Step 3: Generate image with refined prompt
mflux-generate-fibo \
  --prompt-file dragon_refined.prompt.json \
  --output dragon_refined.png

In this proposal, mflux-refine-fibo only works with text (text in -> text out). Of course, we could have it generate a new image, but maybe it is better to keep it more limited in scope and let the generate command be the only one that actually generates the image. It feels a bit more composable this way.
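The "always save the prompt next to the image" idea can be sketched in a few lines of Python. This is only an illustration of the naming shown in the example (dragon.png + dragon.prompt.json); the helper names `prompt_sidecar_path` and `save_prompt_sidecar` are hypothetical and not part of mflux.

```python
import json
from pathlib import Path


def prompt_sidecar_path(image_path) -> Path:
    """Map an image path to its prompt file, e.g. dragon.png -> dragon.prompt.json."""
    image_path = Path(image_path)
    return image_path.with_name(image_path.stem + ".prompt.json")


def save_prompt_sidecar(image_path, structured_prompt: dict) -> Path:
    """Write the JSON prompt that was used for an image next to that image."""
    sidecar = prompt_sidecar_path(image_path)
    sidecar.write_text(
        json.dumps(structured_prompt, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
    return sidecar
```

Because the sidecar is plain JSON, it can be edited by hand (option a) or passed to a refine step (option c) without any extra conversion.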

anthonywu · 2025-11-20T15:36:38Z

Lots of great new work. I'll take a look soon.

anthonywu · 2025-11-20T19:12:43Z

The FIBO mode model works with structured JSON prompts

This is exactly what dspy (and pydantic models inside) can help with! Have you digested #264? I think super relevant.
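As a rough illustration of what typed validation buys here, the sketch below checks a structured prompt against a fixed set of top-level keys using only the standard library. It is a stand-in for a pydantic model, not the dspy or pydantic API, and the key set is simply copied from the owl example earlier in this thread; treating it as the full schema is an assumption.

```python
import json

# Top-level keys copied from the owl example in this thread.
# Assumption: this is illustrative, not an official FIBO schema.
EXPECTED_KEYS = {
    "short_description", "objects", "background_setting", "lighting",
    "aesthetics", "photographic_characteristics", "style_medium",
    "text_render", "context", "artistic_style",
}


def check_structured_prompt(raw: str) -> dict:
    """Parse a JSON prompt and fail early on missing or unexpected top-level keys."""
    prompt = json.loads(raw)
    if not isinstance(prompt, dict):
        raise ValueError("structured prompt must be a JSON object")
    missing = EXPECTED_KEYS - set(prompt)
    unknown = set(prompt) - EXPECTED_KEYS
    if missing or unknown:
        raise ValueError(f"missing={sorted(missing)} unknown={sorted(unknown)}")
    return prompt
```

A real pydantic model would also validate the nested objects and value types, which this sketch deliberately leaves out.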

anthonywu · 2025-11-20T19:21:33Z

# Step 1: Generate initial image
mflux-generate-fibo --prompt "dragon" --output dragon.png
# Creates: dragon.png + dragon.prompt.json

# Step 2: Refine the JSON prompt  (manually, or as here, with the VLM-based refine command)
mflux-refine-fibo \
  --prompt-file dragon.prompt.json \
  --instructions "make the dragon blue and add wings" \
  --output dragon_refined.prompt.json

# Step 3: Generate image with refined prompt
mflux-generate-fibo \
  --prompt-file dragon_refined.prompt.json \
  --output dragon_refined.png

This smells like a good fit for a fluent interface at the SDK level or a | interface at the CLI level.

(
    FiboPipeline.from_prompt("dragon")
    .generate("dragon.png")  # Step 1: mflux-generate-fibo --prompt "dragon" --output dragon.png
    .refine(instructions="make the dragon blue and add wings",
            output_prompt="dragon_refined.prompt.json")  # Step 2: mflux-refine-fibo ...
    .generate("dragon_refined.png")  # Step 3: mflux-generate-fibo --prompt-file dragon_refined.prompt.json ...
)

(
    FiboPipeline.from_prompt_file("existing.prompt.json")
    .refine(instructions="make the dragon blue and add wings", output_prompt="dragon_refined.prompt.json")
    .generate("dragon_refined.png")
)

This is going to be faster than a shell-based method, where each invocation loads the model from disk. That's not prohibitive, but it feels wrong to load the same weights over and over.
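To make the "load the weights once" point concrete, here is a minimal sketch of how such a fluent pipeline could hold one model instance across steps. `FiboPipeline` is the hypothetical name from the snippet above; `StubFiboModel` and all method bodies are placeholders that do no real inference, and the `output_prompt` argument from the snippet is omitted for brevity.

```python
class StubFiboModel:
    """Placeholder for the real weights; counts how many times it is constructed."""

    load_count = 0

    def __init__(self):
        StubFiboModel.load_count += 1

    def expand(self, text):
        return {"short_description": text}

    def refine(self, prompt, instructions):
        description = f"{prompt['short_description']} ({instructions})"
        return {**prompt, "short_description": description}

    def render(self, prompt, output):
        # A real implementation would write an image; the stub just echoes the path.
        return output


class FiboPipeline:
    def __init__(self, prompt=None):
        self.prompt = prompt
        self.rendered = []
        self._model = None

    @property
    def model(self):
        # Lazily load once, then reuse for every later step in the chain.
        if self._model is None:
            self._model = StubFiboModel()
        return self._model

    @classmethod
    def from_prompt(cls, text):
        pipeline = cls()
        pipeline.prompt = pipeline.model.expand(text)
        return pipeline

    def refine(self, instructions):
        self.prompt = self.model.refine(self.prompt, instructions)
        return self  # returning self is what makes the calls chainable

    def generate(self, output):
        self.rendered.append(self.model.render(self.prompt, output))
        return self
```

Chaining `from_prompt(...).generate(...).refine(...).generate(...)` on this sketch constructs the stub model exactly once, which is the property the shell-based flow lacks.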

echo "dragon" \
  | fibo-prompt \
  | tee dragon.prompt.json \
  | fibo-refine --instructions "make the dragon blue and add wings" \
  | tee dragon_refined.prompt.json \
  | fibo-render --output dragon_refined.png
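Each stage in a pipe like this would be a filter that reads a JSON prompt on stdin and writes a JSON prompt on stdout. A minimal sketch of that contract follows; `run_json_filter` is a hypothetical helper, and the fibo-* commands above are prototype names rather than existing tools.

```python
import json
import sys


def run_json_filter(transform, stdin=None, stdout=None):
    """Read one JSON prompt from stdin, apply transform, write the result to stdout."""
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout if stdout is None else stdout
    prompt = json.load(stdin)
    json.dump(transform(prompt), stdout, indent=2, ensure_ascii=False)
    stdout.write("\n")
```

Since every stage emits the complete prompt, `tee` can capture an intermediate prompt file at any point without changing what the next stage receives.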

I expect that showing these prototyped usages to Gemini 3 or Composer or Codex 5.1 can get you really far.

anthonywu

Looks like a big change that you're able to isolate into sub-modules. So the risk to the pre-existing project appears minimal.

Trust the linter, trust the tests, and you should ship whenever you feel comfortable. No blocking concerns on my part!

anthonywu · 2025-11-20T21:14:12Z

tests/image_generation/test_generate_image_fibo.py

+            max_tokens=4096,
+            stop=["<|im_end|>", "<|end_of_text|>"],
+            task="refine",
+            structured_prompt='{"short_description":"A hyper-detailed, ultra-fluffy owl sitting in the trees at night, looking directly at the camera with wide, adorable, expressive eyes. Its feathers are soft and voluminous, catching the cool moonlight with subtle silver highlights. The owl\'s gaze is curious and full of charm, giving it a whimsical, storybook-like personality.","objects":[{"description":"An adorable, fluffy owl with large, expressive eyes and soft, voluminous feathers. Its plumage is a mix of warm browns, grays, and subtle silver highlights from the moonlight.","location":"center","relationship":"The owl is the sole subject, perched comfortably within its environment.","relative_size":"large within frame","shape_and_color":"Round head, large eyes, bulky body, predominantly brown and grey with silver accents.","texture":"Extremely soft, fluffy, and detailed feathers, giving a plush toy-like appearance.","appearance_details":"The eyes are wide, dark, and reflective, conveying a sense of wonder and curiosity. The beak is small and light-colored, almost hidden by the feathers. Subtle silver highlights catch the moonlight on its feathers.","orientation":"upright, facing forward"}],"background_setting":"A dark, nocturnal forest setting with blurred trees and foliage, illuminated by a soft, cool moonlight. 
The background is out of focus, emphasizing the owl.","lighting":{"conditions":"moonlight","direction":"backlit and side-lit from the left","shadows":"soft, diffused shadows on the right side of the owl and within the background foliage, indicating a single light source."},"aesthetics":{"composition":"centered, portrait composition","color_scheme":"cool blues and silvers from the moonlight contrasting with warm browns and grays of the owl and forest.","mood_atmosphere":"mysterious, enchanting, whimsical, and serene.","aesthetic_score":"very high","preference_score":"very high"},"photographic_characteristics":{"depth_of_field":"shallow","focus":"sharp focus on the owl\'s face and eyes, with a soft blur in the background.","camera_angle":"eye-level","lens_focal_length":"portrait lens (e.g., 50mm-85mm)"},"style_medium":"digital illustration","text_render":[],"context":"A whimsical character illustration, possibly for a children\'s book, animated film, or fantasy art collection.","artistic_style":"fantasy, illustrative, detailed"}',


Style-wise, I'd put these long prompt structures in a test data file and read it in, so future edit diffs look clean.

Agree, that would be cleaner! I will do some slight refactoring of the tests once everything is in place
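One possible shape for that refactoring, as a sketch: keep the prompt as a pretty-printed JSON fixture and have the test read it back into the compact string form used in the diff above. The helper name `load_structured_prompt` and the fixture location are assumptions, not the project's actual layout.

```python
import json
from pathlib import Path


def load_structured_prompt(path) -> str:
    """Read a JSON fixture and return it as a compact single-line JSON string."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    # Compact separators match the inline string style used in the test diff.
    return json.dumps(data, separators=(",", ":"), ensure_ascii=False)
```

The fixture stays readable and diff-friendly on disk, while the test still passes a single-line string as `structured_prompt`.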

filipstrand · 2025-11-21T09:55:15Z

This is exactly what dspy (and pydantic models inside) can help with! Have you digested #264? I think super relevant.

Great reminder, I haven't yet but will do it soon!

- feat: add mflux-debugger with rules, dependencies, and 0.14.2dev0 bump (c0ebf242) - clean (203b3fbf)

filipstrand self-assigned this Nov 11, 2025

filipstrand force-pushed the mflux-debugger branch from 0557b44 to 455a2d0 Compare November 11, 2025 15:29

filipstrand changed the title ~~Mflux debugger~~ mflux-debugger Nov 11, 2025

filipstrand force-pushed the mflux-debugger branch 4 times, most recently from 562b9f8 to 5e64c82 Compare November 13, 2025 00:12

filipstrand changed the title ~~mflux-debugger~~ mflux-debugger + Bria FIBO Nov 13, 2025

filipstrand marked this pull request as draft November 14, 2025 10:01

filipstrand force-pushed the mflux-debugger branch 2 times, most recently from 64b2ef2 to 09d6187 Compare November 18, 2025 17:27

anthonywu reviewed Nov 20, 2025

filipstrand force-pushed the mflux-debugger branch from fbcd955 to 3c1bd03 Compare November 21, 2025 13:16

filipstrand changed the title ~~mflux-debugger + Bria FIBO~~ mflux-debugger Nov 21, 2025

filipstrand force-pushed the mflux-debugger branch from 3c1bd03 to c03ac60 Compare November 21, 2025 13:28

anthonywu mentioned this pull request Nov 21, 2025

Add Bria FIBO support #279

Merged

filipstrand force-pushed the mflux-debugger branch 3 times, most recently from cf86b32 to b01427d Compare November 28, 2025 11:44

filipstrand mentioned this pull request Nov 28, 2025

Add support for Z-Image #281

Closed

filipstrand force-pushed the mflux-debugger branch 2 times, most recently from 7bc05f2 to 2ac3af2 Compare December 5, 2025 11:30

filipstrand force-pushed the mflux-debugger branch from 2ac3af2 to de8c131 Compare January 2, 2026 03:11

mflux-debugger

b311a8d

- feat: add mflux-debugger with rules, dependencies, and 0.14.2dev0 bump (c0ebf242) - clean (203b3fbf)

filipstrand force-pushed the mflux-debugger branch from ffeadfc to b311a8d Compare January 14, 2026 00:59
