Feature/generation model by qchapp · Pull Request #41 · EPFLiGHT/MMIRAGE

qchapp · 2026-04-26T21:30:46Z

This pull request adds support for text-to-image generation using Diffusers models in the MMIRAGE library. It introduces a new image_gen processor type, complete with configuration, output variable definition, and documentation. The changes also include a sample configuration file, dependency management, and updates to the processor registry to enable seamless integration of image generation workflows.

Image generation support:

Added a new image_gen processor, including its configuration (DiffusersImageGenConfig), output variable definition (ImageGenOutputVar), and registration in the processor registry. This enables text-to-image generation using Diffusers pipelines, with support for various runtime and output options. [1] [2] [3]
Updated the processor registry and config utilities to lazily import the new image generation processor, ensuring efficient resource usage and modularity. [1] [2]

Configuration and documentation:

Added a sample YAML configuration (configs/config_mock_image_gen.yaml) demonstrating how to use the new image_gen processor for text-to-image generation, including parallel inference and output customization.
Expanded the README.md to document support for image generation models, provide configuration examples, and explain the new processor type and its parameters. [1] [2] [3]

Dependency management:

Added an optional image_gen dependency group to pyproject.toml for installing required libraries (diffusers, accelerate, safetensors).

Core pipeline updates:

Updated the Mapper class to accept and forward the shard_id parameter to processors, ensuring correct sharding behavior for image generation tasks. [1] [2]

Copilot

Pull request overview

Adds a new image_gen processor to MMIRAGE to enable text-to-image generation via Diffusers, plus the supporting config/docs and pipeline wiring needed to run it in shard processing.

Changes:

Introduces image_gen processor implementation + config/output-var types and registers it for lazy loading.
Adds optional dependency group ([image_gen]) and sample config/data for running an image generation pipeline.
Updates shard processing + mapper to support sharding context (shard_id) and to cast generated image-path columns to HF Image.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

File	Description
tests/mock_data_image_gen/data.jsonl	Adds mock prompt data for image generation examples/tests.
src/mmirage/shard_process.py	Forwards `shard_id` into the mapper and casts image-path outputs to HF `Image`.
src/mmirage/core/process/processors/image_gen/image_gen_processor.py	Implements Diffusers-backed image generation processor with path/PIL output modes.
src/mmirage/core/process/processors/image_gen/config.py	Adds `DiffusersImageGenConfig` and `ImageGenOutputVar` with template validation.
src/mmirage/core/process/processors/image_gen/__init__.py	Creates the new processor module package.
src/mmirage/core/process/mapper.py	Extends mapper to accept/forward `shard_id` into processors.
src/mmirage/core/process/base.py	Registers `image_gen` for lazy processor import.
src/mmirage/config/utils.py	Ensures `image_gen` config types are registered at config-load time.
pyproject.toml	Adds optional dependency group for Diffusers-based image generation.
configs/config_mock_image_gen.yaml	Provides a runnable example config for the new processor.
README.md	Documents image generation support, config example, and optional install.

Copilot

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.

Copilot

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

fabnemEPFL · 2026-04-27T14:13:59Z

            logger.info(f"✅ Successfully loaded processor of type {config.type}")

-            self.processors[config.type] = processor_cls(config)
+            self.processors[config.type] = processor_cls(config, shard_id=shard_id)


The shard_id is currently ignored by LLMProcessor; maybe make it use it as well? It seems to be used only for computing the render filename.

fabnemEPFL · 2026-04-27T14:34:22Z

+        for col in cols:
+            if col in ds.column_names:
+                ds = ds.map(
+                    _normalise_col, batched=True, fn_kwargs={"col": col}, desc=f"Normalising {col}",
+                    load_from_cache_file=False,
+                )
+                ds = ds.cast_column(col, HFImage())


could be a helper function that is also called for each split if ds is a DatasetDict -> avoids code duplication
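
A rough sketch of the helper suggested above. The name `apply_per_split` is hypothetical, and the only assumption about `datasets` is that `DatasetDict` is a `dict` subclass mapping split names to datasets. A plain list-based stand-in replaces the real normalise-and-cast step so the sketch runs without the library.

```python
from typing import Any, Callable


def apply_per_split(ds: Any, fn: Callable[[Any], Any]) -> Any:
    """Apply fn to one dataset, or to every split of a dict-like container."""
    if isinstance(ds, dict):
        # Rebuild the same container type so a DatasetDict would stay a DatasetDict.
        return type(ds)({split: fn(split_ds) for split, split_ds in ds.items()})
    return fn(ds)


def fake_cast(rows):
    # Stand-in for the per-dataset map + cast_column step shown in the diff:
    # falsy values (such as empty strings) become None.
    return [r if r else None for r in rows]


single = apply_per_split(["a.png", ""], fake_cast)
multi = apply_per_split({"train": ["a.png"], "test": [""]}, fake_cast)
```

With this shape, the loop over image columns would live in one function and be passed as `fn`, whether `ds` is a single dataset or a set of splits.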

fabnemEPFL · 2026-04-27T14:43:43Z

+    default_sampling_params: Dict[str, Any] = field(default_factory=dict)
+    parallel_inference: bool = True
+    parallel_chunk_size: Optional[int] = 4
+    output_dir: str = ".mmirage/generated_images"


makes a new folder .mmirage at the root of the local repository?

fabnemEPFL · 2026-04-27T14:45:10Z

+
+    def __post_init__(self) -> None:
+        """Validate optional parallelism settings."""
+        if self.parallel_chunk_size is not None and self.parallel_chunk_size <= 0:


It would be better to raise an error here; a nonpositive value should not be silently interpreted as None.
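
A minimal sketch of the stricter validation requested here. `ChunkingConfig` is a hypothetical stand-in for the real config class, and the error message wording is an assumption; the field names and defaults are taken from the diff.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChunkingConfig:  # hypothetical stand-in for DiffusersImageGenConfig
    parallel_inference: bool = True
    parallel_chunk_size: Optional[int] = 4

    def __post_init__(self) -> None:
        """Reject nonpositive chunk sizes instead of coercing them to None."""
        if self.parallel_chunk_size is not None and self.parallel_chunk_size <= 0:
            raise ValueError(
                "parallel_chunk_size must be a positive integer or None, "
                f"got {self.parallel_chunk_size}"
            )
```

Failing at config-load time surfaces a bad YAML value before any model is loaded.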

fabnemEPFL · 2026-04-27T14:48:37Z

+
+    def get_output_dir(self) -> str:
+        """Get normalized absolute output directory path."""
+        return os.path.abspath(os.path.expanduser(self.output_dir))


why not in the cache folder?

like DEFAULT_STATE_DIR = "~/.cache/MMIRAGE/state_dir" in src/mmore/config/loading.py
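
One way the default could move under the user cache, sketched under assumptions: `DEFAULT_IMAGE_DIR` and its path are hypothetical and only modelled on the `DEFAULT_STATE_DIR` convention quoted above. The path resolution mirrors `get_output_dir` from the diff.

```python
import os

# Hypothetical default, following the ~/.cache/MMIRAGE/... convention.
DEFAULT_IMAGE_DIR = "~/.cache/MMIRAGE/generated_images"


def resolve_output_dir(output_dir: str = DEFAULT_IMAGE_DIR) -> str:
    """Expand ~ and return an absolute path, as get_output_dir does."""
    return os.path.abspath(os.path.expanduser(output_dir))
```

A user-supplied `output_dir` would still override the default, so existing configs keep working.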

fabnemEPFL · 2026-04-27T16:27:11Z

+                os.unlink(tmp_path)
+            except OSError:
+                pass
+            raise


maybe have a more specific error
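
A sketch of what a more specific error might look like. `ImageSaveError` and `atomic_write` are hypothetical names, and the surrounding write logic is assumed, since the diff only shows the cleanup branch.

```python
import os
import tempfile


class ImageSaveError(RuntimeError):
    """Hypothetical error raised when a generated image cannot be written."""


def atomic_write(path: str, data: bytes) -> None:
    """Write data to a temp file, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        os.replace(tmp_path, path)  # atomic when both paths share a filesystem
    except OSError as exc:
        try:
            os.unlink(tmp_path)  # best-effort cleanup of the partial file
        except OSError:
            pass
        # Chain the original exception so the root cause stays visible.
        raise ImageSaveError(f"could not write image to {path}") from exc
```

Callers can then catch `ImageSaveError` specifically, while the chained `OSError` still carries the underlying reason.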

fabnemEPFL · 2026-04-27T16:35:49Z

+        updated: List[VariableEnvironment] = []
+        for local_index, (env, image) in enumerate(zip(chunk, images)):
+            sample_index = start_index + local_index
+            if output_var.output_mode == "pil":


having an enum for the output mode would make sense...
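
A minimal sketch of such an enum. The values `"path"` and `"pil"` come from the diff and the class name matches the `ImageOutputMode` used in a later hunk; the member names are assumptions.

```python
from enum import Enum


class ImageOutputMode(str, Enum):
    """Output modes for generated images; member names are illustrative."""

    PATH = "path"
    PIL = "pil"


# Parsing a raw config string; an unknown value raises ValueError.
mode = ImageOutputMode("pil")
```

Subclassing `str` keeps comparisons against raw strings from existing configs working, for example `ImageOutputMode.PATH == "path"`.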

fabnemEPFL · 2026-04-27T16:38:29Z

+                if negative_prompt is not None:
+                    call_kwargs["negative_prompt"] = negative_prompt
+                output = self._pipeline(**call_kwargs)
+                image = output.images[0]


is it guaranteed to work / that there is no more than 1 image?
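
A defensive variant of the indexing questioned above, as a sketch. `single_image` is a hypothetical helper; it assumes only that the pipeline output exposes an `images` sequence, as the diff does. Diffusers text-to-image pipelines can return several images per call (for instance when `num_images_per_prompt` is greater than 1), so an explicit check makes the one-image assumption visible.

```python
from types import SimpleNamespace
from typing import Any


def single_image(output: Any) -> Any:
    """Return the only image in a pipeline output, or raise if the count is not 1."""
    images = getattr(output, "images", None)
    if not images:
        raise ValueError("pipeline returned no images")
    if len(images) != 1:
        raise ValueError(f"expected exactly 1 image per prompt, got {len(images)}")
    return images[0]


# Stand-in for a pipeline output object, for illustration only.
image = single_image(SimpleNamespace(images=["img0"]))
```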

fabnemEPFL · 2026-04-27T16:44:14Z

+
+    def shutdown(self) -> None:
+        """Release pipeline references."""
+        self._pipeline = None


is it really enough to shutdown?
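
A sketch of a more thorough shutdown, under assumptions: `PipelineHolder` is a hypothetical stand-in for the processor, and whether this frees all GPU memory depends on no other references to the pipeline being held elsewhere. Dropping the reference alone does not return cached CUDA memory to the driver.

```python
import gc
from typing import Any


class PipelineHolder:  # hypothetical stand-in for the image_gen processor
    def __init__(self, pipeline: Any) -> None:
        self._pipeline = pipeline

    def shutdown(self) -> None:
        """Drop the pipeline, collect garbage, and release cached CUDA memory."""
        self._pipeline = None
        gc.collect()  # reclaim reference cycles that may still hold tensors
        try:
            import torch
        except ImportError:
            return  # nothing more to do without torch installed
        if torch.cuda.is_available():
            torch.cuda.empty_cache()  # return cached blocks to the driver
```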

Copilot

Pull request overview

Copilot reviewed 17 out of 17 changed files in this pull request and generated 12 comments.

+    image_path_var_names = {
+        v.name
+        for v in output_vars
+        if getattr(v, "output_mode", None) == "path"


+            mapper.shutdown()
+            logger.info("Processors shut down.")


+
+    def image_output_mode_hook(value: Any) -> ImageOutputMode:
+      if isinstance(value, ImageOutputMode):
+          return value
+      return ImageOutputMode(value)


+processing_params:
+  inputs:
+    - name: text
+      key: caption


+processing_params:
+  inputs:
+    - name: text
+      key: caption


+    def _normalise_col(batch: Dict[str, Any], col: str) -> Dict[str, Any]:
+        return {col: [v if v else None for v in batch[col]]}


+        proc = subprocess.Popen(
+            cmd,
+            stdout=subprocess.PIPE,
+            stderr=subprocess.STDOUT,
+            env=os.environ.copy(),


+            placement_device, generator_device, use_device_map = self._resolve_auto_device(args)
+
+            if use_device_map:
+                device_map = getattr(args, "device_map", None) or "balanced"


+using Diffusers pipelines. It can emit either saved image paths or in-memory
+PIL images.


+            ``"auto"`` distributes across all available GPUs when more than
+            one is present (via ``device_map='auto'``), or falls back to CPU.


qchapp added 8 commits March 27, 2026 19:07

first version for generation model

09d96bf

changed model name

9bc4ada

added parallel processing and tests

48a3e39

new changes to filename + test config

e374be4

trying something new

fb6454b

small fix

e3b1a82

should be ready for PR

c3f85ae

ready for PR and tested

b840515

qchapp self-assigned this Apr 26, 2026

Copilot AI review requested due to automatic review settings April 26, 2026 21:30

qchapp linked an issue Apr 26, 2026 that may be closed by this pull request

Use image generation models #24

Open

Copilot AI reviewed Apr 26, 2026

Comment thread src/mmirage/core/process/processors/image_gen/image_gen_processor.py Outdated

Comment thread src/mmirage/core/process/processors/image_gen/image_gen_processor.py Outdated

Comment thread src/mmirage/core/process/mapper.py

copilot propositions

3368040

qchapp requested a review from Copilot April 27, 2026 07:12

Copilot AI reviewed Apr 27, 2026

Update src/mmirage/core/process/processors/image_gen/image_gen_proces…

534d8c6

…sor.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update src/mmirage/shard_process.py

89a81a3

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

qchapp added 2 commits April 27, 2026 09:36

implemented copilot propositons

c46a01d

Merge branch 'feature/generation-model' of https://github.com/EPFLiGH…

763f756

…T/mmirage into feature/generation-model

qchapp requested a review from Copilot April 27, 2026 07:36

Copilot AI reviewed Apr 27, 2026

Comment thread src/mmirage/core/process/base.py Outdated

Comment thread src/mmirage/shard_process.py Outdated

again

544ca0c

qchapp requested a review from fabnemEPFL April 27, 2026 07:50

fabnemEPFL requested changes Apr 27, 2026

Update configs/config_mock_image_gen.yaml

8dfa161

Co-authored-by: fabnemEPFL <117652591+fabnemEPFL@users.noreply.github.com>

new backends to test on cluster

e0481ae

small change in validation of field

3205ece

Fix YAML enum parsing for image output mode

691f7d7

fabnemEPFL requested a review from Copilot May 12, 2026 17:06

Copilot AI reviewed May 12, 2026

3 participants