Failed test case test_model_generate_images for janus model #44792

@kaixuanliu

Description

System Info

  • transformers version: 5.3.0.dev0
  • Platform: Linux-5.4.292-1.el8.elrepo.x86_64-x86_64-with-glibc2.35
  • Python version: 3.10.12
  • Huggingface_hub version: 1.7.1
  • Safetensors version: 0.5.3
  • Accelerate version: 1.12.0
  • Accelerate config: not found
  • DeepSpeed version: not installed
  • PyTorch version (accelerator?): 2.10.0+cu128 (CUDA)
  • Using distributed or parallel set-up in script?:
  • Using GPU in script?:
  • GPU type: NVIDIA A100 80GB PCIe

Who can help?

multimodal models: @zucchini-nlp

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

git clone https://github.com/huggingface/transformers.git
cd transformers
pip install -e .
export RUN_SLOW=1
pytest -rA tests/models/janus/test_modeling_janus.py::JanusIntegrationTest::test_model_generate_images

Expected behavior

The test case should pass. Even after applying the following patch:

--- a/src/transformers/models/janus/modeling_janus.py
+++ b/src/transformers/models/janus/modeling_janus.py
@@ -1285,7 +1285,7 @@ class JanusForConditionalGeneration(JanusPreTrainedModel, GenerationMixin):
         input_ids, model_kwargs = self._expand_inputs_for_generation(
             input_ids=input_ids,
             attention_mask=attention_mask,
-            expand_size=generation_config.num_return_sequences,
+            expand_size=generation_config.num_return_sequences or 1,
             **model_kwargs,
         )

@@ -1315,7 +1315,7 @@ class JanusForConditionalGeneration(JanusPreTrainedModel, GenerationMixin):
                 # batch_size should account for both conditional/unconditional input; hence multiplied by 2.
                 batch_size=batch_size * 2,
                 # we should have at least a cache len of seq_len + num_image_tokens.
-                max_cache_len=max(generation_config.max_length, num_image_tokens + seq_len),
+                max_cache_len=max(generation_config.max_length or 0, num_image_tokens + seq_len),
                 model_kwargs=model_kwargs,
             )

the test still fails.
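For context on why the patch adds the `or 1` / `or 0` guards: this is a minimal sketch (my assumption about the failure mode, not confirmed from the traceback) showing that when `GenerationConfig` fields such as `max_length` are left as `None`, Python 3's `max()` cannot compare `None` with an `int` and raises a `TypeError`. The helper name `safe_max_cache_len` is hypothetical, used only for illustration.

```python
# Hypothetical helper mirroring the guarded expression from the patch:
# treat a missing max_length (None) as 0 before comparing.
def safe_max_cache_len(max_length, num_image_tokens, seq_len):
    return max(max_length or 0, num_image_tokens + seq_len)

# Unguarded comparison, as in the original code when max_length is None:
try:
    max(None, 576 + 16)
except TypeError as exc:
    print(f"TypeError: {exc}")  # '>' not supported between int and NoneType

# Guarded version from the patch works regardless of None:
print(safe_max_cache_len(None, 576, 16))   # falls back to num_image_tokens + seq_len
print(safe_max_cache_len(1000, 576, 16))   # honors an explicit max_length
```

The `expand_size=... or 1` hunk guards against the same problem: `_expand_inputs_for_generation` presumably cannot repeat inputs by a `None` factor, so a missing `num_return_sequences` falls back to 1 (no expansion).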
