Description
System Info
- transformers version: 4.57.3
- Platform: Linux-6.6.105+-x86_64-with-glibc2.35
- Python version: 3.12.12
- Huggingface_hub version: 0.36.0
- Safetensors version: 0.7.0
- Accelerate version: 1.12.0
- Accelerate config: not found
- DeepSpeed version: not installed
- PyTorch version (accelerator?): 2.9.0+cu126 (CUDA)
- Tensorflow version (GPU?): 2.19.0 (True)
- Flax version (CPU?/GPU?/TPU?): 0.10.7 (gpu)
- Jax version: 0.7.2
- JaxLib version: 0.7.2
- Using distributed or parallel set-up in script?: No
- Using GPU in script?: No
- GPU type: Tesla T4
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Problem
Even though I set output_hidden_states=True on the top-level config and on both the vision and text sub-configs, the SiglipModel class does not seem to propagate this setting to its vision and text models: their outputs contain no hidden_states.
Reproduction Code
import torch
from PIL import Image
import requests
from transformers import AutoProcessor, AutoModel, AutoConfig
model_path = "google/siglip-base-patch16-224"
config = AutoConfig.from_pretrained(model_path)
config.output_hidden_states = True
config.vision_config.output_hidden_states = True
config.text_config.output_hidden_states = True
model = AutoModel.from_pretrained(
    "google/siglip-base-patch16-224",
    config=config,
)
processor = AutoProcessor.from_pretrained("google/siglip-base-patch16-224")
images = [Image.open(requests.get("http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw)]
texts = ["a photo of 2 cats"]
inputs = processor(text=texts, images=images, padding="max_length", return_tensors="pt")
with torch.no_grad():
    outputs = model.forward(**inputs)
print(outputs.text_model_output.keys())  # odict_keys(['last_hidden_state', 'pooler_output'])
Expected behavior
The text and vision model outputs should also include hidden_states, since I already set output_hidden_states=True in the config passed to the SiglipModel class.
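To make the expected behavior easy to check, here is a minimal sketch of a helper that reports whether each sub-model output carries hidden states. The function name `hidden_state_report` is hypothetical and not part of transformers. The only thing taken from the library is that the SiglipModel output exposes `text_model_output` and `vision_model_output`. The objects in the example are stand-ins built with `SimpleNamespace`, not real model outputs.

```python
from types import SimpleNamespace


def hidden_state_report(outputs):
    """Map each sub-model output name to its number of hidden-state
    tensors, or to None when hidden_states is missing or unset."""
    report = {}
    for name in ("text_model_output", "vision_model_output"):
        sub_output = getattr(outputs, name, None)
        states = getattr(sub_output, "hidden_states", None)
        report[name] = None if states is None else len(states)
    return report


# Stand-in objects (assumption: they only mimic the attribute layout).
# The text side lacks hidden_states, as in the report above; the vision
# side is given two placeholder entries purely for illustration.
fake = SimpleNamespace(
    text_model_output=SimpleNamespace(last_hidden_state=None, pooler_output=None),
    vision_model_output=SimpleNamespace(hidden_states=("embeddings", "layer_1")),
)
print(hidden_state_report(fake))
```

With the reproduction script, calling `hidden_state_report(outputs)` after the forward pass would be expected to give a non-None count for both entries once the config setting is honored.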