ASTRA/extract_ref/extracting_activations_llava_ref.py, lines 43 to 55 at commit 54df103:
```python
inputs = processor(text=[query], images=[reference_img], return_tensors="pt").to("cuda", torch.float16)
index_input_ids = inputs["input_ids"].shape[1]
generate_ids = model.generate(**inputs, do_sample=True, max_length=512, temperature=0.2, top_p=0.9,)
response = processor.decode(generate_ids[0, inputs["input_ids"].shape[1]:], skip_special_tokens=False)
inputs = processor(text=[query+response], images=[reference_img], return_tensors="pt").to("cuda", torch.float16)
output = model(**inputs, output_hidden_states=True)
img_activations = {}
for layer in layers:
    hidden_states = output.hidden_states[layer].detach().cpu()
    img_activations[layer] = torch.mean(hidden_states[0, index_input_ids+24*24:], dim=0)
```
The length of the image embedding should already be counted in `index_input_ids`, so there should be no need to add the extra 24 * 24 offset when slicing the hidden states to obtain the activations later in the code.
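Whether the offset is needed seems to depend on whether the processor already expands the image placeholder into the 24 * 24 patch tokens inside `input_ids`, or whether the model only inserts the patch embeddings internally, making the hidden states longer than `input_ids`. Below is a minimal sketch to check which case applies, reusing `model`, `processor`, `query`, and `reference_img` from the snippet above; the interpretation in the final comment is only illustrative:

```python
import torch

# Encode the query once and run a forward pass that returns hidden states.
inputs = processor(text=[query], images=[reference_img], return_tensors="pt").to("cuda", torch.float16)
with torch.no_grad():
    output = model(**inputs, output_hidden_states=True)

len_input_ids = inputs["input_ids"].shape[1]     # length of the tokenized prompt
len_hidden = output.hidden_states[-1].shape[1]   # sequence length actually seen by the transformer

print(f"input_ids length:     {len_input_ids}")
print(f"hidden_states length: {len_hidden}")
print(f"difference:           {len_hidden - len_input_ids}")

# If the difference is ~0, the image patches are already counted in input_ids and the
# +24*24 offset would skip real text tokens; if it is ~24*24, the offset is still needed.
```

If this behavior differs across transformers versions, a runtime check like this avoids hard-coding the offset.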