When the `TextTransformer` is instantiated in `_build_text_tower` (open_clip/src/open_clip/model.py, line 209 in 13b01ec), the keyword argument `correct_cls_mask` is not set, so it is left at its default of `False`.
Hence, when the additive attention mask is created (open_clip/src/open_clip/transformer.py, lines 1100 to 1103 in 13b01ec), the mask slot for the CLS token is always positioned at the beginning of the sequence:

```python
if self.cls_emb is not None:
    cls_valid = valid.new_ones(valid.size(0), 1)  # [B, 1]
    # cls mask pos at end if correct or front for incorrect legacy mode in existing CoCa weights
    valid = torch.cat([valid, cls_valid] if self.correct_cls_mask else [cls_valid, valid], 1)
```
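To make the layout concrete, here is a minimal standalone sketch of that assembly step on toy tensors (not the library code path itself), with `correct_cls_mask` left at its default of `False`:

```python
import torch

# Toy per-token validity mask: batch of 1, context length 4, last position is padding.
valid = torch.tensor([[True, True, True, False]])

correct_cls_mask = False  # the default reached via _build_text_tower

cls_valid = valid.new_ones(valid.size(0), 1)  # [B, 1], the slot intended for the CLS token
valid = torch.cat([valid, cls_valid] if correct_cls_mask else [cls_valid, valid], 1)

print(valid)
# tensor([[ True,  True,  True,  True, False]])
# The CLS slot is the first column, even though the CLS embedding itself
# is appended as the last token (see the next snippet).
```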
However, the CLS token itself is always appended at the end of the sequence (open_clip/src/open_clip/transformer.py, lines 1118 to 1121 in 13b01ec):

```python
# Optional class token (always appended ala CoCa)
if self.cls_emb is not None:
    x = torch.cat([x, _expand_token(self.cls_emb, x.size(0))], 1)
    seq_len += 1
```
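Putting the two snippets together on the same toy example (again a simplified sketch of an additive key-padding mask, not the exact mask construction in the library), the appended CLS token lines up with the column that was meant for the padding token, so it is the CLS token that ends up masked whenever padding is present:

```python
import torch

# Actual sequence order fed to attention: tokens, then the appended CLS embedding.
positions = ["tok0", "tok1", "tok2", "<pad>", "<cls>"]

# Mask columns as assembled above with correct_cls_mask=False (CLS slot first).
valid = torch.tensor([[True, True, True, True, False]])

# Simplified additive mask: 0 where a key may be attended to, -inf where it is ignored.
additive = torch.zeros(valid.shape, dtype=torch.float)
additive.masked_fill_(~valid, float("-inf"))

for name, bias in zip(positions, additive[0].tolist()):
    print(f"{name:>6}: {bias}")
#   tok0: 0.0
#   tok1: 0.0
#   tok2: 0.0
#  <pad>: 0.0    <- padding is still attended to
#  <cls>: -inf   <- the CLS token is masked out instead
```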
Am I correct, and could we make use of `correct_cls_mask` to solve this?
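If the analysis is right, passing `correct_cls_mask=True` from `_build_text_tower` would put the CLS slot at the end, matching the appended token; on the toy example above (again just a sketch, not the library code), the columns then line up:

```python
import torch

valid = torch.tensor([[True, True, True, False]])  # last position is padding
cls_valid = valid.new_ones(valid.size(0), 1)

# With correct_cls_mask=True the CLS slot is concatenated at the end ...
valid = torch.cat([valid, cls_valid], 1)
print(valid)
# tensor([[ True,  True,  True, False,  True]])
# ... which matches the actual layout [tok0, tok1, tok2, <pad>, <cls>]:
# now the padding column is masked and the CLS token is attended to.
```

The caveat, as the in-code comment notes, is that existing CoCa weights were trained with the legacy front position, so changing the behaviour for those checkpoints would presumably need to stay opt-in.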