Support attention masking to prevent attention across EOT tokens #206

Closed
@achalddave

Description

By default, openlm performs causal attention across all tokens in a sequence, even when the sequence contains multiple documents separated by EOT tokens. This might be related to #194. I think we can start by supporting this just for xformers, using BlockDiagonalCausalMask: https://facebookresearch.github.io/xformers/components/ops.html#xformers.ops.fmha.attn_bias.BlockDiagonalCausalMask
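
For reference, a minimal sketch of how such a mask could be built and passed to xformers' `memory_efficient_attention`, assuming the packed sequence has already been split into per-document lengths at EOT positions. The `doc_lens` values below are hypothetical, and the actual wiring into openlm's attention call would differ; the fp16/CUDA setup is just to match a backend that supports block-diagonal biases:

```python
import torch
from xformers.ops import memory_efficient_attention
from xformers.ops.fmha.attn_bias import BlockDiagonalCausalMask

num_heads, head_dim = 8, 64

# Three documents packed into one training sequence, split at EOT tokens.
doc_lens = [5, 3, 4]  # hypothetical per-document token counts
total = sum(doc_lens)

# Block-diagonal biases expect batch size 1 with all documents
# concatenated along the token dimension.
q = torch.randn(1, total, num_heads, head_dim, device="cuda", dtype=torch.float16)
k = torch.randn(1, total, num_heads, head_dim, device="cuda", dtype=torch.float16)
v = torch.randn(1, total, num_heads, head_dim, device="cuda", dtype=torch.float16)

# Causal attention within each document; no attention across EOT boundaries.
mask = BlockDiagonalCausalMask.from_seqlens(doc_lens)
out = memory_efficient_attention(q, k, v, attn_bias=mask)
print(out.shape)  # (1, total, num_heads, head_dim)
```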
cc @sagadre
