First of all, congrats on the amazing work! I have been working on something like this myself for the last week, so seeing this is a real relief: it confirms the idea works and there is already an implementation ready.
Hugging Face recently merged support for this into their main branch: https://huggingface.co/blog/poedator/4d-masks
With that, there should be no need for the custom model; it should just work. However, the input attention masks would need to be 4D, while yours are 2D. I'm sure the two are mathematically equivalent; I just need to understand how to build the 4D version.
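Roughly, what I have in mind for going from 2D to 4D is the sketch below. It is untested, and it assumes the 2D data is a per-token sequence-id layout (`segment_ids`), which may not match your format exactly; it also assumes the model accepts a `(batch, 1, seq_len, seq_len)` mask of 0s and 1s, as in the blog example.

```python
import torch

def segment_ids_to_4d_mask(segment_ids: torch.Tensor) -> torch.Tensor:
    """Turn per-token sequence ids (batch, seq) into a 4D mask (batch, 1, seq, seq).

    Assumption: tokens sharing an id belong to the same packed sequence.
    Position (i, j) is 1 only if token i may attend to token j, i.e. same
    packed sequence and j <= i (causal); 0 everywhere else.
    """
    batch, seq = segment_ids.shape
    same_segment = segment_ids.unsqueeze(2) == segment_ids.unsqueeze(1)  # (batch, seq, seq)
    causal = torch.tril(
        torch.ones(seq, seq, dtype=torch.bool, device=segment_ids.device)
    )
    return (same_segment & causal).unsqueeze(1).long()  # (batch, 1, seq, seq)

# Example: one row packing a length-3 and a length-2 sequence
segment_ids = torch.tensor([[1, 1, 1, 2, 2]])
independent_mask_4d = segment_ids_to_4d_mask(segment_ids)
```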
Here is an example of how to use 4D masks in HF.
The way you call the model is quite similar as well:
packed_outputs = custom_model(
input_ids=packed_tokens.to(device),
attention_mask=independent_mask.to(device),
position_ids=restart_positions.to(device),
return_dict=True,
output_hidden_states=True,
)
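Presumably, once the mask is 4D, essentially the same call would work on a stock model. Something like the sketch below is what I imagine (untested; the model name is just a placeholder, and I'm reusing `packed_tokens` / `restart_positions` from your snippet plus the `independent_mask_4d` built above):

```python
import torch
from transformers import AutoModelForCausalLM

device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf").to(device)

packed_outputs = model(
    input_ids=packed_tokens.to(device),
    attention_mask=independent_mask_4d.to(device),  # (batch, 1, seq, seq) instead of 2D
    position_ids=restart_positions.to(device),      # positions restart at 0 per packed sequence
    return_dict=True,
    output_hidden_states=True,
)
```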
Do you have any pointers on how I could reuse your data processor directly with Hugging Face 4D masks, so that I don't need a custom model and can train any model in HF that supports this API?