[Docs] Add Developer Guide: How to Hack Any Transformers Model #33979

Open
wants to merge 4 commits into
base: main

@MagnusS0 MagnusS0 commented Oct 5, 2024

What does this PR do?

This PR adds a new developer guide titled "How to Hack Any Transformers Model" to the docs. The guide shows how to modify existing models, using the Segment Anything Model (SAM) as an example. It also encourages community contributions by inviting others to share their own hacks.
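The guide's own code is not reproduced in this description. As a rough illustration of the kind of modification the linked issue asks for (separate q, k, and v projections in place of one fused projection), here is a minimal PyTorch sketch. It assumes a toy attention module rather than the actual SAM classes; `FusedAttention`, `SplitAttention`, and `load_fused` are hypothetical names chosen for this example and are not part of Transformers.

```python
import torch
from torch import nn


class FusedAttention(nn.Module):
    """Toy stand-in (hypothetical) for an attention layer with one fused qkv projection."""

    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def _attend(self, q, k, v):
        b, n, _ = q.shape
        # (batch, tokens, dim) -> (batch, heads, tokens, head_dim)
        q, k, v = (
            t.reshape(b, n, self.num_heads, self.head_dim).transpose(1, 2)
            for t in (q, k, v)
        )
        weights = (q @ k.transpose(-2, -1) * self.head_dim**-0.5).softmax(dim=-1)
        out = (weights @ v).transpose(1, 2).reshape(b, n, -1)
        return self.proj(out)

    def forward(self, x):
        # One matmul, then split the output into thirds: q, k, v.
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        return self._attend(q, k, v)


class SplitAttention(FusedAttention):
    """Same attention math, but with separate q, k, v linear layers."""

    def __init__(self, dim: int, num_heads: int):
        super().__init__(dim, num_heads)
        del self.qkv  # drop the fused projection
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def load_fused(self, fused: FusedAttention) -> None:
        """Copy weights from a fused module, splitting qkv row-wise into q, k, v."""
        state = fused.state_dict()
        weights = state.pop("qkv.weight").chunk(3, dim=0)
        biases = state.pop("qkv.bias").chunk(3, dim=0)
        for name, w, b in zip(("q", "k", "v"), weights, biases):
            state[f"{name}.weight"] = w
            state[f"{name}.bias"] = b
        self.load_state_dict(state)

    def forward(self, x):
        return self._attend(self.q(x), self.k(x), self.v(x))


# Usage: the split module should reproduce the fused module's output.
torch.manual_seed(0)
fused = FusedAttention(dim=8, num_heads=2).eval()
split = SplitAttention(dim=8, num_heads=2).eval()
split.load_fused(fused)
x = torch.randn(1, 4, 8)
with torch.no_grad():
    outputs_match = torch.allclose(fused(x), split(x), atol=1e-5)
```

With the projections exposed as separate `q`, `k`, and `v` submodules, tooling that selects layers by name can address each projection individually, which is presumably what makes this kind of split useful for adapter methods.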

Changes

  • Added a new developer guide at docs/source/en/how_to_hack_models.md
  • Updated docs/source/en/_toctree.yml to include the new guide in the Developer Guides section.

Fixes #33928

Before submitting

Who can review?

Tagging @ArthurZucker as we discussed this in issue #33928 and the code example was provided by him. Let me know if this is what you had in mind 🤗

Collaborator

@ArthurZucker ArthurZucker left a comment

Nice! I think we can have a separate doc, with SAM as the example!
This could go in the same section as https://huggingface.co/docs/transformers/fast_tokenizers (developer guides!)

@MagnusS0
Author

MagnusS0 commented Oct 5, 2024

Ahh, yeah, that makes sense. Then I can extend the example and show how it can also be used with, e.g., PEFT! WDYT?

@ArthurZucker
Collaborator

Of course! And if people have examples, such as adding SDPA (not here for SAM), or other good hacks, they can go there! Let's probably call for contributions! 🤗

@MagnusS0 MagnusS0 changed the title from "docs: add example for separating q, k, v projections in SAM attention" to "[Docs] Add Developer Guide: How to Hack Any Transformers Model" on Oct 5, 2024
@MagnusS0
Author

MagnusS0 commented Oct 5, 2024

Updated the PR: It now adds a new developer guide titled "How to Hack Any Transformers Model" 🚀
I've also updated the PR description and title accordingly.

Let me know your thoughts! Do you think we should make the guide more general regarding model hacking? Or do you (or anyone else) have any extra examples to add?

Labels: None yet
Projects: None yet
Development

Successfully merging this pull request may close these issues.

Separate q_proj, k_proj, and v_proj for Attention Layers in SAM
2 participants