Add two flash-attn extensions as multi-outputs #19
Conversation
This is not quite working yet, but I want to make sure I'm on the right track. cc @jakirkham
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR.
@rongou, we'll need to give you access to the Openstack server CI. Could you open a PR at https://github.com/Quansight/open-gpu-server/pulls to add your GitHub username to the access/conda-forge-users.json file? See also step 2 of https://conda-forge.org/docs/maintainer/knowledge_base/#packages-that-require-a-gpu-or-long-running-builds for more info.
Thanks Rong! 🙏
Tried to put some rough initial thoughts together below. Hopefully that helps.
Happy to discuss further as needed 🙂
Co-authored-by: jakirkham <[email protected]>
Co-authored-by: Wei Ji <[email protected]>
Please make sure to add this under extra::

feedstock-name: flash-attn

Edit: To change …
…nda-forge-pinning 2024.10.21.14.45.36
Hi! This is the friendly automated conda-forge-linting service. I wanted to let you know that I linted all conda-recipes in your PR. Here's what I've got... For recipe/meta.yaml:
Ok, this is more or less structured as we've discussed, but I'm getting some errors when it tries to package the extensions; any ideas?
Thanks Rong! 🙏 Think there is more to do on this point: #19 (comment). Would start by copying that into the … The other outputs like … need …
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR.
The linter didn't like it.
Ah, that's because it should be …; ref: https://conda-forge.org/docs/maintainer/adding_pkgs/#feedstock-name
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR. For recipe/meta.yaml:
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR.
@weiji14 @carterbox @jakirkham I think this is ready. How do I get …
- cuda-cudart-dev  # [(cuda_compiler_version or "").startswith("12")]
- libcublas-dev    # [(cuda_compiler_version or "").startswith("12")]
- libcurand-dev    # [(cuda_compiler_version or "").startswith("12")]
- libcusolver-dev  # [(cuda_compiler_version or "").startswith("12")]
- libcusparse-dev  # [(cuda_compiler_version or "").startswith("12")]
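The selector on these lines is an ordinary Python expression evaluated per build variant. A minimal sketch of its logic (the helper name `is_cuda12` is ours; conda-build evaluates the expression directly, not through a function):

```python
# Sketch of the conda-build line selector
# `(cuda_compiler_version or "").startswith("12")`.
def is_cuda12(cuda_compiler_version):
    # `or ""` guards against None (CPU-only variants have no CUDA
    # compiler version), so .startswith() never raises AttributeError.
    return (cuda_compiler_version or "").startswith("12")

print(is_cuda12("12.0"))  # True: the dependency line is kept
print(is_cuda12("11.8"))  # False: dropped for CUDA 11 variants
print(is_cuda12(None))    # False: dropped for CPU-only variants
```

The `or ""` is the important part: without it, CPU-only builds where the variable is undefined or None would error instead of simply skipping the line.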
Note that all of these CUDA packages were here before to satisfy PyTorch's header requirements. The only new one is libcurand-dev. Perhaps this comes up because some of the new extensions use other bits from PyTorch that were not used before.
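One rough way to cross-check which `-dev` packages the extension sources actually pull in is to scan them for CUDA headers. A heuristic sketch; the header-to-package map below is our assumption, not an official listing, so verify it against the real package contents:

```python
import re

# Heuristic map from CUDA headers to the conda-forge -dev packages
# believed to ship them (an assumption; confirm before relying on it).
HEADER_TO_PACKAGE = {
    "cuda_runtime.h": "cuda-cudart-dev",
    "cublas_v2.h": "libcublas-dev",
    "curand.h": "libcurand-dev",
    "curand_kernel.h": "libcurand-dev",
    "cusolverDn.h": "libcusolver-dev",
    "cusparse.h": "libcusparse-dev",
}

def needed_dev_packages(source_text):
    """Return the -dev packages implied by #include lines in source_text."""
    headers = re.findall(r'#include\s*[<"]([^>"]+)[>"]', source_text)
    return sorted({HEADER_TO_PACKAGE[h] for h in headers if h in HEADER_TO_PACKAGE})

src = '#include <cuda_runtime.h>\n#include <curand_kernel.h>\n'
print(needed_dev_packages(src))  # ['cuda-cudart-dev', 'libcurand-dev']
```

Run over the flash-attn csrc tree, this kind of scan would surface the new curand_kernel.h usage that makes libcurand-dev necessary here.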
recipe/setup.py
},
extra_link_args = ["-Wl,--strip-all"],
Suggested change:

},
libraries=[
    'cublas',
    'cublasLt',
],
extra_link_args = ["-Wl,--strip-all"],
Also needs cuRAND
Suggested change:

},
libraries=[
    'cublas',
    'cublasLt',
    'curand',
],
extra_link_args = ["-Wl,--strip-all"],
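For context on what this suggestion does at build time: on Unix-like toolchains, setuptools turns each entry in `libraries` into a `-l` flag on the link line, which is what makes the built `.so` record those libraries as direct dependencies. A simplified sketch of that flag assembly (our own reduction, not the actual distutils code path):

```python
# Simplified model of how a Unix link line is assembled from the
# Extension kwargs: each name in `libraries` becomes -l<name>, and
# `extra_link_args` is appended verbatim.
def link_flags(libraries, extra_link_args):
    return ["-l" + lib for lib in libraries] + list(extra_link_args)

flags = link_flags(["cublas", "cublasLt", "curand"], ["-Wl,--strip-all"])
print(" ".join(flags))  # -lcublas -lcublasLt -lcurand -Wl,--strip-all
```

So the suggested `libraries` list is exactly what causes the linker to emit explicit dependencies on libcublas and libcurand instead of leaving them to be found at import time.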
Looks like the original setup.py doesn't include these libraries? https://github.com/Dao-AILab/flash-attention/blob/main/csrc/layer_norm/setup.py
Right, Daniel is stating they should be added based on usage found internally, which also makes sense to me. We can also propose this change upstream.
Don't these libraries get resolved by libcudart?
Sorry, not following.

The #includes Daniel and I reference come from cuBLAS and cuRAND, meaning the symbols used also come from those libraries.

Likely we have gotten lucky: import torch causes the loader to find these libraries first and thus satisfy the symbols by the time these extensions use them. However, we shouldn't rely on this for at least three reasons:
- Loading order could change
- If PyTorch changes its dependencies, we won't get them
- These packages need to express their version constraints on these libraries so they are correctly satisfied at install time
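One way to check which libraries an extension actually declares as direct dependencies (rather than inheriting via the loader) is to inspect its `DT_NEEDED` entries, for example by parsing `ldd` output. A sketch; the `.so` path in `direct_deps` is hypothetical, standing in for the built flash-attn extension:

```python
import subprocess

def parse_ldd(ldd_output):
    """Parse ldd-style text into the set of declared shared libraries."""
    deps = set()
    for line in ldd_output.splitlines():
        token = line.strip().split()[0] if line.strip() else ""
        if ".so" in token:
            deps.add(token)
    return deps

def direct_deps(path):
    """Run ldd on a binary (path is a placeholder for the real .so)."""
    out = subprocess.run(["ldd", path], capture_output=True, text=True).stdout
    return parse_ldd(out)

# Example ldd-style output, illustrative only.
sample = """\
\tlinux-vdso.so.1 (0x00007fff00000000)
\tlibcublas.so.12 => /opt/cuda/lib/libcublas.so.12 (0x00007f0000000000)
\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0100000000)
"""
print(any(d.startswith("libcublas") for d in parse_ldd(sample)))  # True
```

If the extension's `DT_NEEDED` list lacks libcublas/libcurand while the code uses their symbols, the module only works because something else (here, torch) loaded them first; which is exactly the fragility described above.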
Hmm, looks like these libraries are explicitly loaded by PyTorch, e.g. https://github.com/pytorch/pytorch/blob/6734cb7bf2c1763118dcc430cee6110a88f8f849/torch/__init__.py#L313. Since these packages are all PyTorch CUDAExtensions, perhaps they should rely on PyTorch to load them.
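What that linked PyTorch code does is essentially a ctypes preload with RTLD_GLOBAL, so that symbols from the CUDA libraries become visible to any extension loaded afterwards. A hedged sketch of that mechanism; the sonames below are illustrative, not a claim about what torch loads on every platform:

```python
import ctypes

def preload(libname):
    """Load a shared library with RTLD_GLOBAL, in the style of torch's
    global-deps preloading, so its symbols are visible to extensions
    loaded later. Returns None if the library is not present."""
    try:
        return ctypes.CDLL(libname, mode=ctypes.RTLD_GLOBAL)
    except OSError:
        return None

# Illustrative sonames; real code would use the packaged versions.
for name in ("libcublas.so.12", "libcurand.so.10"):
    handle = preload(name)
    print(name, "loaded" if handle else "not found")
```

This is why things work today even without explicit linking: once `import torch` runs, the symbols are already in the process's global namespace. The discussion above argues this is not something the extensions should depend on.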
Surprisingly, the linker seems to think that none of the cuRAND symbols need to be resolved dynamically. Perhaps this package uses header-only pieces that can be inlined? I have added the cublas, cudart, and python libraries to the linking as needed.

I'm currently building with …
Waiting on conda-forge/admin-requests#1158. The CUDA 12 builds complete, but the CUDA 11 builds need more time.
…nda-forge-pinning 2024.11.11.08.59.26
Testing to see if 18 hours is enough for the CUDA 11 builds.
Force-pushed from 4f13ea7 to 576a572
Force-pushed from 576a572 to 210cead
…nda-forge-pinning 2024.11.17.06.32.00
Hi! This is the friendly conda-forge automerge bot! I considered the required status checks when analyzing this PR. Thus the PR was passing and merged! Have a great day!
Woohoo! 🥳 Thanks everyone! 🙏 Glad to see this one in 😄
Checklist
- Reset the build number to 0 (if the version changed)
- Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)

Fixes #18