Hi, the image works well on the GH200 system, thank you.
Would you mind sharing step-by-step details on how to build the wheels (flash-attn, etc.) on aarch64? The vLLM version in this image is 0.7, and I would like a more recent one. How can I build such a container from scratch?
Thank you.