enable llama model in FT backend #146

Open

hongboshi1234 wants to merge 1 commit into triton-inference-server:main from hongboshi1234:llama

hongboshi1234 commented Jun 24, 2023

The existing FT backend throws an error for the llama model.


enable llama model

5165d79

sfc-gh-zhwang commented Jul 8, 2023

Will this ever work? I didn't see llama defined under: https://github.com/NVIDIA/FasterTransformer/tree/main/src/fastertransformer/triton_backend

