1 file changed: +2 -2 lines changed
@@ -49,7 +49,7 @@ We provide a docker file, which bases on Triton image `nvcr.io/nvidia/tritonserv
 
 ```bash
 mkdir workspace && cd workspace
-git clone https://gitlab-master.nvidia.com/liweim/transformer_backend.git
+git clone https://github.com/triton-inference-server/fastertransformer_backend.git
 nvidia-docker build --tag ft_backend --file transformer_backend/Dockerfile .
 nvidia-docker run --gpus=all -it --rm --volume $HOME:$HOME --volume $PWD:$PWD -w $PWD --name ft-work ft_backend
 cd workspace
@@ -120,4 +120,4 @@ The model configuration for Triton server is put in `all_models/transformer/conf
 - vocab_size: size of vocabulary
 - decoder_layers: number of transformer layers
 - batch_size: max supported batch size
-- is_fuse_QKV: fusing QKV in one matrix multiplication or not. It also depends on the weights of QKV.
+- is_fuse_QKV: fusing QKV in one matrix multiplication or not. It also depends on the weights of QKV.
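
For readers unfamiliar with the `is_fuse_QKV` option touched by this hunk, the sketch below illustrates the general idea of computing Q, K and V with one matrix multiplication instead of three. It is a generic pure-Python illustration, not the backend's actual kernel: the helper names (`matmul`, `fuse_qkv_weights`, `split_qkv`), the column-wise weight layout and the toy values are all assumptions made for the example. It also shows why the option "depends on the weights of QKV": the fused path needs the three weight matrices stored as one concatenated matrix.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def fuse_qkv_weights(w_q, w_k, w_v):
    """Concatenate three (d_in x d_out) weights column-wise into (d_in x 3*d_out).

    Assumed layout for illustration only; a real checkpoint may order or
    interleave the fused weights differently.
    """
    return [rq + rk + rv for rq, rk, rv in zip(w_q, w_k, w_v)]


def split_qkv(qkv, d_out):
    """Split the fused projection output back into Q, K and V."""
    q = [row[:d_out] for row in qkv]
    k = [row[d_out:2 * d_out] for row in qkv]
    v = [row[2 * d_out:] for row in qkv]
    return q, k, v


# Toy input (2 tokens, hidden size 2) and toy projection weights.
x = [[1, 2], [3, 4]]
w_q = [[1, 0], [0, 1]]
w_k = [[2, 0], [0, 2]]
w_v = [[0, 1], [1, 0]]

# Unfused path: three separate matrix multiplications.
q_sep, k_sep, v_sep = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)

# Fused path: one multiplication against the concatenated weight.
w_qkv = fuse_qkv_weights(w_q, w_k, w_v)
q_fused, k_fused, v_fused = split_qkv(matmul(x, w_qkv), d_out=2)

# Both paths produce the same projections.
assert (q_fused, k_fused, v_fused) == (q_sep, k_sep, v_sep)
```

Under these assumptions the two paths are numerically identical; the fused form only changes how many multiplications are launched and how the weights must be stored.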