How to deploy multiple models on a node with multiple GPUs

### Description

```shell
Suppose I have 5 GPT models, each with TP=2, and I want to deploy them on one machine with 8 GPUs. Is this possible? If so, how can I control which GPUs each model uses? I tried setting CUDA_VISIBLE_DEVICES when launching the Triton server, but it did not work.
```
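
To make the intended allocation concrete: 5 models at TP=2 need 10 GPU slots, so on 8 GPUs at least one pair of GPUs has to be shared. Below is a minimal Python sketch of one possible round-robin layout. The helper names (`assign_gpus`, `visible_devices`) are hypothetical and not part of Triton or any backend, and the sketch assumes each model would run in its own process that honors a per-process `CUDA_VISIBLE_DEVICES` value, which is exactly what I could not get to work.

```python
def assign_gpus(num_models, tp_size, num_gpus):
    """Give each model tp_size consecutive GPU IDs, wrapping around the node."""
    if tp_size > num_gpus:
        raise ValueError("tp_size cannot exceed the number of GPUs")
    assignments = []
    next_gpu = 0
    for _ in range(num_models):
        # Consecutive IDs modulo num_gpus; distinct because tp_size <= num_gpus.
        ids = [(next_gpu + i) % num_gpus for i in range(tp_size)]
        assignments.append(ids)
        next_gpu = (next_gpu + tp_size) % num_gpus
    return assignments


def visible_devices(ids):
    """Format GPU IDs as a CUDA_VISIBLE_DEVICES value (hypothetical helper)."""
    return ",".join(str(i) for i in ids)


if __name__ == "__main__":
    # Values from the question: 5 models, TP=2, 8 GPUs.
    for index, ids in enumerate(assign_gpus(5, 2, 8)):
        print(f"model_{index}: CUDA_VISIBLE_DEVICES={visible_devices(ids)}")
```

With these inputs the fifth model wraps around and shares GPUs 0 and 1 with the first model, so those two GPUs would need enough memory for both.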


### Reproduced Steps

```shell
I tried setting CUDA_VISIBLE_DEVICES when launching the Triton server.
```