I am using [BentoVLLM](https://github.com/bentoml/BentoVLLM). How do I configure the [service](https://github.com/bentoml/BentoVLLM/blob/main/llama3-8b-instruct/service.py) file to use the `replica` feature? Do I need to use the Ray engine for this? I'm not very familiar with these configurations.