This quick start recipe provides step-by-step instructions for running the Llama 3.1 Instruct model with vLLM. It is intended for developers and practitioners who need high-throughput or low-latency inference on the target accelerated hardware stack.
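As a quick preview of what the recipe builds toward, the sketch below shows minimal offline inference with vLLM's Python API. The model ID, prompt, and sampling settings are illustrative assumptions; the steps that follow cover the exact model variant and serving configuration.

```python
# Minimal offline-inference sketch with vLLM (illustrative; the model ID and
# sampling parameters below are assumptions, not part of the recipe itself).
from vllm import LLM, SamplingParams

# Load a Llama 3.1 Instruct checkpoint from Hugging Face.
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")

# Sampling settings chosen for illustration only.
sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=128)

# Generate completions for a batch of prompts.
outputs = llm.generate(["Explain what vLLM is in one sentence."], sampling_params)

for output in outputs:
    print(output.outputs[0].text)
```

For production serving, the recipe instead launches an OpenAI-compatible server (e.g., `vllm serve <model>`) so that clients can send requests over HTTP; the offline API above is mainly useful for quick local validation.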