We currently send each response token in its own chunk. We should consider batching several tokens per chunk instead, probably using VLLM_V1_OUTPUT_PROC_CHUNK_SIZE to set the chunk size.
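
One possible shape for a better approach is to buffer tokens and flush them in groups rather than one at a time. The sketch below is only an illustration: `batch_tokens` is a hypothetical helper, the fallback size of 8 is arbitrary, and reading VLLM_V1_OUTPUT_PROC_CHUNK_SIZE as a plain tokens-per-chunk count is our assumption about how we might use it, not a statement of how vLLM interprets that variable internally.

```python
import os
from typing import Iterable, Iterator, List

# Assumption: we reuse the variable named in the note as a simple integer
# batch size. The fallback value of 8 is arbitrary and purely illustrative.
CHUNK_SIZE = int(os.environ.get("VLLM_V1_OUTPUT_PROC_CHUNK_SIZE", "8"))


def batch_tokens(tokens: Iterable[str], chunk_size: int = CHUNK_SIZE) -> Iterator[str]:
    """Group a token stream into chunks of at most `chunk_size` tokens.

    Hypothetical helper: each yielded string is the concatenation of the
    buffered tokens, so the client sees the same text in fewer messages.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")

    buffer: List[str] = []
    for token in tokens:
        buffer.append(token)
        if len(buffer) >= chunk_size:
            # Buffer is full: emit one chunk and start a new one.
            yield "".join(buffer)
            buffer = []

    # Flush whatever is left so the tail of the response is not lost.
    if buffer:
        yield "".join(buffer)


if __name__ == "__main__":
    for chunk in batch_tokens(["Hel", "lo", ",", " wor", "ld"], chunk_size=2):
        print(repr(chunk))
```

With a chunk size of 1 this degenerates to the current one-token-per-chunk behavior, so the size can be tuned without changing the code path. A larger size means fewer messages but a longer wait before the first text reaches the client, so a time-based flush may be worth adding alongside the count-based one.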