How can I serve the distilled model behind an OpenAI-compatible API? #2148

Open
@lingq1

Description

I used the command `tune run --nnodes 1 --nproc_per_node 2 knowledge_distillation_distributed --config llama3_2/8B_to_1B_KD_lora_distributed` to distill the model, and the results were saved to `/tmp/torchtune/llama3_2_8B_to_1B/KD_lora_distributed`. How can I run my 1B model behind an OpenAI-style API so I can verify its effectiveness?
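
This isn't a torchtune feature itself, but a common workflow is to serve the distilled checkpoint with an OpenAI-compatible inference server such as vLLM, then query it with the standard `openai` client. The sketch below assumes the final (merged) checkpoint in the output directory is in Hugging Face format (as produced by the HF checkpointer used in the Llama 3.2 configs) and that vLLM is installed; the path, port, and prompt are illustrative.

```python
# Sketch, not an official torchtune workflow.
#
# 1) In a shell, serve the checkpoint behind an OpenAI-compatible endpoint:
#      vllm serve /tmp/torchtune/llama3_2_8B_to_1B/KD_lora_distributed --port 8000
#
# 2) Query it with the openai client, pointing base_url at the local server:
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # local vLLM server, not api.openai.com
    api_key="EMPTY",                      # vLLM does not require a real key by default
)

response = client.completions.create(
    # vLLM registers the model under the path it was launched with
    # (or whatever --served-model-name was set to).
    model="/tmp/torchtune/llama3_2_8B_to_1B/KD_lora_distributed",
    prompt="Explain knowledge distillation in one sentence.",
    max_tokens=64,
)
print(response.choices[0].text)
```

Once the server is up, you can also point any OpenAI-compatible evaluation harness at `http://localhost:8000/v1` to compare the 1B student against the 8B teacher on the same prompts.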
