python runner.py
-r/--region=[Region]
-m/--model_version=[ModelVersion]
-rpm/--request_per_minute=[RequestsPerMin]
-ns/--num_prompts=[PromptSize]
-ns/--num_samples=[SamplingSize]
Example:
python runner.py -r=EastUS -m=gpt4t0125 -rpm=2 --num_prompts=1000 --num_samples=200
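The options above could be parsed along the following lines. This is a minimal sketch using `argparse`, not the actual contents of `runner.py`: the `build_parser` helper, the `int` types, and the choice to give the `-ns` short form only to `--num_samples` are assumptions made for illustration (the usage text lists `-ns` for both size options, and `argparse` does not allow the same option string twice).

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical reconstruction of the runner.py command-line interface."""
    parser = argparse.ArgumentParser(prog="runner.py")
    parser.add_argument("-r", "--region", default="EastUS")
    parser.add_argument("-m", "--model_version", default="gpt4t0125")
    # No default is documented for the request rate, so it stays None here.
    parser.add_argument("-rpm", "--request_per_minute", type=int)
    # Assumption: only --num_samples gets the -ns short form, because argparse
    # raises an error when two options share the same option string.
    parser.add_argument("--num_prompts", type=int)
    parser.add_argument("-ns", "--num_samples", type=int)
    return parser


# Parse an explicit argument list instead of sys.argv to keep the sketch testable.
args = build_parser().parse_args(
    ["-r=EastUS", "-m=gpt4t0125", "-rpm=2", "--num_prompts=1000", "-ns=200"]
)
print(args.region, args.model_version, args.request_per_minute,
      args.num_prompts, args.num_samples)
```

Passing the argument list explicitly makes it easy to try other flag combinations without invoking the real runner.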
Defaults:
Region=EastUS
ModelVersion=gpt4t0125
If num_samples or num_prompts is not specified, its value is drawn from a uniform distribution between 100 and 5000 tokens.
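The fallback for unspecified sizes might look like the sketch below. The `resolve_size` helper and the constant names are hypothetical, and the sketch assumes integer token counts with both bounds inclusive. The source does not say whether the value is drawn once per run or once per request.

```python
import random
from typing import Optional

MIN_TOKENS = 100   # lower bound stated in the defaults
MAX_TOKENS = 5000  # upper bound stated in the defaults


def resolve_size(value: Optional[int], rng: random.Random) -> int:
    """Return value if given, else draw uniformly from [MIN_TOKENS, MAX_TOKENS]."""
    if value is not None:
        return value
    return rng.randint(MIN_TOKENS, MAX_TOKENS)  # randint includes both ends


rng = random.Random(0)  # seeded only to make the sketch reproducible
num_prompts = resolve_size(1000, rng)  # explicit value is kept as-is
num_samples = resolve_size(None, rng)  # unspecified, so drawn at random
print(num_prompts, num_samples)
```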
- Azure OpenAI
- OpenAI
- AWS
- Google
Currently supported regions:
| Region | Onboarded Date |
|---|---|
| EastUS | 4/16/2024 |
| NorthCentralUS | 4/17/2024 |
| WestUS | 4/17/2024 |
| UKSouth | 4/18/2024 |
| SwedenCentral | 4/19/2024 |
| AWS | 7/1/2024 |
|  | 7/21/2024 |
- Add the new region here
- Push the changes to docker.io/suriyakalivardhan/llm-runners:v15 (ToDo: Automate this)
- Deploy an Azure OpenAI account in the new region (ToDo: Add script)
- Deploy the runners
az container create -f acidep.yml
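Until the push and deploy steps are automated, they could be wrapped in a small dry-run helper such as the sketch below. The `run_steps` function is hypothetical. Using `docker push` for the push step is an assumption, as is an image that has already been built and tagged locally. The `az` command is reproduced exactly as written above, and a real deployment may need further flags.

```python
import shlex
import subprocess

IMAGE = "docker.io/suriyakalivardhan/llm-runners:v15"  # tag from the steps above

STEPS = [
    ["docker", "push", IMAGE],                          # assumed push command
    ["az", "container", "create", "-f", "acidep.yml"],  # as written in the steps
]


def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    rendered = []
    for cmd in steps:
        line = shlex.join(cmd)
        rendered.append(line)
        print(line)
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop on the first failing step
    return rendered


commands = run_steps(STEPS)  # dry run: prints the commands, executes nothing
```

Calling `run_steps(STEPS, dry_run=False)` would execute the commands in order, so it is worth reviewing the printed output first.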
- Owners: Sid, Pankaj, Halit and Suriya
- For access to the dashboard, provide your ObjectId to one of the owners.
| Region | ModelVersion |
|---|---|
| EastUS | gpt4t0125,gpt35t0613,textembeddings3large,textembeddings3small,claude3sonnet20240229v1,claude3haiku20240307v1,claude35sonnet20240620v1,gemini15flash,gemini15pro,gpt4omini |
| FranceCentral | gpt4t1106,gpt35t0613,textembeddings3large |
| IndiaSouth | gpt4t1106 |
| JapanEast | gpt35t0613 |
| NorthCentralUS | gpt4t0125,gpt35t0613 |
| SwedenCentral | gpt40613,gpt4t1106,textembeddings3large,gpt4t0409 |
| UKSouth | gpt40613,gpt4t1106,gpt35t0613,gemini15flash,gemini15pro |
| WestUS | gpt4t1106,claude3opus20240229v1 |
| SouthCentralUS | gpt4o0513 |
| EastUS2 | gpt4t0409 |
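A region and model combination can be checked against this table before starting a run. The sketch below copies the table into a dictionary. The `is_supported` helper and the variable names are hypothetical and not part of `runner.py`.

```python
# Copied from the Region / ModelVersion table above.
_TABLE = {
    "EastUS": "gpt4t0125,gpt35t0613,textembeddings3large,textembeddings3small,"
              "claude3sonnet20240229v1,claude3haiku20240307v1,"
              "claude35sonnet20240620v1,gemini15flash,gemini15pro,gpt4omini",
    "FranceCentral": "gpt4t1106,gpt35t0613,textembeddings3large",
    "IndiaSouth": "gpt4t1106",
    "JapanEast": "gpt35t0613",
    "NorthCentralUS": "gpt4t0125,gpt35t0613",
    "SwedenCentral": "gpt40613,gpt4t1106,textembeddings3large,gpt4t0409",
    "UKSouth": "gpt40613,gpt4t1106,gpt35t0613,gemini15flash,gemini15pro",
    "WestUS": "gpt4t1106,claude3opus20240229v1",
    "SouthCentralUS": "gpt4o0513",
    "EastUS2": "gpt4t0409",
}

# Split each comma-separated cell into a set for fast membership checks.
SUPPORTED = {region: set(models.split(",")) for region, models in _TABLE.items()}


def is_supported(region: str, model_version: str) -> bool:
    """Return True if the table lists model_version for region."""
    return model_version in SUPPORTED.get(region, set())


print(is_supported("EastUS", "gpt4t0125"))  # True
print(is_supported("WestUS", "gpt4t0125"))  # False
```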