-
You'll need to use two limiters: one for your tokens-per-minute limit and the other for your requests-per-minute limit. I'm assuming you can determine the number of tokens in an API request before you send it. Then first acquire enough capacity for the token count, followed by the request limiter:

```python
# shared across tasks
tpm_limit = AsyncLimiter(TOKENS_PER_MINUTE)
rpm_limit = AsyncLimiter(REQUESTS_PER_MINUTE)

# for any task making an OpenAI request;
# token_count is the number of tokens in this specific request
await tpm_limit.acquire(token_count)
async with rpm_limit:
    ...  # this block is rate limited
```
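For a fuller picture, here is a self-contained sketch of that pattern, assuming the `aiolimiter` package; `call_openai`, the limit values, and the fixed `token_count` are placeholders I've chosen for illustration, not part of the original answer:

```python
import asyncio
from aiolimiter import AsyncLimiter

TOKENS_PER_MINUTE = 90_000    # assumed values; use your account's actual limits
REQUESTS_PER_MINUTE = 3_500

# shared across all tasks; time_period defaults to 60 seconds
tpm_limit = AsyncLimiter(TOKENS_PER_MINUTE)
rpm_limit = AsyncLimiter(REQUESTS_PER_MINUTE)

async def call_openai(prompt: str) -> str:
    # placeholder standing in for the real OpenAI client call
    await asyncio.sleep(0.1)
    return f"response to {prompt!r}"

async def rate_limited_request(prompt: str, token_count: int) -> str:
    # wait until the token budget allows this many tokens...
    await tpm_limit.acquire(token_count)
    # ...then wait for a request slot
    async with rpm_limit:
        return await call_openai(prompt)

async def main() -> None:
    prompts = [f"prompt {i}" for i in range(10)]
    results = await asyncio.gather(
        *(rate_limited_request(p, token_count=500) for p in prompts)
    )
    print(results)

asyncio.run(main())
```

Acquiring the token budget before entering the request limiter means a task never holds a request slot while it is still waiting for token capacity.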
-
I wrote my own below. Initialize with:

and use with:
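As a hypothetical illustration only (the commenter's actual snippets are not shown above), a hand-rolled sliding-window limiter with that initialize-then-use shape might look like this:

```python
import asyncio
import time

class RateLimiter:
    """Hypothetical sketch, not the commenter's original code.

    Allows at most `max_calls` acquisitions per sliding window of
    `period` seconds.
    """

    def __init__(self, max_calls: int, period: float = 60.0) -> None:
        self.max_calls = max_calls
        self.period = period
        self._calls: list[float] = []   # timestamps of recent acquisitions
        self._lock = asyncio.Lock()     # serializes waiting acquirers

    async def acquire(self) -> None:
        async with self._lock:
            now = time.monotonic()
            # drop timestamps that have aged out of the window
            self._calls = [t for t in self._calls if now - t < self.period]
            if len(self._calls) >= self.max_calls:
                # sleep until the oldest timestamp leaves the window
                await asyncio.sleep(self._calls[0] + self.period - now)
            self._calls.append(time.monotonic())

# initialize with:
limiter = RateLimiter(max_calls=60, period=60.0)

# and use with:
async def make_request() -> None:
    await limiter.acquire()
    ...  # make the rate-limited API call here
```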
-
I am using an API key that has multiple rate limits associated with it (#requests_per_minute and #total_length_of_requests_per_minute). I initially thought that using multiple limiters might work, but I'm not quite sure about it. Is there any recommendation?
For reference, I am using the OpenAI API, which has these rate limits: https://platform.openai.com/account/rate-limits