Describe the solution you'd like
When we rate-limit response generation with the max_calls_per_min parameter, it'd be great if scoring could happen in parallel while waiting for more responses to be generated.
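To illustrate the behaviour I have in mind, here is a rough sketch in plain asyncio. It is not UQLM code: `generate`, `score`, and `pipeline` are hypothetical stand-ins I made up for this example, and only the `max_calls_per_min` name comes from UQLM. The idea is that generation calls are spaced out to respect the limit, while each response is scored as soon as it arrives instead of after all generations finish.

```python
import asyncio


async def generate(prompt: str) -> str:
    """Hypothetical stand-in for a rate-limited LLM call."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"response to {prompt}"


async def score(response: str) -> float:
    """Hypothetical stand-in for a scorer; here it just returns the length."""
    await asyncio.sleep(0.01)  # simulate scoring work
    return float(len(response))


async def pipeline(prompts, max_calls_per_min):
    """Start generations at a fixed interval and score each one on arrival."""
    interval = 60.0 / max_calls_per_min  # seconds between generation starts

    async def generate_then_score(prompt):
        response = await generate(prompt)
        # Scoring runs here while the loop below is still sleeping
        # before it launches the next generation call.
        return await score(response)

    tasks = []
    for i, prompt in enumerate(prompts):
        if i > 0:
            await asyncio.sleep(interval)  # respect the rate limit
        tasks.append(asyncio.create_task(generate_then_score(prompt)))

    # gather preserves the order of the input prompts
    return await asyncio.gather(*tasks)
```

With something like `asyncio.run(pipeline(prompts, max_calls_per_min=30))`, the time spent waiting on the rate limit would be used for scoring earlier responses rather than sitting idle.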
Describe alternatives you've considered
Processing the prompts in batches myself currently seems to work better than using the built-in UQLM rate limiter.