Hi there :-)
Is it possible to configure multiple users / concurrent request sessions?
I'd like to simulate how the different backends behave when, say, 8 users access the LLM concurrently instead of just 1.
I know batches can be configured, but I would expect a performance difference between 1 user sending a batch of 8 requests and 8 users each independently sending a batch of 1 request. Please correct me if that is not true :-)
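To make the comparison concrete, here is a minimal sketch of the two access patterns I have in mind. It does not use optimum-benchmark at all: `FakeBackend`, its `generate` method, and the timing parameters are made-up stand-ins for illustration, and a real backend may schedule concurrent requests quite differently (for example by batching them internally).

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class FakeBackend:
    """Hypothetical stand-in for an LLM backend (not an optimum-benchmark API)."""

    def __init__(self, per_call_overhead=0.02, per_item_cost=0.005, serialize=True):
        self.per_call_overhead = per_call_overhead  # assumed fixed cost per request
        self.per_item_cost = per_item_cost          # assumed cost per prompt in a batch
        # If serialize is True, the backend handles one request at a time.
        self._lock = threading.Lock() if serialize else None

    def generate(self, batch):
        if self._lock is not None:
            with self._lock:
                return self._run(batch)
        return self._run(batch)

    def _run(self, batch):
        # Sleep to imitate compute time; real latency depends on the backend.
        time.sleep(self.per_call_overhead + self.per_item_cost * len(batch))
        return [f"reply to {prompt}" for prompt in batch]


def one_user_batched(backend, prompts):
    """One user sends all prompts as a single batch."""
    start = time.perf_counter()
    outputs = backend.generate(prompts)
    return outputs, time.perf_counter() - start


def many_users_concurrent(backend, prompts):
    """Each prompt is sent by its own 'user' (thread) as a batch of size 1."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(prompts)) as pool:
        results = list(pool.map(lambda p: backend.generate([p]), prompts))
    outputs = [r[0] for r in results]
    return outputs, time.perf_counter() - start


if __name__ == "__main__":
    prompts = [f"prompt {i}" for i in range(8)]
    backend = FakeBackend(serialize=True)
    _, batched = one_user_batched(backend, prompts)
    _, concurrent = many_users_concurrent(backend, prompts)
    print(f"1 user, batch of 8:    {batched:.3f}s")
    print(f"8 users, batch of 1:   {concurrent:.3f}s")
```

With `serialize=True` the fake backend pays the per-call overhead 8 times in the concurrent case, so it comes out slower than the single batch; with `serialize=False` the threads simply overlap. Which of these a real backend resembles is exactly what I'd like to be able to measure.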
Thanks a lot, and I really appreciate the work on optimum-benchmark!