Added demo example for vLLM Server and shareGPT datagen component#37
k8s-ci-robot merged 4 commits into kubernetes-sigs:main
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: SachinVarghese
    if (
        data is None
        or data[self.data_key] is None
        or len(data[self.data_key]) > self.max_num_turns
I believe we were filtering out conversations with fewer than 2 turns. It looks like this got changed to filtering out conversations with more than 2 turns.
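To make the distinction in this comment concrete, here is a minimal sketch of the "drop conversations that are too short" variant. The function name `keep_conversation`, the `min_num_turns` parameter, and the default key `"conversations"` are hypothetical and not taken from the PR; only the `None` checks mirror the snippet above.

```python
from typing import Any, Dict, Optional


def keep_conversation(
    data: Optional[Dict[str, Any]],
    data_key: str = "conversations",  # assumed key name, for illustration only
    min_num_turns: int = 2,           # hypothetical lower bound on turns
) -> bool:
    """Return True when the record has at least `min_num_turns` turns."""
    if data is None or data.get(data_key) is None:
        return False
    # Keep records that are long enough; an upper bound would instead
    # compare with `<=` against a separate maximum.
    return len(data[data_key]) >= min_num_turns
```

With this shape, a lower bound and an upper bound are two separate comparisons, which makes it harder to flip one into the other by accident.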
      # Given a rate, yield a time to wait before the next request
      while True:
-         next_time += self._rand.uniform(0, 1 / self._rate)
+         next_time += self._rand.exponential(1 / self._rate)
Isn't uniform better for the constant load timer? Isn't the main difference between this one and the Poisson one that the constant load timer sends requests at uniform intervals?
Yes, that is the core idea, but I was getting incorrect timer results with the uniform random function here. Updating to exponential provides the expected results. My recommendation is to use exponential for now, and I will revisit this in a separate ticket. As part of that ticket, I will also add some tests to validate this.
This works for now, but it would be good to address in a follow-up. In my past experience, this models request rates correctly, but the arrivals are not uniform within the time interval (second).
Added a ticket to address this here
Please assign this to me.
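For the follow-up ticket, a small sketch comparing the three inter-arrival choices discussed in this thread may be useful. It uses the standard-library `random` module rather than the generator in the PR, and `observed_rate` and the value `rate = 10.0` are illustrative assumptions. One possible explanation for the incorrect results with `uniform(0, 1 / rate)` is that its mean gap is `1 / (2 * rate)`, so the observed rate would be roughly twice the target.

```python
import random
from typing import Callable


def observed_rate(draw_interval: Callable[[], float], n: int = 200_000) -> float:
    """Estimate requests per second from n sampled inter-arrival gaps."""
    total_time = sum(draw_interval() for _ in range(n))
    return n / total_time


rate = 10.0  # target requests per second (illustrative value)
rng = random.Random(0)

# Gaps drawn from uniform(0, 1/rate) average 1/(2*rate),
# so the observed rate is about twice the target.
uniform_rate = observed_rate(lambda: rng.uniform(0, 1 / rate))

# Exponential gaps with mean 1/rate give Poisson arrivals at the target rate.
exponential_rate = observed_rate(lambda: rng.expovariate(rate))

# Fixed gaps of exactly 1/rate give evenly spaced requests at the target rate.
constant_rate = observed_rate(lambda: 1 / rate)

print(f"uniform: {uniform_rate:.2f} req/s")
print(f"exponential: {exponential_rate:.2f} req/s")
print(f"constant: {constant_rate:.2f} req/s")
```

If evenly spaced requests are the goal for the constant load timer, the fixed-gap variant is the one that matches that description; the exponential variant matches the target rate on average but produces bursty arrivals.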
Signed-off-by: Sachin Varghese <sachin.mathew31@gmail.com>
Thanks for putting this out! Having a first e2e demo is great! /lgtm
This PR adds an example notebook for running a vLLM server.
Fixes #35