Skip to content

Added demo example for vLLM Server and shareGPT datagen component#37

Merged
k8s-ci-robot merged 4 commits intokubernetes-sigs:mainfrom
SachinVarghese:demo
Mar 25, 2025
Merged

Added demo example for vLLM Server and shareGPT datagen component#37
k8s-ci-robot merged 4 commits intokubernetes-sigs:mainfrom
SachinVarghese:demo

Conversation

@SachinVarghese
Copy link
Contributor

@SachinVarghese SachinVarghese commented Mar 22, 2025

This PR adds an example notebook to run a vLLM server example
Fixes #35

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 22, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: SachinVarghese

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 22, 2025
@SachinVarghese SachinVarghese marked this pull request as ready for review March 22, 2025 14:06
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 22, 2025
@SachinVarghese SachinVarghese changed the title Added demo example Added demo example for vLLM Server and shareGPT datagen component Mar 22, 2025
if (
data is None
or data[self.data_key] is None
or len(data[self.data_key]) > self.max_num_turns
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we were filtering out conversations with less than 2 turns - looks like this got changed to filtering out conversations > 2 turns.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had missed that. Fixed in 5f9b3a9

# Given a rate, yield a time to wait before the next request
while True:
next_time += self._rand.uniform(0, 1 / self._rate)
next_time += self._rand.exponential(1 / self._rate)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't uniform better for constant load timer? Isn't the main difference between this one and the Poisson one that constant load timer sends requests at uniform intervals between them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that is the core idea but I was getting incorrect timer results with uniform random function here. Updating to exponentials provide expected results. My recommendation is to use exponential for now and I will revisit this in a separate ticket. As part of a new ticket, I will also add some tests to validate this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works for now. But would be good to address in a follow up. From my past experience using this is that it models request rates correctly, but the arrival rate is not uniform within the time interval (second).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a ticket to address this here
Please assign this to me.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 25, 2025
Signed-off-by: Sachin Varghese <sachin.mathew31@gmail.com>
Signed-off-by: Sachin Varghese <sachin.mathew31@gmail.com>
Signed-off-by: Sachin Varghese <sachin.mathew31@gmail.com>
Signed-off-by: Sachin Varghese <sachin.mathew31@gmail.com>
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 25, 2025
@achandrasekar
Copy link
Contributor

Thanks for putting this out! Having a first e2e demo is great!

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 25, 2025
@k8s-ci-robot k8s-ci-robot merged commit 8d2b9b1 into kubernetes-sigs:main Mar 25, 2025
4 checks passed
@SachinVarghese SachinVarghese deleted the demo branch March 26, 2025 00:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Demo Example for perf tool

3 participants