[LFX] Enhanced cloud-edge-collaborative-inference-for-llm example #188

Merged
kubeedge-bot merged 1 commit into kubeedge:main from AryanNanda17:lfx_proposal#185_point1
Jun 19, 2025

@AryanNanda17
Contributor

@AryanNanda17 AryanNanda17 commented Apr 12, 2025

The example "Cloud-Edge Collaborative Inference for LLM" is well-structured. This PR improves a few areas of the example to make it a fully functional quick-start guide with minimal errors.

The changes are as follows:

  1. Included a Resource-Sensitive Router
  2. Added api_provider, api_base_url and api_key_env parameters to cloudmodel in the test_queryrouting.yaml file
  3. Made the environment easier to set up
  4. Preserved backward compatibility
  5. Updated the threshold for random routing
  6. Improved error handling in this example
  7. Corrected the device = "cuda" assumption
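
As a rough illustration of items 2 and 7, here is a minimal sketch, not the code from this PR. It assumes the cloudmodel settings arrive as a plain dict and that api_key_env names the environment variable holding the key. The function names `resolve_cloud_settings` and `pick_device`, and the default values used, are hypothetical.

```python
import os


def resolve_cloud_settings(cloudmodel_cfg, environ=os.environ):
    """Return (provider, base_url, api_key) from a cloudmodel config dict.

    The defaults below are assumptions chosen for illustration only.
    """
    provider = cloudmodel_cfg.get("api_provider", "openai")  # assumed default
    base_url = cloudmodel_cfg.get("api_base_url")  # None means "use the client's default"
    key_env = cloudmodel_cfg.get("api_key_env", "OPENAI_API_KEY")  # assumed default
    api_key = environ.get(key_env)
    if not api_key:
        # Fail early with a readable message instead of a late authentication error.
        raise ValueError(f"Environment variable {key_env!r} is not set")
    return provider, base_url, api_key


def pick_device(preferred="cuda"):
    """Return "cuda" only when PyTorch is installed and reports a usable GPU."""
    try:
        import torch
    except ImportError:
        return "cpu"
    if preferred == "cuda" and torch.cuda.is_available():
        return "cuda"
    return "cpu"
```

Reading the key through the variable named in the config keeps secrets out of the YAML file, and falling back to "cpu" lets the example start on machines without a GPU.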

Note: This PR implements point 1 of #185. It is part of the LFX Spring Term 2025 project "Enhancing Dependency Management and Documentation of ianvs".

Fixes #178

@kubeedge-bot kubeedge-bot requested review from Poorunga and hsj576 April 12, 2025 20:25
@kubeedge-bot kubeedge-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Apr 12, 2025
@AryanNanda17 AryanNanda17 changed the title Lfx proposal#185 point1 implementation LFX proposal#185 Point1 Implementation Apr 12, 2025
@kubeedge-bot kubeedge-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 13, 2025
@AryanNanda17 AryanNanda17 force-pushed the lfx_proposal#185_point1 branch from ab03606 to c0767d8 Compare April 14, 2025 18:35
@AryanNanda17
Contributor Author

/kind enhancement

@kubeedge-bot
Collaborator

@AryanNanda17: The label(s) kind/enhancement cannot be applied, because the repository doesn't have them

Details

In response to this:

/kind enhancement

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Comment thread examples/cloud-edge-collaborative-inference-for-llm/requirements.txt Outdated
@AryanNanda17 AryanNanda17 force-pushed the lfx_proposal#185_point1 branch from c0767d8 to fe22b2e Compare April 18, 2025 04:15
@AryanNanda17
Contributor Author

@FuryMartin,

Since Groq doesn't support the usage options, I approximate prompt_tokens and completion_tokens by splitting on whitespace:

    prompt_tokens = len(messages[0]['content'].split())  # Approximate: whitespace word count
    completion_tokens = len(text.split())  # Approximate: whitespace word count

For better accuracy, we can use a tokenizer compatible with the model. A common approach is the tiktoken library, which suits models compatible with OpenAI's tokenization.

That is:

    import tiktoken

    # cl100k_base is an OpenAI encoding; counts for other models are still estimates
    self.tokenizer = tiktoken.get_encoding("cl100k_base")
    prompt_text = "".join([msg["content"] for msg in messages if msg["content"]])
    prompt_tokens = len(self.tokenizer.encode(prompt_text))
    completion_tokens = len(self.tokenizer.encode(text))

Which approach would you recommend?
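
For comparison, here is a minimal, self-contained sketch of both options behind one helper. It is an illustration under stated assumptions, not the code in this PR: the names `whitespace_token_count`, `load_encoder` and `estimate_usage` are hypothetical, and the helper falls back to whitespace counting whenever tiktoken or its encoding data is unavailable.

```python
def whitespace_token_count(text):
    """Rough estimate: one token per whitespace-separated word."""
    return len(text.split())


def load_encoder(name="cl100k_base"):
    """Return a tiktoken encoding, or None if tiktoken cannot be used.

    tiktoken may need to download encoding data on first use, so any
    failure here is treated as "not available".
    """
    try:
        import tiktoken
        return tiktoken.get_encoding(name)
    except Exception:
        return None


def estimate_usage(messages, completion, encoder=None):
    """Estimate prompt and completion token counts for a chat exchange."""
    if encoder is None:
        count = whitespace_token_count
    else:
        def count(text):
            return len(encoder.encode(text))
    # Count each message separately so adjacent messages are not fused into one word.
    prompt_tokens = sum(count(m["content"]) for m in messages if m.get("content"))
    return {"prompt_tokens": prompt_tokens, "completion_tokens": count(completion)}
```

Calling `estimate_usage(messages, text, encoder=load_encoder())` uses tiktoken when it is usable and the whitespace estimate otherwise, so callers need only one code path.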

@AryanNanda17 AryanNanda17 requested a review from FuryMartin April 18, 2025 09:45
@AryanNanda17 AryanNanda17 mentioned this pull request Apr 21, 2025
25 tasks
Collaborator

@MooreZheng MooreZheng left a comment

Good to see that all suggestions from other reviewers have been properly addressed.

Just a few small logistical considerations:

  1. This PR aims to build the cloud-edge-collaborative-inference-for-llm example. Please state that purpose in the title and description so it is clear to community members. We also need to squash the commits into one.
  2. The CI test for this PR has not yet passed. The failure might not be caused by this PR, but we should have CI pass before merging it.
  3. Please also update the dataset link, as specified below.

Comment thread examples/cloud-edge-collaborative-inference-for-llm/Dockerfile Outdated
@AryanNanda17 AryanNanda17 changed the title LFX proposal#185 Point1 Implementation Improves cloud-edge-collaborative-inference-for-llm example to make it a fully functional quick-start guide with minimal errors. Apr 24, 2025
@AryanNanda17 AryanNanda17 changed the title Improves cloud-edge-collaborative-inference-for-llm example to make it a fully functional quick-start guide with minimal errors. Improves cloud-edge-collaborative-inference-for-llm example to make it a fully functional quick-start guide with minimal errors Apr 24, 2025
@AryanNanda17 AryanNanda17 force-pushed the lfx_proposal#185_point1 branch from ef20a47 to f4e8c9f Compare April 24, 2025 09:41
@AryanNanda17
Contributor Author

AryanNanda17 commented Apr 24, 2025

To do:

  • Update the dataset link

@AryanNanda17
Contributor Author

@FuryMartin, could you please share the updated dataset link?

@AryanNanda17 AryanNanda17 requested a review from MooreZheng April 30, 2025 07:10
@AryanNanda17 AryanNanda17 force-pushed the lfx_proposal#185_point1 branch from 90d61cb to 284c2f6 Compare April 30, 2025 07:50
@AryanNanda17 AryanNanda17 changed the title Improves cloud-edge-collaborative-inference-for-llm example to make it a fully functional quick-start guide with minimal errors [LFX] Enhancement of cloud-edge-collaborative-inference-for-llm example to make it a fully functional quick-start guide with minimal errors May 13, 2025
@AryanNanda17 AryanNanda17 changed the title [LFX] Enhancement of cloud-edge-collaborative-inference-for-llm example to make it a fully functional quick-start guide with minimal errors [LFX] Enhancement of cloud-edge-collaborative-inference-for-llm example May 13, 2025
@AryanNanda17 AryanNanda17 changed the title [LFX] Enhancement of cloud-edge-collaborative-inference-for-llm example [LFX] Enhanced cloud-edge-collaborative-inference-for-llm example May 13, 2025
@AryanNanda17 AryanNanda17 force-pushed the lfx_proposal#185_point1 branch 2 times, most recently from 089f6f1 to 4f7d591 Compare May 31, 2025 09:33
@AryanNanda17 AryanNanda17 force-pushed the lfx_proposal#185_point1 branch 2 times, most recently from ba8de50 to 51cca08 Compare May 31, 2025 09:56
@AryanNanda17
Contributor Author

The CI workflow failure is described in #212 and has been corrected in #213.

Contributor

@FuryMartin FuryMartin left a comment

Nice work

@kubeedge-bot
Collaborator

@FuryMartin: changing LGTM is restricted to collaborators

Details

In response to this:

Nice work

@AryanNanda17 AryanNanda17 force-pushed the lfx_proposal#185_point1 branch from 51cca08 to 2a1d15f Compare June 5, 2025 08:52
dataset instructions changed from hugging face to kaggle

matplotlib added to requirements.txt

Signed-off-by: Aryan Nanda <nandaaryan823@gmail.com>

changes in readme of cloud-edge-collaborative-inference done to use kaggle instead of huggingface

Signed-off-by: Aryan <nandaaryan823@gmail.com>

readme file updated

Signed-off-by: Aryan <nandaaryan823@gmail.com>

print changed to logger

Signed-off-by: Aryan <nandaaryan823@gmail.com>
@AryanNanda17 AryanNanda17 force-pushed the lfx_proposal#185_point1 branch from 2a1d15f to 46340c8 Compare June 5, 2025 09:13
@AryanNanda17
Contributor Author

@MooreZheng @FuryMartin @hsj576
I have resolved all the comments. This PR is good to go as well.
Thanks

@kubeedge-bot kubeedge-bot added the lgtm Indicates that a PR is ready to be merged. label Jun 5, 2025
Collaborator

@MooreZheng MooreZheng left a comment

/lgtm

Member

@hsj576 hsj576 left a comment

/lgtm

Collaborator

@MooreZheng MooreZheng left a comment

/approve

@kubeedge-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: FuryMartin, hsj576, MooreZheng

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubeedge-bot kubeedge-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 19, 2025
@kubeedge-bot kubeedge-bot merged commit 3b13981 into kubeedge:main Jun 19, 2025
13 checks passed
Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Enhance Dependency Management and Documentation for Ianvs

5 participants