Cherry pick changes from main branch to release0.6 branch #1969
varungup90 merged 5 commits into release-0.6
* add more metrics
* pass label from the engine side
* chore: add grafana dashboard
* refactor the metrics to only expose one public method

Signed-off-by: chenyu.jiang <chenyu.jiang@bytedance.com>
Co-authored-by: chenyu.jiang <chenyu.jiang@bytedance.com>
Signed-off-by: varungupta <varungup90@gmail.com>
* Feat: Support vllm new kvevent format
  Signed-off-by: qiupengfei <qiupengfei@baidu.com>
  Change-Id: I5aafd7fbb42c72e62ba6ac06065a790e46c3e9f9
* Fix: isSamePod is wrong & incr RouterInit timeout
  Signed-off-by: qiupengfei <qiupengfei@baidu.com>
  Change-Id: Ia495ac8f352d0e92872b5d1847c4ea6330909c6c
* Fix: gemini cr suggestion & unittest
  Signed-off-by: qiupengfei <qiupengfei@baidu.com>
  Change-Id: I7bf72cfaa964beb4729db943111db0113ecac184
* Fix: gofmt error
  Signed-off-by: qiupengfei <qiupengfei@baidu.com>
  Change-Id: I979001891d1bf506afd3699911e50ad046caf878
* Fix: parseBlockHashToInt64 should handle float64 and float32
  Signed-off-by: qiupengfei <qiupengfei@baidu.com>
  Change-Id: I26a889bba182538bd2692c81301047f8b881b2ea

---------

Signed-off-by: qiupengfei <qiupengfei@baidu.com>
Co-authored-by: qiupengfei <qiupengfei@baidu.com>
Signed-off-by: chenyu.jiang <chenyu.jiang@bytedance.com>
Signed-off-by: varungupta <varungup90@gmail.com>
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces several key enhancements to the aibrix platform. It focuses on improving model configuration flexibility, enhancing metric collection, ensuring compatibility with future vLLM changes, and refining PD disaggregation logic. These changes collectively contribute to a more robust, efficient, and adaptable system.
Code Review
This pull request cherry-picks a large set of changes, primarily introducing a new "Model Config and Profiles" feature. This is a significant architectural improvement, moving model-specific configurations from scattered pod labels to a centralized JSON annotation. The implementation is well-documented and integrated across the gateway and routing logic. The PR also includes several other valuable changes, such as a new worker queue for Prometheus queries to improve stability, and a major refactoring of how metrics are emitted. I have identified a couple of issues that need attention: a hardcoded secret in a mock configuration and a potential logic change in the throughput router that should be clarified.
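To illustrate the idea of a centralized JSON annotation described in the review, here is a minimal sketch of reading a model config with named profiles from a pod's annotations. The annotation key `example.com/model-config`, the `ModelConfig` and `Profile` types, and their fields are all assumptions made for illustration; the real key and schema are defined in the PR and are not shown on this page.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// modelConfigAnnotation is a hypothetical annotation key, not the real one.
const modelConfigAnnotation = "example.com/model-config"

// Profile holds per-profile settings. The field is an illustrative assumption.
type Profile struct {
	RoutingStrategy string `json:"routingStrategy"`
}

// ModelConfig is a hypothetical schema: named profiles plus a default.
type ModelConfig struct {
	DefaultProfile string             `json:"defaultProfile"`
	Profiles       map[string]Profile `json:"profiles"`
}

// parseModelConfig reads the single JSON annotation instead of many labels.
func parseModelConfig(annotations map[string]string) (*ModelConfig, error) {
	raw, ok := annotations[modelConfigAnnotation]
	if !ok {
		return nil, fmt.Errorf("annotation %q not set", modelConfigAnnotation)
	}
	var cfg ModelConfig
	if err := json.Unmarshal([]byte(raw), &cfg); err != nil {
		return nil, fmt.Errorf("invalid model config JSON: %w", err)
	}
	return &cfg, nil
}

// Resolve returns the named profile, falling back to the default profile
// when name is empty.
func (c *ModelConfig) Resolve(name string) (Profile, bool) {
	if name == "" {
		name = c.DefaultProfile
	}
	p, ok := c.Profiles[name]
	return p, ok
}

func main() {
	annotations := map[string]string{
		modelConfigAnnotation: `{"defaultProfile":"default","profiles":{"default":{"routingStrategy":"random"}}}`,
	}
	cfg, err := parseModelConfig(annotations)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	p, ok := cfg.Resolve("")
	fmt.Println(p.RoutingStrategy, ok)
}
```

Keeping the whole configuration in one annotation means it can be validated in a single place, and a malformed value surfaces as one parse error rather than as several partially applied labels.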
Pull Request Description
Cherry pick changes from main branch to release0.6 branch
Related Issues
Resolves: #[Insert issue number(s)]
Important: Before submitting, please complete the description above and review the checklist below.
Contribution Guidelines (Expand for Details)
We appreciate your contribution to aibrix! To ensure a smooth review process and maintain high code quality, please adhere to the following guidelines:
Pull Request Title Format
Your PR title should start with one of these prefixes to indicate the nature of the change:
* [Bug]: Corrections to existing functionality
* [CI]: Changes to build process or CI pipeline
* [Docs]: Updates or additions to documentation
* [API]: Modifications to aibrix's API or interface
* [CLI]: Changes or additions to the Command Line Interface
* [Misc]: For changes not covered above (use sparingly)

Note: For changes spanning multiple categories, use multiple prefixes in order of importance.
Submission Checklist
By submitting this PR, you confirm that you've read these guidelines and your changes align with the project's contribution standards.