Replies: 11 comments 2 replies
-
|
This issue is important and can be assigned to multiple folks concurrently. If anyone is interested, please comment. |
-
|
/assign |
-
|
/assign I can handle part of this. |
-
|
/assign |
-
|
/assign |
-
|
/assign |
-
|
/assign |
-
|
Let's break it down into subtasks. @samzong @yike21 @Xunzhuo @R3hankhan123 @madhav-murali @haowu1234 @szedan-rh Please post a one-line description of what you can help with and which item you can complete in a timely fashion. |
-
|
@rootfs maybe we can convert this from an issue into a discussion? My initial idea is that everyone can share how they build their mixture of models, and we can see if there is any magic 😉 |
-
|
I'd also like to tie the recipes to the verification process through the public beta. It would be great if recipe proposals could be encapsulated in the operational process that #1030 can provision for verification. |
-
|
/assign |
-
Description
Create a comprehensive Mixture-of-Models (MoM) Recipes system that provides pre-configured, tested, and production-ready multi-model routing strategies for different optimization goals (accuracy, latency, cost).
A MoM Recipe is a complete configuration package that combines:
Context
Problem
Solution
Create a curated library of MoM Recipes - pre-configured, benchmarked, and documented routing strategies that users can:
vllm-sr serve --config config/mom-recipes/accuracy-optimized/config.yaml
Reference
Recipe Structure
Each recipe should be a complete package under config/mom-recipes/{recipe-name}/:
Recipe Metadata (in README.md)
Each recipe must document:
0. Scenario & Optimization Goal
1. Models Used
2. Evaluation Results
3. Signal Design
4. Decision Logic & Plugins
5. Runtime Configuration
config.yaml reference
Usage Pattern
Users should be able to use recipes via the vllm-sr CLI:
Initial Recipe Set
Priority 1: Core Recipes
Priority 2: Specialized Recipes
4. Balanced: Trade-off between accuracy, latency, and cost
5. Research-Focused: Maximum reasoning capability with extended thinking
6. Production-Scale: High-throughput with load balancing and failover
Evaluation Integration
Recipes must be validated using the existing evaluation framework, or even a public model arena, to determine whether each one is a good MoM model.
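Before a recipe goes through any benchmark, a cheap structural check can confirm that the package is complete. The following is a minimal sketch, not part of vllm-sr: the function names `check_recipe` and `check_all` are hypothetical, the section titles are copied from the metadata list above, and it assumes a case-insensitive substring match is enough to detect a section in README.md.

```python
from pathlib import Path

# Section titles copied from the recipe metadata list in this description.
REQUIRED_SECTIONS = [
    "Scenario & Optimization Goal",
    "Models Used",
    "Evaluation Results",
    "Signal Design",
    "Decision Logic & Plugins",
    "Runtime Configuration",
]

# Files each recipe package is expected to contain.
REQUIRED_FILES = ["config.yaml", "README.md"]


def check_recipe(recipe_dir):
    """Return a list of problems; an empty list means the package looks complete."""
    recipe_dir = Path(recipe_dir)
    problems = []

    # Every required file must exist directly inside the recipe directory.
    for name in REQUIRED_FILES:
        if not (recipe_dir / name).is_file():
            problems.append(f"missing file: {name}")

    # Assumption: a section counts as present if its title appears anywhere
    # in README.md, ignoring case. A stricter check could parse headings.
    readme = recipe_dir / "README.md"
    if readme.is_file():
        text = readme.read_text(encoding="utf-8").lower()
        for section in REQUIRED_SECTIONS:
            if section.lower() not in text:
                problems.append(f"README.md missing section: {section}")

    return problems


def check_all(root="config/mom-recipes"):
    """Check every recipe directory under root; returns {recipe_name: [problems]}."""
    root = Path(root)
    if not root.is_dir():
        return {}
    return {p.name: check_recipe(p) for p in sorted(root.iterdir()) if p.is_dir()}
```

Calling `check_all()` from the repository root would report, per recipe directory, which files or README sections are still missing. This only checks structure; it says nothing about whether the routing strategy itself performs well.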
Documentation Updates
Update website documentation:
website/sidebars.ts
website/docs/recipes/overview.md
website/docs/recipes/{recipe-name}.md
Acceptance Criteria
config/mom-recipes/
config.yaml
README.md with all required sections
vllm-sr serve --config command
Future Enhancements