Summary
Integrate Config Explorer's capabilities into the recommendation pipeline to provide estimated recommendations when benchmark data is unavailable for a model/GPU combination.
Approach
- Use capacity_planner.py to determine which GPU configurations can physically fit a model (memory feasibility filtering)
- Use gpu_recommender (BentoML roofline model) to generate synthetic performance estimates for configurations lacking benchmark data
- Feed these estimates into the existing scoring/ranking pipeline as a fallback
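
The two steps above (memory feasibility, then a roofline estimate) can be sketched roughly as follows. This is an illustration only, not the actual API of capacity_planner.py or gpu_recommender: the `Gpu` and `Model` types, both function names, the 20% memory overhead, and the 2-FLOPs-per-parameter decode cost are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    # Hypothetical catalog entry; field names are assumptions.
    name: str
    memory_gb: float
    mem_bandwidth_gb_s: float
    peak_tflops: float


@dataclass(frozen=True)
class Model:
    # Hypothetical catalog entry; field names are assumptions.
    name: str
    params_b: float               # parameters, in billions
    bytes_per_param: float = 2.0  # FP16/BF16 weights


def weights_gb(model: Model) -> float:
    """Weight footprint in GB (1e9 params * bytes per param / 1e9)."""
    return model.params_b * model.bytes_per_param


def fits_in_memory(model: Model, gpu: Gpu, num_gpus: int = 1,
                   overhead_fraction: float = 0.2) -> bool:
    """Feasibility filter: weights plus an assumed overhead margin
    (KV cache, activations) must fit in the aggregate GPU memory."""
    required_gb = weights_gb(model) * (1 + overhead_fraction)
    return required_gb <= gpu.memory_gb * num_gpus


def roofline_decode_tokens_per_s(model: Model, gpu: Gpu,
                                 num_gpus: int = 1) -> float:
    """Roofline upper bound on single-stream decode throughput:
    the lower of the memory-bandwidth and compute ceilings."""
    # Memory ceiling: every generated token reads all weights once.
    memory_bound = gpu.mem_bandwidth_gb_s * num_gpus / weights_gb(model)
    # Compute ceiling: assume ~2 FLOPs per parameter per token.
    flops_per_token = 2 * model.params_b * 1e9
    compute_bound = gpu.peak_tflops * 1e12 * num_gpus / flops_per_token
    return min(memory_bound, compute_bound)
```

A roofline figure like this is an upper bound, so any estimate fed into scoring would likely need to be labeled as estimated rather than compared one-to-one with measured benchmarks.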
User flow
- User identifies target GPUs and models
- System checks each requested model and GPU against the GPU and Model catalogs
- For combinations without benchmark data, system falls back to capacity planning plus roofline estimation
- System presents estimated recommendations alongside benchmark-backed ones, marked as estimates
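
As a rough sketch of this flow, the fallback could look like the code below. The `Recommendation` type, the `build_recommendations` function, and the `fits` and `estimate` callables are hypothetical stand-ins for the catalog lookup, capacity planner, and roofline estimator; the existing pipeline's real interfaces may differ.

```python
from typing import Callable, Dict, List, NamedTuple, Sequence, Tuple


class Recommendation(NamedTuple):
    model: str
    gpu: str
    throughput: float
    source: str  # "benchmark" or "estimated"


def build_recommendations(
    models: Sequence[str],
    gpus: Sequence[str],
    benchmarks: Dict[Tuple[str, str], float],
    fits: Callable[[str, str], bool],
    estimate: Callable[[str, str], float],
) -> List[Recommendation]:
    """Prefer measured data; fall back to an estimate only when the
    model is judged to fit on the GPU. Infeasible pairs are dropped."""
    results: List[Recommendation] = []
    for model in models:
        for gpu in gpus:
            measured = benchmarks.get((model, gpu))
            if measured is not None:
                results.append(
                    Recommendation(model, gpu, measured, "benchmark"))
            elif fits(model, gpu):
                results.append(
                    Recommendation(model, gpu, estimate(model, gpu), "estimated"))
    # Rank all candidates together; `source` stays visible to the caller.
    return sorted(results, key=lambda r: r.throughput, reverse=True)
```

Carrying the `source` field through ranking lets the presentation layer distinguish estimated rows from benchmark-backed ones without a second lookup.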
Reference
See the migration plan (PR #129, Follow-on Step 3) and the meeting notes (Mar 30, 2026).