docs: update README to describe Capacity Planner and GPU Recommender #182
amito wants to merge 2 commits into llm-d-incubation:main
Conversation
@amito @jgchn @namasl If we are going to rename Capacity Planner and GPU Recommender, now is probably the time to do it. I've updated the doc to call them "Capacity Analyzer" and "Performance Analyzer", but that's just a suggestion. Please let me know what you think.
Hi @amito @anfredette thanks for starting this. I wonder if we can tie it to the LLM aspects more. Thoughts on "LLM Memory Analyzer" and "Inference Performance Analyzer"?
Actually "analyzer" makes me think that this is on benchmarked data rather than on estimated data. Maybe it's okay given that we eventually want just one unified user experience.
- **⚡ One-Click Deployment** - Generate production-ready KServe/vLLM YAML and deploy to Kubernetes
- **📈 Performance Monitoring** - Track actual deployment status and test inference in real-time
- **💻 GPU-Free Development** - vLLM simulator enables local testing without GPU hardware
- **Conversational Requirements Gathering** - Describe your use case in natural language
- **Conversational Requirements Gathering** - Describe your AI-powered use case in natural language
**Required before running `make setup`:**

- **macOS or Linux** (Windows via WSL2)
- **Docker Desktop** (must be running)
Is Docker Desktop required? I thought plain Docker or Podman would work.
Docker or Podman should do. I'll fix that.
6. **Security Hardening** - YAML validation, RBAC, network policies
7. **Multi-Tenancy** - Namespaces, resource quotas, isolation
8. **Advanced Simulation** - SimPy, Monte Carlo for what-if analysis
1. **Prefill/Decode Disaggregation** - Support P/D disaggregation as a first-class deployment topology
I would also add exposing llm-d stack-level configuration (e.g., routing) in addition to P/D, as well as finer-grained vLLM params.
The README previously only described the conversational recommendation engine (originally NeuralNav). Updated to cover all three capabilities: Capacity Planner (GPU memory estimation), GPU Recommender (roofline performance prediction), and the estimated performance fallback. Added CLI section, updated feature list, key technologies, and milestone history.

Signed-off-by: Amit Oren <amoren@redhat.com>
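The "GPU memory estimation" the commit message refers to can be sketched as weights plus KV cache plus a fixed overhead. This is only an illustrative approximation, not the project's actual code; the function name, default parameters, and overhead fraction below are all assumptions:

```python
def estimate_gpu_memory_gb(
    num_params_b: float,       # model size in billions of parameters
    bytes_per_param: int = 2,  # fp16/bf16 weights
    num_layers: int = 32,      # assumed architecture defaults below
    num_kv_heads: int = 8,
    head_dim: int = 128,
    seq_len: int = 4096,
    batch_size: int = 1,
    kv_bytes: int = 2,         # fp16 KV cache
    overhead_frac: float = 0.1,
) -> float:
    """Rough GPU memory estimate: weights + KV cache + overhead, in GB."""
    weights = num_params_b * 1e9 * bytes_per_param
    # KV cache: 2 tensors (K and V) per layer, per token, per KV head
    kv_cache = (
        2 * num_layers * num_kv_heads * head_dim
        * seq_len * batch_size * kv_bytes
    )
    return (weights + kv_cache) * (1 + overhead_frac) / 1e9

# e.g. an 8B-parameter model at fp16 with a 4K context: ~18.2 GB
print(round(estimate_gpu_memory_gb(8), 1))
```

A real planner would also account for activation memory, paged-attention block granularity, and the serving engine's memory-utilization cap.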
Rename capabilities to Planner, Capacity Analyzer, and Performance Analyzer. Rewrite overview to highlight the unified platform story and how the analyzers both stand alone and feed into the Planner workflow. Merge redundant feature sections, align future enhancements with the llm-d-planner proposal, add Linux prereq links, and fix the contributing section.

Signed-off-by: Andre Fredette <afredette@redhat.com>
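The "roofline performance prediction" mentioned for the Performance Analyzer bounds throughput by whichever of compute or memory bandwidth saturates first. A minimal sketch, where the function name and the GPU peak numbers are illustrative assumptions rather than values from the project:

```python
def roofline_tokens_per_s(
    flops_per_token: float,  # compute per generated token (~2 * params)
    bytes_per_token: float,  # HBM traffic per token (weights + KV reads)
    peak_tflops: float,      # GPU peak compute, TFLOP/s
    peak_bw_gbs: float,      # GPU peak memory bandwidth, GB/s
) -> float:
    """Roofline bound: min of the compute-limited and bandwidth-limited rates."""
    compute_bound = peak_tflops * 1e12 / flops_per_token
    memory_bound = peak_bw_gbs * 1e9 / bytes_per_token
    return min(compute_bound, memory_bound)

# 8B fp16 model, single-stream decode, illustrative H100-class peaks:
params = 8e9
tps = roofline_tokens_per_s(
    flops_per_token=2 * params,  # ~2 FLOPs per parameter per token
    bytes_per_token=2 * params,  # read every fp16 weight once per token
    peak_tflops=989,             # assumed dense fp16 peak
    peak_bw_gbs=3350,            # assumed HBM bandwidth
)
print(round(tps))
```

Single-stream decode is memory-bandwidth bound under these assumptions, which is why batching (amortizing weight reads across requests) raises throughput until the compute roof takes over.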
Force-pushed f65c47d to 6dddcb9