
docs: update README to describe Capacity Planner and GPU Recommender#182

Draft
amito wants to merge 2 commits into llm-d-incubation:main from amito:chore/update-readme

Conversation

Collaborator

@amito amito commented Apr 14, 2026

The README previously only described the conversational recommendation engine (originally NeuralNav).
Updated to cover all three capabilities: Capacity Planner (GPU memory estimation), GPU Recommender (roofline performance prediction), and the estimated performance fallback.
Added CLI section, updated feature list, key technologies, and milestone history.
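For context, the kind of GPU memory estimate a "Capacity Planner" performs can be sketched roughly as model weights plus KV cache. This is an illustrative assumption about the approach, not this project's actual code; all function and parameter names here are hypothetical.

```python
# Hypothetical sketch of a Capacity-Planner-style GPU memory estimate:
# model weights plus KV cache. Formula and names are illustrative
# assumptions, not code from this repository.

def estimate_gpu_memory_gb(
    num_params_b: float,   # model size in billions of parameters
    bytes_per_param: int,  # 2 for fp16/bf16, 1 for int8
    num_layers: int,
    hidden_size: int,
    max_seq_len: int,
    batch_size: int,
    kv_bytes: int = 2,     # fp16 KV-cache entries
) -> float:
    weights = num_params_b * 1e9 * bytes_per_param
    # KV cache: 2 tensors (K and V) per layer, per token, per batch element
    kv_cache = 2 * num_layers * hidden_size * max_seq_len * batch_size * kv_bytes
    return (weights + kv_cache) / 1e9

# Example: a Llama-2-7B-like config, fp16 weights, batch 1, 4k context
print(round(estimate_gpu_memory_gb(7, 2, 32, 4096, 4096, 1), 1))  # -> 16.1
```

The estimate is dominated by the weights (14 GB at fp16 for 7B params), with the KV cache adding roughly 2 GB at this batch size and context length.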

@anfredette anfredette marked this pull request as draft April 14, 2026 22:44
@anfredette anfredette requested review from jgchn and namasl April 14, 2026 22:44
@anfredette
Collaborator

@amito @jgchn @namasl
I've made some updates, but want to spend more time on this tomorrow.

If we are going to rename Capacity Planner and GPU Recommender, now is probably the time to do it. I've updated the doc to call them "Capacity Analyzer" and "Performance Analyzer", but that's just a suggestion. Please let me know what you think.

@anfredette anfredette self-requested a review April 14, 2026 22:49
Collaborator

@jgchn jgchn left a comment


Hi @amito @anfredette thanks for starting this. I wonder if we can tie it to the LLM aspects more. Thoughts on "LLM Memory Analyzer" and "Inference Performance Analyzer"?

Actually, "analyzer" makes me think this is based on benchmarked data rather than estimated data. Maybe it's okay, given that we eventually want just one unified user experience.
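For context on the "estimated" side of that distinction, the roofline-style performance prediction the PR description attributes to the GPU Recommender could be sketched as below. The formula is the standard roofline model; the hardware numbers and names are illustrative assumptions, not benchmarked data from this project.

```python
# Minimal roofline-model sketch of how a GPU Recommender might predict
# whether an inference step is compute- or bandwidth-bound. All numbers
# and names are illustrative assumptions, not this project's code.

def roofline_time_s(flops: float, bytes_moved: float,
                    peak_flops: float, peak_bw: float) -> float:
    """Predicted kernel time: the slower of the compute and memory roofs."""
    return max(flops / peak_flops, bytes_moved / peak_bw)

# Single decode step for a 7B fp16 model: ~2 FLOPs per parameter,
# with the full weight matrix streamed from HBM each token.
flops = 2 * 7e9
bytes_moved = 2 * 7e9  # fp16 weights re-read per token
# A100-80GB-like peaks: ~312 TFLOP/s fp16, ~2.0 TB/s HBM bandwidth
t = roofline_time_s(flops, bytes_moved, 312e12, 2.0e12)
print(f"{t * 1e3:.2f} ms/token")  # memory-bound: bytes / bandwidth dominates
```

In this sketch the memory roof (7 ms) far exceeds the compute roof (~0.045 ms), which is why decode throughput on a single stream is usually modeled as bandwidth-bound.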

Comment thread: README.md
- **⚡ One-Click Deployment** - Generate production-ready KServe/vLLM YAML and deploy to Kubernetes
- **📈 Performance Monitoring** - Track actual deployment status and test inference in real-time
- **💻 GPU-Free Development** - vLLM simulator enables local testing without GPU hardware
- **Conversational Requirements Gathering** - Describe your use case in natural language
Collaborator


- **Conversational Requirements Gathering** - Describe your AI-powered use case in natural language

Comment thread: README.md
**Required before running `make setup`:**

- **macOS or Linux** (Windows via WSL2)
- **Docker Desktop** (must be running)
Collaborator


Is Docker Desktop required? I thought just Docker or Podman would work.

Collaborator Author


Docker or Podman should do. I'll fix that.

Comment thread: README.md
6. **Security Hardening** - YAML validation, RBAC, network policies
7. **Multi-Tenancy** - Namespaces, resource quotas, isolation
8. **Advanced Simulation** - SimPy, Monte Carlo for what-if analysis
9. **Prefill/Decode Disaggregation** - Support P/D disaggregation as a first-class deployment topology
Collaborator


I would also add exposing llm-d stack-level configuration, such as routing, in addition to P/D and finer-grained vLLM params.

amito and others added 2 commits April 16, 2026 09:11
The README previously only described the conversational recommendation
engine (originally NeuralNav). Updated to cover all three capabilities:
Capacity Planner (GPU memory estimation), GPU Recommender (roofline
performance prediction), and the estimated performance fallback. Added
CLI section, updated feature list, key technologies, and milestone
history.

Signed-off-by: Amit Oren <amoren@redhat.com>
Rename capabilities to Planner, Capacity Analyzer, and Performance
Analyzer. Rewrite overview to highlight the unified platform story
and how the analyzers both stand alone and feed into the Planner
workflow. Merge redundant feature sections, align future enhancements
with the llm-d-planner proposal, add Linux prereq links, and fix
the contributing section.

Signed-off-by: Andre Fredette <afredette@redhat.com>
@anfredette anfredette force-pushed the chore/update-readme branch from f65c47d to 6dddcb9 on April 16, 2026 at 13:12


3 participants