[Documentation]: Update and Refactor docs #2258

@Aionw

Problem

The Mooncake documentation has grown organically across multiple releases. This has led to several structural issues:

  1. Navigation is flat and unfocused. The sidebar toctree mixes Getting Started guides, performance benchmarks, API references, design documents, and deployment guides into a single flat list. New users struggle to find scenario-specific content; experienced users cannot quickly locate reference material.

  2. vLLM integration docs are fragmented by version, not by task. Users must read three separate pages (v0.2, v0.3, v1.0) to understand a single scenario (e.g., PD disaggregation). There is no unified landing page that presents the available scenarios first and lets users choose the appropriate backend.

  3. API reference is scattered. C++ API docs live inside design documents, Python API docs are in a standalone directory, and HTTP API docs are elsewhere. There is no single entry point for developers looking for API documentation.

  4. Existing pages have accumulated typos and inconsistent formatting. Headings vary in style, admonitions are used inconsistently, and some documents are missing from toctrees entirely, causing Sphinx build warnings.

  5. Integration guides (SGLang, LMCache) and the deployment guide need restructuring to match the improved information architecture used for vLLM.


Proposal

The documentation site is reorganized along three axes: navigation (how users discover content), content structure (how individual pages are organized internally), and discoverability (how cross-cutting concerns like API reference and performance benchmarks are surfaced).

Navigation: from flat list to semantic sections

Before: A single flat toctree sidebar grouped loosely by category — Getting Started, Performance, Python API Reference, Design Documents, Troubleshooting, Deployment. Users had to scan the entire sidebar to find relevant pages.

After: The sidebar is restructured into six semantic sections ordered by user journey:

| Section | Purpose |
| --- | --- |
| Getting Started | Quick-start and build guide: the minimum to get running |
| Deployment Guide | Deployment guide plus integration landing pages (vLLM, SGLang, LMCache, LMDeploy) |
| Performance | Per-component benchmark overview pages (vLLM, SGLang, Mooncake Store) summarizing key findings |
| Developer Guide | Architecture, design documents, and component internals |
| API Reference | Unified Python / C++ / HTTP API documentation |
| Archived | Legacy vLLM integration pages preserved for existing deployments |
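
To make the proposed sidebar concrete, here is a minimal sketch that renders the six sections as captioned Sphinx toctrees, one per section, in the order listed. The section captions come from the table above; the document paths, the helper names, and the choice of reStructuredText output are assumptions for illustration and are not taken from the actual PRs.

```python
# Section captions follow the proposal; every document path below is a
# hypothetical placeholder, not the real Mooncake docs layout.
SECTIONS = [
    ("Getting Started", ["getting_started/quick-start", "getting_started/build"]),
    ("Deployment Guide", ["deployment/index", "deployment/vllm/index"]),
    ("Performance", ["performance/vllm/index", "performance/sglang/index"]),
    ("Developer Guide", ["design/architecture"]),
    ("API Reference", ["api-reference/index"]),
    ("Archived", ["archived/vllm-legacy"]),
]


def render_toctree(caption, docs, maxdepth=1):
    """Render one captioned toctree directive as reStructuredText."""
    lines = [
        ".. toctree::",
        f"   :caption: {caption}",
        f"   :maxdepth: {maxdepth}",
        "",  # blank line separates directive options from entries
    ]
    lines += [f"   {doc}" for doc in docs]
    return "\n".join(lines)


def render_sidebar(sections):
    """Render all sections in order, one toctree per section."""
    return "\n\n".join(render_toctree(cap, docs) for cap, docs in sections)


if __name__ == "__main__":
    print(render_sidebar(SECTIONS))
```

Using one toctree per section, each with its own caption, is what gives the sidebar its grouped headings; the order of the directives in the root index determines the order in the sidebar.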

Content structure: from version-fragmented to task-oriented

Before: vLLM integration docs were split into three version-specific pages (v0.2, v0.3, v1.0). Users wanting PD disaggregation had to read across all three to find relevant information. SGLang and LMCache integration pages were similarly scattered.

After: Two new task-oriented guides replace the version-specific pages:

  • disagg-prefill-decode.md — covers PD disaggregation with both V1 (recommended) and V0 (legacy) backends
  • kv-cache-storage.md — covers KV cache storage and sharing with MooncakeStore

Each guide presents the scenario first and lets the reader choose the appropriate backend. Legacy pages are preserved with archive banners for existing deployments.
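
As an illustration of the archive banners, the sketch below inserts a notice under the first heading of a legacy Markdown page and points the reader to the replacement guide. The banner wording, the blockquote style, and the function name are assumptions; the real banners may use a different admonition syntax.

```python
ARCHIVE_MARK = "> **Archived:**"


def add_archive_banner(page_text: str, replacement: str) -> str:
    """Insert an archive notice below the first level-1 heading of a page."""
    if ARCHIVE_MARK in page_text:
        return page_text  # banner already present, so the edit is idempotent
    banner = (
        f"{ARCHIVE_MARK} this page is kept for existing deployments. "
        f"For new setups, see [{replacement}]({replacement})."
    )
    lines = page_text.splitlines()
    # Locate the first level-1 heading; -1 means "insert at the top".
    idx = next((i for i, line in enumerate(lines) if line.startswith("# ")), -1)
    lines[idx + 1:idx + 1] = ["", banner]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    legacy = "# vLLM v0.2 integration\n\nOld content.\n"
    print(add_archive_banner(legacy, "disagg-prefill-decode.md"))
```

Running the function twice on the same page leaves it unchanged, which makes it safe to apply across all legacy pages in one pass.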

SGLang follows the same pattern: the index page becomes a HiCache scenario landing page; the PD disaggregation guide is rewritten with clearer prerequisites and updated syntax.

Discoverability: API reference and performance benchmarks

Before: C++ API docs were embedded in design documents, Python API docs had a standalone section, and HTTP API docs were separate. Performance benchmarks were a flat list of documents with no overview.

After:

  • A unified api-reference/ directory with Python, C++, and HTTP sub-indices provides a single entry point for developers.
  • Per-component performance index pages (performance/vllm/, performance/sglang/, performance/mooncake-store/) summarize benchmark findings in overview tables, linking to the full reports.

Quality baseline

  • Typos and inconsistent formatting are fixed across 18+ pages.
  • All documents are registered in toctrees, and the build passes make html -W (Sphinx warnings treated as errors) with zero warnings.
  • The build guide is reorganized: the PyPI installation section is removed because it duplicated the quick-start page, compile options are converted from inline lists to scannable tables, and long shell commands are line-wrapped.
  • The deployment guide is rewritten with an architecture diagram and restructured reference tables.
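
The "registered in toctrees" requirement can be checked before running the full build. The sketch below assumes the toctree entries have already been parsed into a mapping from each document to its children, and reports any document that cannot be reached from the root index. The function name and the sample document names are hypothetical. Reachability from the root is slightly stricter than Sphinx's own check, which only warns when a document is included in no toctree at all.

```python
from collections import deque


def find_orphans(toctrees, all_docs, root="index"):
    """Return documents not reachable from `root` through toctree entries.

    toctrees: dict mapping a document name to the documents listed in its
              toctree directives (assumed to be parsed elsewhere).
    all_docs: iterable of every document name in the source tree.
    """
    seen = {root}
    queue = deque([root])
    while queue:
        doc = queue.popleft()
        for child in toctrees.get(doc, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(set(all_docs) - seen)


if __name__ == "__main__":
    # Hypothetical layout: "design/old-notes" is listed in no toctree.
    toctrees = {
        "index": ["getting_started/index", "api-reference/index"],
        "api-reference/index": ["api-reference/python"],
    }
    docs = [
        "index",
        "getting_started/index",
        "api-reference/index",
        "api-reference/python",
        "design/old-notes",
    ]
    print(find_orphans(toctrees, docs))
```

Pages that are intentionally left out of the navigation can be marked with Sphinx's orphan metadata instead of being added to a toctree.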

Ready for Review

A series of PRs addressing the issues above has been prepared. Since reviewing raw markdown diffs does not give a clear picture of the final navigation and layout, a preview of the fully merged result is deployed at:

https://aionw.github.io/
