[KVPool]Fix PP get bug #5007
Conversation
Signed-off-by: baxingpiaochong <[email protected]>
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request aims to fix a bug in the key-value pool's get operation for pipeline parallelism by introducing a unified check. However, the change in vllm_ascend/distributed/kvpool/pool_worker.py introduces a critical issue. It will cause an IndexError when both tensor and pipeline parallelism are enabled, because the list of keys being checked is not constructed correctly to cover all parallel ranks. This change assumes the key list is larger than it is, leading to an out-of-bounds access. The root cause lies in the key generation logic which needs to be fixed for this change to be effective.
        for i in range(
            min(self.tp_size, self.num_kv_head) * self.pp_size)
    ]
This change is likely to cause an IndexError when both tensor parallelism and pipeline parallelism are enabled (i.e., self.tp_size > 1 and self.pp_size > 1).
The res list's size is determined by multi_tp_keys. The construction of multi_tp_keys on lines 554-565 does not generate all combinations of TP and PP ranks. It only generates keys for (tp_rank=any, pp_rank=0) and (tp_rank=0, pp_rank>0), missing cross-combinations.
As a result, len(res) will be num_block * (tp_factor + self.pp_size - 1), where tp_factor = min(self.tp_size, self.num_kv_head). This loop, however, attempts to access up to num_block * tp_factor * self.pp_size elements. An IndexError will occur if tp_factor * self.pp_size > tp_factor + self.pp_size - 1, which is true when tp_factor > 1 and self.pp_size > 1.
For this unified check to work correctly, the multi_tp_keys list must be populated with keys for all combinations of TP and PP ranks. The logic for generating multi_tp_keys needs to be corrected first.
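To illustrate what "all combinations" means here, a minimal sketch follows that builds keys over the full TP x PP cross product so the flat list has num_block * tp_factor * pp_size entries, matching the loop bound in the diff. The key layout and the make_key/build_multi_tp_keys names are assumptions for illustration, not the actual pool_worker.py code.

```python
# Hypothetical sketch: populate the key list with every (tp_rank, pp_rank)
# pair, not just (tp_rank, pp_rank=0) and (tp_rank=0, pp_rank). Key format
# and function names are illustrative, not the real KV-pool API.
from itertools import product
from typing import List


def make_key(block_hash: str, tp_rank: int, pp_rank: int) -> str:
    # Illustrative key layout; the real layout lives in pool_worker.py.
    return f"{block_hash}_tp{tp_rank}_pp{pp_rank}"


def build_multi_tp_keys(block_hashes: List[str], tp_size: int,
                        num_kv_head: int, pp_size: int) -> List[str]:
    tp_factor = min(tp_size, num_kv_head)
    keys = []
    for block_hash in block_hashes:
        # Full cross product: len(keys) == num_block * tp_factor * pp_size,
        # so the get loop's indexing stays in bounds.
        for tp_rank, pp_rank in product(range(tp_factor), range(pp_size)):
            keys.append(make_key(block_hash, tp_rank, pp_rank))
    return keys


if __name__ == "__main__":
    keys = build_multi_tp_keys(["h0", "h1"], tp_size=2, num_kv_head=8,
                               pp_size=2)
    assert len(keys) == 2 * 2 * 2  # num_block * tp_factor * pp_size
```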
For the PP (Pipeline Parallelism) scenario, consider the query pattern of each PP stage.
What this PR does / why we need it?
When KV caches are evicted from the key-value pool, the KV cache for pp0 may still be present while the KV cache for pp1 has already been evicted. Therefore, a unified check across all PP stages is needed during the get operation.
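As a rough illustration of that unified check, the sketch below treats a block as a hit only when the keys for every TP/PP rank are still present in the pool. The key format and the exists()/lookup_hit_blocks names are hypothetical, not the actual KV-pool API.

```python
# Minimal sketch, assuming a per-rank key layout: a block counts as a hit
# only if the cache entries for all TP/PP ranks survived eviction.
from typing import Callable, List


def lookup_hit_blocks(block_hashes: List[str], tp_factor: int, pp_size: int,
                      exists: Callable[[str], bool]) -> List[str]:
    hits = []
    for block_hash in block_hashes:
        all_present = all(
            exists(f"{block_hash}_tp{tp}_pp{pp}")
            for tp in range(tp_factor)
            for pp in range(pp_size))
        # Skip blocks whose cache survived on pp0 but was evicted on pp1.
        if all_present:
            hits.append(block_hash)
    return hits


if __name__ == "__main__":
    pool = {"h0_tp0_pp0", "h0_tp0_pp1", "h1_tp0_pp0"}  # h1 evicted on pp1
    print(lookup_hit_blocks(["h0", "h1"], tp_factor=1, pp_size=2,
                            exists=pool.__contains__))  # -> ['h0']
```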
Does this PR introduce any user-facing change?
How was this patch tested?