Commit 59aa486

[NVIDIA#12560][fix] Fix disaggserving hang on block reuse after eviction (NVIDIA#12667)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
1 parent 7aa7818 commit 59aa486

1 file changed: 9 additions, 0 deletions
cpp/tensorrt_llm/batch_manager/kvCacheManager.cpp

Lines changed: 9 additions & 0 deletions
@@ -1688,6 +1688,15 @@ std::pair<SizeType32, std::vector<KVCacheBlock::IdType>> WindowBlockManager::sto
         }
         if (pinBlocks)
         {
+            // If the block has no refs it sits in the eviction policy's free
+            // queue. Claim it first so that the later unpinBlocksById /
+            // releaseBlock cycle does not create a duplicate queue entry.
+            // Pass the block's existing priority and duration so that
+            // claimBlock does not clear its retention/expiry metadata.
+            if (!searchRoot->hasRefs())
+            {
+                mEvictionPolicy->claimBlock(searchRoot, searchRoot->getPriority(), searchRoot->getDurationMs());
+            }
             searchRoot->incRefCount();
             pinnedBlockIds.push_back(searchRoot->getBlockId());
         }
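
The duplicate-entry scenario described in the added comment can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `ToyBlock`, `ToyEvictionPolicy`, `pinBlock`, `unpinBlock`, and `entriesAfterPinCycle` are hypothetical names invented for illustration, the free queue is modeled as a plain list of block ids, and priority/duration metadata is left out. The real eviction policy in kvCacheManager.cpp may track blocks differently, and the sketch does not model how a duplicate entry leads to the hang named in the commit title.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical toy types for illustration; not the real TensorRT-LLM classes.
struct ToyBlock
{
    int id;
    int refCount;

    bool hasRefs() const
    {
        return refCount > 0;
    }
};

class ToyEvictionPolicy
{
public:
    // A block whose ref count dropped to zero becomes evictable:
    // append it to the free queue.
    void releaseBlock(ToyBlock const& block)
    {
        mFreeQueue.push_back(block.id);
    }

    // Take the block out of the free queue so it can no longer be evicted.
    void claimBlock(ToyBlock const& block)
    {
        mFreeQueue.remove(block.id);
    }

    // Number of free-queue entries that refer to this block.
    std::size_t countEntries(ToyBlock const& block) const
    {
        return static_cast<std::size_t>(std::count(mFreeQueue.begin(), mFreeQueue.end(), block.id));
    }

private:
    std::list<int> mFreeQueue;
};

// Pin a block. claimFirst toggles the guard that the commit adds.
void pinBlock(ToyEvictionPolicy& policy, ToyBlock& block, bool claimFirst)
{
    if (claimFirst && !block.hasRefs())
    {
        policy.claimBlock(block);
    }
    ++block.refCount;
}

// Unpin a block; once the last reference is gone it returns to the free queue.
void unpinBlock(ToyEvictionPolicy& policy, ToyBlock& block)
{
    assert(block.hasRefs());
    if (--block.refCount == 0)
    {
        policy.releaseBlock(block);
    }
}

// Run one pin/unpin cycle on a block that starts in the free queue and
// report how many queue entries refer to it afterwards.
std::size_t entriesAfterPinCycle(bool claimFirst)
{
    ToyEvictionPolicy policy;
    ToyBlock block{42, 0};
    policy.releaseBlock(block); // the block starts unreferenced, i.e. in the free queue
    pinBlock(policy, block, claimFirst);
    unpinBlock(policy, block);
    return policy.countEntries(block);
}
```

In this model, `entriesAfterPinCycle(false)` leaves two queue entries for the same block, because the pin never removed the original entry before the unpin appended a new one. `entriesAfterPinCycle(true)` leaves exactly one, which mirrors the intent of the `hasRefs()` guard in the diff.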
